ChenyangLEI closed this issue 2 years ago
Hi @ChenyangLEI / @daisukelab,
I have a similar question. The norm in the acoustic world is to use [T, F]; however, BYOL-A uses [F, T]. Is there any specific reason?
Hi @ChenyangLEI, thank you for sharing the issue. It's my fault. As you might know, the comment in train.py is wrong; config.yaml is correct.
Hi @Sreyan88, I wasn't aware that the [F, T] order is against the convention. It basically follows the output feature shape of byol_a.dataset.MelSpectrogramLibrosa. Thanks for sharing this issue with me. I'd like to switch to [T, F] in the future. :)
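For anyone who needs [T, F] downstream in the meantime, swapping the last two axes should be enough. This is only a minimal sketch, not code from the repo: the helper name `ft_to_tf` and the sizes (64 mel bins, 96 frames, batch of 8) are made up for illustration, and it uses plain NumPy arrays as stand-ins for the real features.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
N_MELS = 64    # F: number of frequency (mel) bins
N_FRAMES = 96  # T: number of time frames


def ft_to_tf(spec):
    """Swap the last two axes: [..., F, T] -> [..., T, F]."""
    return np.swapaxes(spec, -1, -2)


# A single spectrogram in [F, T] layout.
lms_ft = np.zeros((N_MELS, N_FRAMES), dtype=np.float32)
lms_tf = ft_to_tf(lms_ft)  # now [T, F]

# A batch in [B, 1, F, T] layout; leading axes are left untouched.
batch_ft = np.zeros((8, 1, N_MELS, N_FRAMES), dtype=np.float32)
batch_tf = ft_to_tf(batch_ft)  # now [B, 1, T, F]

print(lms_ft.shape, "->", lms_tf.shape)
print(batch_ft.shape, "->", batch_tf.shape)
```

Applying the helper twice restores the original layout, so the same function converts in either direction.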
https://github.com/nttcslab/byol-a/blob/master/train.py
At line 67, there is a comment describing the shape of the input.
However, it differs from the description in the config.yaml file.