mathigatti closed this issue 4 years ago
Hi @mathigatti This error occurs because the audio clips in a single batch have different lengths. As the error says, the data loader got "20546 and 9168 in dimension 2". You need to crop them to the same length.
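For reference, one common way to get equal lengths is a custom collate_fn that random-crops (or zero-pads) each clip before stacking. This is only a sketch, not the repository's actual loader; the function name, crop length, and (waveform, tag) batch format are all assumptions:

```python
import torch

def crop_collate(batch, crop_len=9168):
    # Hypothetical collate_fn: crop (or zero-pad) every waveform in the
    # batch to a fixed length so torch.stack no longer fails on
    # mismatched sizes. crop_len is an assumed value; use whatever
    # window length your model expects.
    waves, tags = [], []
    for wave, tag in batch:
        if wave.shape[-1] >= crop_len:
            # Pick a random crop start so training sees varied windows.
            start = torch.randint(0, wave.shape[-1] - crop_len + 1, (1,)).item()
            wave = wave[..., start:start + crop_len]
        else:
            # Zero-pad clips that are shorter than the crop window.
            wave = torch.nn.functional.pad(wave, (0, crop_len - wave.shape[-1]))
        waves.append(wave)
        tags.append(tag)
    return torch.stack(waves), torch.stack(tags)
```

You would then pass it to the loader, e.g. DataLoader(dataset, batch_size=16, collate_fn=crop_collate).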
Before that, let me check one thing. It looks like you are trying to use Mel spectrogram inputs. The models implemented in this repository take raw audio inputs and extract Mel spectrograms on-the-fly, so please use raw audio inputs.
If you want to use Mel spectrogram inputs with your own data loader, you need to modify model.py: simply remove self.spec and self.to_db from it.
It worked perfectly after downloading the mp3 files, thank you very much!!
Hi, thanks for this awesome project! I'm trying to train it with jamendo-moodtheme tags but I'm getting an error.
I'm trying it on a Google Colab VM with a CUDA-enabled GPU.
I downloaded the mel spectrograms from the jamendo repository, specifying the melspecs data type and the autotagging_moodtheme dataset. Then, in this project, I just replaced the TAGS variables in the code with this and the tsv files with the moodtheme ones from here.
Everything looked fine, but for some reason I'm getting the attached error when running the training code.
The mel spectrograms have 92 bands and different lengths; could that be what's causing the problem?
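In case it helps anyone debugging the same thing: the default PyTorch collate can only stack same-shaped arrays, so variable-width spectrograms have to be cropped or padded first. A minimal sketch (the helper name and frame count are made up for illustration, assuming mels shaped (bands, time)):

```python
import numpy as np

def crop_mel(mel, n_frames=256):
    # Hypothetical helper: random-crop a (n_bands, time) mel spectrogram
    # to a fixed number of frames, or zero-pad if it is too short, so
    # every item in a batch ends up with the same width.
    bands, time = mel.shape
    if time >= n_frames:
        start = np.random.randint(0, time - n_frames + 1)
        return mel[:, start:start + n_frames]
    out = np.zeros((bands, n_frames), dtype=mel.dtype)
    out[:, :time] = mel
    return out
```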
Let me know if anyone knows what might be the problem :)
Thanks in advance!
My error message