Hi, first of all, thanks for the amazing work!
I am trying out PANNs by following `example.py`. However, when I tried some short audio clips, I found that I have to pad the input to make it work; otherwise the error below occurs. I arbitrarily padded the audio to a length of 10000 and it worked, but I know that is just a placeholder.
RuntimeError: Given input size: (1024x1x4). Calculated output size: (1024x0x2). Output size is too small
Since the audio is resampled to 32 kHz, I wonder:
What is the minimum length of audio that can be fed into the model?
If the input is shorter than that, is it valid to pad it, for example by calling `numpy.pad(audio, (0, shortage), 'wrap')`?
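To make the second question concrete, here is a minimal sketch of the padding in question. The helper name `pad_to_min_length` and the threshold `min_samples` are hypothetical; the real minimum length is exactly what is being asked, so the value used below (one second at 32 kHz) is only a placeholder and not a number taken from PANNs.

```python
import numpy as np


def pad_to_min_length(audio, min_samples):
    """Pad the end of a 1-D waveform by wrapping it around to min_samples.

    min_samples is a hypothetical threshold, not a value taken from PANNs.
    """
    audio = np.asarray(audio)
    if audio.ndim != 1 or audio.size == 0:
        raise ValueError("expected a non-empty 1-D waveform")

    shortage = min_samples - audio.shape[0]
    if shortage <= 0:
        # Already long enough: return the waveform unchanged.
        return audio

    # 'wrap' continues the signal from its own beginning.
    return np.pad(audio, (0, shortage), mode="wrap")


sample_rate = 32000
min_samples = sample_rate  # placeholder only: 1 second at 32 kHz

short_clip = np.random.randn(20000).astype(np.float32)  # stand-in for a short recording
padded = pad_to_min_length(short_clip, min_samples)
print(padded.shape)
```

Wrapping keeps the padded region statistically similar to the clip itself, whereas zero padding (`mode="constant"`) adds silence. Which of the two the model handles better is part of the question.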