qiuqiangkong / panns_inference

MIT License

What is the minimum input audio length/size of AudioTagging class? #13

Open underdogliu opened 1 year ago

underdogliu commented 1 year ago

Hi, first of all, thanks for the amazing work!

I am trying out PANNs by following example.py. However, when I tried some short audio clips, I found that I had to pad them to make inference work; otherwise the error below occurs. I arbitrarily padded my audio to a length of 10000 and it worked, but I know that value is just a placeholder.

RuntimeError: Given input size: (1024x1x4). Calculated output size: (1024x0x2). Output size is too small

Since the audio is resampled to 32 kHz, I wonder:

  1. What is the minimum length of the audio to be input into the model?
  2. If the input is shorter than that, is it valid to call, for example, numpy.pad(audio, (0, shortage), 'wrap') to pad the audio?
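
To make question 2 concrete, here is a minimal sketch of the padding I have in mind. The helper name pad_to_min_length and its min_length parameter are my own placeholders, not part of panns_inference, and the value to pass as min_length is exactly what question 1 asks about. It assumes the waveform is a 1-D NumPy array, before any batch dimension is added.

```python
import numpy as np


def pad_to_min_length(audio: np.ndarray, min_length: int) -> np.ndarray:
    """Pad a 1-D waveform at the end, in 'wrap' mode, up to min_length samples.

    Hypothetical helper: min_length is a placeholder for the model's real
    minimum input length, which is not known here.
    """
    shortage = min_length - audio.shape[0]
    if shortage <= 0:
        # Already long enough: return the waveform unchanged.
        return audio
    # 'wrap' continues the signal by repeating it from its beginning.
    return np.pad(audio, (0, shortage), mode="wrap")


if __name__ == "__main__":
    short_clip = np.arange(4, dtype=np.float32)
    padded = pad_to_min_length(short_clip, 6)
    print(padded.shape)
```

Whether wrapping is acceptable for the model (compared with zero padding, for instance) is the part I am unsure about.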