KinWaiCheuk / nnAudio

Audio processing by using pytorch 1D convolution network
MIT License

CQT doesn't work on waveforms on short chunks like 0.5s #109

Open kkp15 opened 2 years ago

kkp15 commented 2 years ago

CQT doesn't work on short waveform chunks (around 0.5 s). Is this expected?

KinWaiCheuk commented 2 years ago

May I know if you are using CQT1992? This is expected behavior, especially when you include low-frequency bins. To maintain a constant Q (the same number of cycles in every kernel), the CQT kernels for low frequencies have to be much longer than the kernels for high frequencies. When the kernel for the lowest frequency bin is longer than your waveform, you will get this error.
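To make the kernel-length argument concrete, here is a rough sketch based on the textbook constant-Q relation (kernel length is about Q * sr / f). The function name `cqt_kernel_length` and the default values (sr=22050, 12 bins per octave, filter_scale=1) are assumptions for illustration, not taken from the nnAudio source, and the library's internal kernel width may differ (for example through extra padding).

```python
import math


def cqt_kernel_length(sr, fmin, bins_per_octave=12, filter_scale=1.0):
    """Approximate length, in samples, of the kernel for the lowest bin.

    Hypothetical helper using the textbook constant-Q relation;
    nnAudio's actual kernel width may be larger.
    """
    # Q is the ratio of centre frequency to bandwidth, shared by all bins.
    q = filter_scale / (2 ** (1.0 / bins_per_octave) - 1)
    # A kernel spans roughly Q cycles of its centre frequency.
    return math.ceil(q * sr / fmin)


sr = 22050  # assumed sample rate
for fmin in (32.70, 220.0):
    n = cqt_kernel_length(sr, fmin)
    print(f"fmin={fmin:.2f} Hz -> about {n} samples ({n / sr:.2f} s)")
```

Under these assumptions, the lowest kernel for fmin around 32.7 Hz is already longer than a 0.5 s chunk at 22050 Hz, while the kernel for fmin=220 is far shorter.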

There are two solutions. (1) Exclude low-frequency bins by setting fmin>=220. This might not be a good solution if you need the low-frequency information. (2) Use CQT2010 instead of CQT1992. CQT2010 uses downsampling to prevent the above problem from happening, but it might introduce some artifacts into your signal.
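For solution (1), the same textbook relation can be inverted to estimate the lowest fmin whose kernel still fits inside a chunk of a given length. This is only a sketch: `min_fmin_for_length` is a hypothetical helper, the defaults are assumptions, and nnAudio's real kernel width may be larger than this estimate, so treat the result as a lower bound and leave a margin.

```python
def min_fmin_for_length(n_samples, sr, bins_per_octave=12, filter_scale=1.0):
    """Estimate the lowest fmin (Hz) whose kernel fits in n_samples.

    Hypothetical helper; assumes kernel length is about Q * sr / fmin.
    """
    # Same constant Q for every bin.
    q = filter_scale / (2 ** (1.0 / bins_per_octave) - 1)
    # Solve Q * sr / fmin <= n_samples for fmin.
    return q * sr / n_samples


sr = 22050                 # assumed sample rate
n_samples = int(0.5 * sr)  # a 0.5 s chunk
fmin = min_fmin_for_length(n_samples, sr)
print(f"Estimated lower bound on fmin for a 0.5 s chunk: {fmin:.1f} Hz")
```

Rearranged the other way, n_samples >= Q * sr / fmin gives a rough minimum waveform length for a chosen fmin.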

kkp15 commented 2 years ago

Thank you for your answer. I'm using CQT1992v2 right now. Is there any way I can calculate the minimum length based on the CQT hyperparameters? Thank you.