vocalpy / hybrid-vocal-classifier

a Python machine learning library for animal vocalizations and bioacoustics
http://hybrid-vocal-classifier.readthedocs.io
BSD 3-Clause "New" or "Revised" License

dpss window in Spectrogram class uses wrong value for NW #91

Open NickleDave opened 3 years ago

NickleDave commented 3 years ago

the Spectrogram class in audiofileIO allows the user to specify a window, which can be one of `{None, 'Hamm', 'dpss'}`.

The dpss option is supposed to reproduce the behavior from Koumura Okanoya 2016:

> In computing spectra, the 0th order DPSS with a parameter W = 4 / 512 was used as a taper [45].

I believe this is implemented in the code somewhere here, note the taper argument: https://github.com/cycentum/birdsong-recognition/blob/5b28e57ad6b20b5669e792f9b0abf43a36a2e5ea/birdsong-recognition/src/utils/SoundUtils.java#L95

in fixing issues caused by updating to newer versions, I noticed that the slepian function I used is now deprecated. It also seems there were some bugs in the implementation, see https://github.com/scipy/scipy/issues/4354

which makes me think I was never really generating this taper the right way, anyway

to truly fix this, I need to figure out what the right value is for NW, the standardized half-bandwidth. It's not clear to me whether the W parameter that Koumura Okanoya 2016 use is the full bandwidth or the half-bandwidth; I think I would need to actually look at the textbook they cite (now also cited by the scipy.signal.windows.dpss docs). The Matlab docs give slightly more detailed equations for how the two are related: https://www.mathworks.com/help/signal/ref/dpss.html
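To make the ambiguity concrete, here is a rough sketch of the two candidate values for NW. It assumes a window length of N = 512 samples (only suggested by the denominator in W = 4 / 512, not confirmed) and that W is in cycles per sample. Which reading matches the paper is exactly the open question, so neither value should be taken as the answer.

```python
import numpy as np
from scipy.signal.windows import dpss

N = 512        # ASSUMPTION: window length in samples, guessed from "W = 4 / 512"
W = 4 / 512    # value quoted from the paper, assumed to be in cycles per sample

# Reading 1: W is already the half-bandwidth, so NW = N * W
nw_if_half = N * W

# Reading 2: W is the full bandwidth, so the half-bandwidth is W / 2
nw_if_full = N * W / 2

# 0th-order taper under each reading (Kmax=None returns a single 1-D window)
taper_if_half = dpss(N, nw_if_half)
taper_if_full = dpss(N, nw_if_full)

print(nw_if_half, nw_if_full)
print(taper_if_half.shape, taper_if_full.shape)
```

The two tapers differ noticeably, so the choice does matter for anyone trying to reproduce the original spectrograms.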

for now, just to prevent crash on import, I am going to change to the replacement dpss function
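A minimal sketch of what the swap to `scipy.signal.windows.dpss` could look like. The `make_window` helper and its arguments are hypothetical, not the actual code in audiofileIO, and mapping `'Hamm'` to `numpy.hamming` is an assumption. NW is left as a required argument for the dpss case, since the right value is still undecided.

```python
import numpy as np
from scipy.signal.windows import dpss


def make_window(name, nperseg, NW=None):
    """Hypothetical helper: return a window array for one of {None, 'Hamm', 'dpss'}."""
    if name is None:
        # no window requested; caller falls back to its own default
        return None
    if name == 'Hamm':
        # ASSUMPTION: 'Hamm' means a Hamming window
        return np.hamming(nperseg)
    if name == 'dpss':
        if NW is None:
            raise ValueError("NW (standardized half-bandwidth) is required for 'dpss'")
        # 0th-order DPSS taper from the non-deprecated function
        return dpss(nperseg, NW)
    raise ValueError(f"unknown window: {name!r}")
```

Requiring NW explicitly avoids silently baking in a value that may turn out to be wrong.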

It's probably the case that neither I nor anyone else is using this window anyway

but strictly speaking it's using the wrong value and should be fixed