maetshju closed this 4 years ago
Someone please merge this!
I think the proper way to feed zeros to mfcc is to enable dither:
mfcc(zeros(1000), 16000, dither=true)
Sorry for not attending to this pull request; I was not aware of it.
Oh, I see! I must have overlooked that option in mfcc. That certainly seems to resolve the issue of taking the log of 0. I wonder what your thoughts are on possibly having dither=true as the default in the mfcc function, or otherwise trying to programmatically check the windows to see whether dithering is needed and applying it. Otherwise, a user would need to check the input themselves before calling mfcc, or react upon seeing NaN.
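A rough sketch of what such a check could look like, in case it helps the discussion. The helper name has_silent_window and the 400-sample window with a 160-sample step (25 ms and 10 ms at 16 kHz) are my own assumptions, not anything taken from the package, and it only catches the fully silent case:

```julia
# Hypothetical helper, not part of MFCC.jl: report whether any analysis
# window of the signal is entirely zero, which is where log(0) would occur.
function has_silent_window(x::AbstractVector{<:Real}; winlen::Int=400, step::Int=160)
    # Signals shorter than one window are treated as a single window.
    length(x) < winlen && return all(iszero, x)
    for start in 1:step:(length(x) - winlen + 1)
        # @view avoids copying each window.
        all(iszero, @view x[start:start + winlen - 1]) && return true
    end
    return false
end

has_silent_window(zeros(1000))   # true: every window is all zeros
has_silent_window(ones(1000))    # false: no window is silent
```

A caller could use a check like this to decide whether to pass dither=true, at the cost of one extra pass over the signal.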
For now, I think better documentation would be the first thing to do. Changing the default might have consequences for existing code.
This package is based on Dan Ellis's rastamat code, with an API that started out from the original MATLAB functions. Those are probably not the best design principles for an API.
I think Dan's work has continued in Python's librosa, which I suppose has become, or will become, the default package for MFCC computation for most people entering the field. Ideally, we would have a very similar API and defaults, and even compatibility at the level of the computed MFCC values.
I think that makes sense. Thank you for sharing your thoughts!
When the waveform is all zeros, NaNs appear in the output.
This is due in part to the log operations performed while calculating the MFCCs: wherever a value is zero, we end up with log(0). Adding an epsilon value to the 0s before taking the log fixed the issue.
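To illustrate the mechanism, here is a minimal standalone sketch in plain Julia. It does not reproduce the code in this pull request: the safe_log name, the choice of eps(Float64), and the use of a floor instead of an addition are assumptions made for the example.

```julia
# Stand-in for the energies of one all-zero frame (shape is assumed).
energies = zeros(5)

log.(energies)           # every entry is -Inf
0.0 * log(0.0)           # NaN: 0 * -Inf, as can happen in a later linear transform
log(0.0) - log(0.0)      # NaN: -Inf minus -Inf, as in mean subtraction

# Hypothetical helper: floor the values at a small epsilon before the log,
# so zeros never reach log() while non-zero values are left unchanged.
safe_log(v; epsilon=eps(Float64)) = log.(max.(v, epsilon))

all(isfinite, safe_log(energies))   # true
```

The log of zero itself is -Inf rather than NaN; the NaNs only show up once later steps multiply or subtract those infinities, which is why the failure is easy to miss at the log step.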