JuliaDSP / MFCC.jl

Mel Frequency Cepstral Coefficients calculation for Julia
BSD 2-Clause "Simplified" License
33 stars 18 forks source link

Add epsilon value before log in spec2cep #14

Closed maetshju closed 4 years ago

maetshju commented 6 years ago

When the waveform is all zeros, NaNs appear.

using MFCC
println(mfcc(zeros(1000), 16000)[1])
# [-Inf NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN; ... ]

This is due in part to the log functions that are performed in the process of calculating the MFCCs, where if there are zeros, we end up with log(0). Adding an epsilon value to the 0s before taking the log fixed the issue.

using MFCC
println(mfcc(zeros(1000), 16000)[1])
# [-4708.25 -7.29149e-14 4.66009e-13 -1.89956e-12 2.17192e-12 -1.39894e-12 -2.38229e-12 -2.9143e-12 6.25616e-13 -6.56793e-13 1.79075e-11 5.79803e-12 1.21636e-11; ... ]
madprogramer commented 4 years ago

Someone please merge this!

davidavdav commented 4 years ago

I think the proper use of feeding zeros to mfcc is using dither:

mfcc(zeros(1000), 16000, dither=true)

sorry for not attending this pull request, I was not aware of it.

maetshju commented 4 years ago

Oh, I see! I must have overlooked that option in mfcc. That certainly seems to resolve the issue of avoiding taking the log of 0. I wonder what your thoughts are on possibly having dither=true as the default in the mfcc function, or otherwise trying to programatically check the windows to see if the dither option is needed and applying it. Otherwise, a user would need to check the input themselves before calling mfcc or react upon seeing NaN.

davidavdav commented 4 years ago

For now, I think better documentation would be the first thing to do. Changing the default might have consequences for existing code.

This package is based on Dan Ellis's rastamat code, with an API starting out from the original matlab functions. Probably not the best design principles for an API.

I think Dan's work has continued in Python's librosa, which I suppose will/has become the default package for MFCC computation for most people entering the field. Ideally, we would have a very similar API, defaults, and even compatibility at the levels of computed mfcc-values

maetshju commented 4 years ago

I think that makes sense. Thank you for sharing your thoughts!