ar1st0crat / NWaves

.NET DSP library with a lot of audio processing functions
MIT License

MFCC similar to NWaves in librosa #19

Closed jodusan closed 4 years ago

jodusan commented 4 years ago

I'm trying to replicate in Python's librosa library what NWaves does. I've seen comments about librosa in the code, so I guess comparisons have been made before. Do you have any hints on how to generate MFCCs in librosa that look similar to the ones from NWaves? Anything in particular to look out for?

Thanks

ar1st0crat commented 4 years ago

Hi! I guess you should first check this wiki. You'll find a paragraph related to librosa here.

Still, there are a couple of important nuances in librosa:

1) htk = True or False. This parameter essentially defines the weights of the mel filterbank (HTK-style or Slaney-style).
2) centering (the center parameter). In NWaves, as in many other frameworks, frames are not centered the way they are in librosa (in fact, I don't quite understand its purpose...), so this parameter must be set to False.
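To make the HTK vs. Slaney distinction a bit more concrete, here is a minimal sketch of the two mel-scale formulas as they are commonly defined. The function names are hypothetical (not part of NWaves or librosa), and the constants are the usual textbook ones; this only illustrates the scale itself, not the full filterbank construction or weight normalization.

```python
import math

def hz_to_mel_htk(f_hz: float) -> float:
    """HTK-style mel scale: logarithmic over the whole frequency range."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def hz_to_mel_slaney(f_hz: float) -> float:
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above."""
    f_sp = 200.0 / 3.0               # Hz per mel in the linear region
    min_log_hz = 1000.0              # where the logarithmic region starts
    min_log_mel = min_log_hz / f_sp  # mel value at 1 kHz (about 15)
    logstep = math.log(6.4) / 27.0   # step size in the logarithmic region
    if f_hz < min_log_hz:
        return f_hz / f_sp
    return min_log_mel + math.log(f_hz / min_log_hz) / logstep

# Compare both scales at the band edges used in this thread (100 Hz and 8000 Hz).
for f in (100.0, 1000.0, 8000.0):
    print(f, round(hz_to_mel_htk(f), 2), round(hz_to_mel_slaney(f), 2))
```

Because the two scales place the band edges differently, filterbanks built with the same number of filters and the same frequency range will still not line up unless both sides use the same style.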

Let's just consider an example:

int sr = 22050;            // sampling rate
int fftSize = 1024;
double lowFreq = 100;      // if not specified, will be 0
double highFreq = 8000;    // if not specified, will be samplingRate / 2
int filterbankSize = 40;   // or 24 for htk=True (usually)

// if the 'htk' parameter in librosa is set to False:
var melBank1 = FilterBanks.MelBankSlaney(filterbankSize, fftSize, sr, lowFreq, highFreq);

// if the 'htk' parameter in librosa is set to True:
var melBands = FilterBanks.MelBands(filterbankSize, sr, lowFreq, highFreq);
var melBank2 = FilterBanks.Triangular(fftSize, sr, melBands, null, Scale.HerzToMel);

var opts = new MfccOptions
{
    SamplingRate = sr,
    FrameDuration = (double)fftSize / sr,       // frame length equals the FFT size
    HopDuration = 0.010,
    FeatureCount = 12,                          // note: the librosa call below requests n_mfcc=13; use the same count in both to compare
    Filterbank = melBank1,                      // or melBank2
    NonLinearity = NonLinearityType.ToDecibel,  // mandatory
    Window = WindowTypes.Hamming,               // in librosa the default is 'hann'
    LogFloor = 1e-10f,                          // mandatory
    DctType = "2N",
    LifterSize = 0
};
var e = new MfccExtractor(opts);

In librosa:

# y and sr are passed by keyword (recent librosa versions require this)
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13,
                             dct_type=2, norm='ortho', window='hamming',
                             htk=False, n_mels=40, fmin=100, fmax=8000,
                             n_fft=1024, hop_length=int(0.010*sr), center=False)
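To see why centering matters when comparing outputs, here is a small sketch of how the number of frames changes with it. It assumes that only full frames are produced and that centered framing pads n_fft // 2 samples on each side of the signal; the frame_count helper and the 1-second signal length are hypothetical, introduced only for illustration.

```python
def frame_count(n_samples: int, n_fft: int, hop_length: int, center: bool) -> int:
    """Number of full analysis frames for a signal of n_samples samples."""
    if center:
        # Centered framing first pads n_fft // 2 samples on each side.
        n_samples += 2 * (n_fft // 2)
    if n_samples < n_fft:
        return 0
    return 1 + (n_samples - n_fft) // hop_length

sr = 22050
n_fft = 1024
hop_length = int(0.010 * sr)   # 220.5 truncated to 220 samples
n_samples = sr                 # hypothetical 1-second signal

print(frame_count(n_samples, n_fft, hop_length, center=False))
print(frame_count(n_samples, n_fft, hop_length, center=True))
```

With centering enabled, the frame count differs and each frame is shifted by half a window relative to the uncentered case, so the two MFCC matrices cannot be compared column by column.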

Actually, there are even more options. Feel free to ask if you have any questions.

jodusan commented 4 years ago

@ar1st0crat Thank you for the detailed response!