janjanusek closed this issue 7 months ago.
Hi, here's what you're looking for: https://github.com/ar1st0crat/NWaves/wiki/MFCC-and-Mel-Spectrogram#nwaves-and-python_speech_features . Also, there's a detailed video (link at the top of the page)
FilterBank = PsfFilterbank(samplingRate, melCount, fftSize)
// ...
// requires: using System; using System.Linq;

/// <summary>
/// Generates filterbank with weights identical to python_speech_features.
/// </summary>
float[][] PsfFilterbank(int samplingRate, int filterbankSize, int fftSize, double lowFreq = 0, double highFreq = 0)
{
    var filterbank = new float[filterbankSize][];

    // fall back to the Nyquist frequency when no valid upper edge is given
    if (highFreq <= lowFreq)
    {
        highFreq = samplingRate / 2.0;
    }

    var low = NWaves.Utils.Scale.HerzToMel(lowFreq);
    var high = NWaves.Utils.Scale.HerzToMel(highFreq);

    var res = (fftSize + 1) / (float)samplingRate;

    // filterbankSize + 2 points equally spaced on the mel scale, mapped to FFT bin indices
    var bins = Enumerable
        .Range(0, filterbankSize + 2)
        .Select(i => (float)Math.Floor(res * NWaves.Utils.Scale.MelToHerz(low + i * (high - low) / (filterbankSize + 1))))
        .ToArray();

    for (var i = 0; i < filterbankSize; i++)
    {
        filterbank[i] = new float[fftSize / 2 + 1];

        // rising slope of the triangle
        for (var j = (int)bins[i]; j < (int)bins[i + 1]; j++)
        {
            filterbank[i][j] = (j - bins[i]) / (bins[i + 1] - bins[i]);
        }
        // falling slope of the triangle
        for (var j = (int)bins[i + 1]; j < (int)bins[i + 2]; j++)
        {
            filterbank[i][j] = (bins[i + 2] - j) / (bins[i + 2] - bins[i + 1]);
        }
    }

    return filterbank;
}
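For cross-checking the weights on the Python side without installing anything, here is a minimal pure-Python sketch of the same triangular filterbank construction. The function names (hz_to_mel, mel_to_hz, psf_style_filterbank) are made up for this illustration, and the mel formulas are the common 2595 * log10(1 + f / 700) pair, so treat it as an assumption-laden re-statement of the C# above rather than the library's own code.

```python
import math


def hz_to_mel(hz):
    # Common mel-scale formula (assumed here; illustrative helper name).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def psf_style_filterbank(sample_rate, n_filters, fft_size,
                         low_freq=0.0, high_freq=0.0):
    """Triangular mel filterbank built the same way as the C# PsfFilterbank."""
    # Fall back to the Nyquist frequency when no valid upper edge is given.
    if high_freq <= low_freq:
        high_freq = sample_rate / 2.0

    low_mel = hz_to_mel(low_freq)
    high_mel = hz_to_mel(high_freq)
    step = (high_mel - low_mel) / (n_filters + 1)

    # n_filters + 2 points equally spaced in mel, mapped to FFT bin indices.
    bins = [
        math.floor((fft_size + 1) * mel_to_hz(low_mel + i * step) / sample_rate)
        for i in range(n_filters + 2)
    ]

    bank = [[0.0] * (fft_size // 2 + 1) for _ in range(n_filters)]
    for i in range(n_filters):
        # Rising slope from bins[i] up to (not including) bins[i + 1].
        for j in range(bins[i], bins[i + 1]):
            bank[i][j] = (j - bins[i]) / (bins[i + 1] - bins[i])
        # Falling slope from bins[i + 1] (the peak) down to bins[i + 2].
        for j in range(bins[i + 1], bins[i + 2]):
            bank[i][j] = (bins[i + 2] - j) / (bins[i + 2] - bins[i + 1])
    return bank


if __name__ == "__main__":
    bank = psf_style_filterbank(16000, 26, 512)
    print(len(bank), len(bank[0]))
```

Dumping a few rows from both implementations and comparing them side by side is a quick way to confirm that the filterbank itself matches before looking at the extracted features.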
UPD: fbank is basically a filterbank extractor.
def fbank(signal,samplerate=16000,winlen=0.025,winstep=0.01,
nfilt=26,nfft=512,lowfreq=0,highfreq=None,preemph=0.97)
is equivalent to:
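var fbankExtractor = new FilterbankExtractor(
    new FilterbankOptions
    {
        SamplingRate = 16000,
        FrameDuration = 0.025,
        FftSize = 512,
        HopDuration = 0.01,
        // note: python_speech_features' default winfunc is rectangular (all ones);
        // WindowType.Rectangular would mirror that default
        Window = WindowType.Hann,
        PreEmphasis = 0.97,
        FilterBank = PsfFilterbank(16000, 26, 512, 0)
    });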
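To make the duration parameters concrete, here is a small sketch of how winlen / winstep (FrameDuration / HopDuration) translate into sample counts and frame counts. The helper names are hypothetical, and the padded frame-count formula is my reading of how python_speech_features frames a signal (it zero-pads the last partial frame), so take it as an assumption rather than a guarantee about either library.

```python
import math


def frame_params(sample_rate=16000, winlen=0.025, winstep=0.01):
    # Convert durations in seconds to lengths in samples.
    frame_len = int(round(winlen * sample_rate))
    hop_len = int(round(winstep * sample_rate))
    return frame_len, hop_len


def padded_frame_count(n_samples, frame_len, hop_len):
    # Assumed psf-style framing: the final partial frame is zero-padded and kept.
    if n_samples <= frame_len:
        return 1
    return 1 + math.ceil((n_samples - frame_len) / hop_len)


def unpadded_frame_count(n_samples, frame_len, hop_len):
    # Framing without padding: only complete frames are counted.
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop_len


if __name__ == "__main__":
    frame_len, hop_len = frame_params()
    print(frame_len, hop_len)
    print(padded_frame_count(16000, frame_len, hop_len))
    print(unpadded_frame_count(16000, frame_len, hop_len))
```

With the defaults above, a frame is 400 samples and the hop is 160 samples; for one second of 16 kHz audio the padded scheme gives 99 frames and the unpadded one 98. A mismatch of this kind in the last frame is one plausible source of small differences when comparing outputs.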
Thank you very much, I really appreciate your effort 👍 I'll test it later today
It works! The features aren't identical to the ones from the Python code; they're slightly different, which is totally normal given some differences between the implementations, but overall it works just like I needed. Thanks!
Hello, how can I reproduce the fbank features from https://github.com/jameslyons/python_speech_features? I checked the wiki but could not find an fbank calculation, although something tells me that this library is more than capable of computing these features.
Thank you