meyda / meyda

Audio feature extraction for JavaScript.
https://meyda.js.org/
MIT License
1.45k stars, 104 forks

Is mel-spectrum available? #1177

Status: Open. Oortone opened this issue 2 years ago

Oortone commented 2 years ago


I notice in the code that mel filter banks are there, but are they used only for the MFCC extraction, or can I also get their output directly as features (without the cepstral transform)?

I believe it's called MelScaled Spectrum. It represents human hearing better than a direct FFT spectrum and sometimes has advantages over MFCC.

I can't find any selector for it.

hughrawlinson commented 2 years ago

Yes, you can get the raw mel filter banks by calling the createMelFilterBank function from meyda/dist/esm/utilities. I made a quick demo here. Heads up: I suspect that future versions of Meyda might change how we export this, but I haven't decided yet, so keep an eye on the 6.0.0 breaking changes when that comes out.

Let me know if I can help with anything else!

Oortone commented 2 years ago

OK, I see, but I'd need to add that to the Node client version somehow to make use of it offline? That's probably easy for anyone familiar with Node, but I'm not sure how to fit it into the extraction flow. Thanks anyway.

hughrawlinson commented 2 years ago

I can write you a little CLI to output the Mel Filter Bank - but I think we might be understanding the Mel Filter Bank in different ways. From my understanding of the code, the Mel Filter Bank isn't based on the signal, and therefore isn't an extracted feature - it's a list of filter coefficient sets that are used when computing the MFCCs. So in the CLI, they would just stay the same for every chunk of the audio input.

As for the MelScaled Spectrum, I would think that would mean a spectrum where the X-axis is Mels, rather than frequency. That's a different thing to the Mel filter bank. Is that what you're hoping to get?
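To illustrate why the filter bank is not an extracted feature, here is a minimal sketch of how a triangular mel filter bank can be built. It is not Meyda's implementation: the helper names (hzToMel, melToHz, buildMelFilterBank), the mel formula variant, and the choice of bufferSize / 2 bins per filter are all assumptions for illustration. The point is that the inputs are only the number of filters, the sample rate, and the buffer size, so the result is identical for every chunk of audio.

```javascript
// Hypothetical helpers for illustration only; not Meyda's API.
// One common Hz <-> mel conversion (other variants exist).
function hzToMel(hz) {
  return 1127 * Math.log(1 + hz / 700);
}

function melToHz(mel) {
  return 700 * (Math.exp(mel / 1127) - 1);
}

// Builds numFilters triangular filters over bufferSize / 2 FFT bins.
// Note that no audio signal is involved anywhere in this function.
function buildMelFilterBank(numFilters, sampleRate, bufferSize) {
  const numBins = bufferSize / 2; // assumed bin count per filter
  const maxMel = hzToMel(sampleRate / 2);

  // numFilters + 2 edge frequencies, evenly spaced on the mel scale.
  const edgesHz = Array.from({ length: numFilters + 2 }, (_, i) =>
    melToHz((i * maxMel) / (numFilters + 1))
  );

  return Array.from({ length: numFilters }, (_, m) => {
    const lo = edgesHz[m];
    const mid = edgesHz[m + 1];
    const hi = edgesHz[m + 2];
    return Array.from({ length: numBins }, (_, k) => {
      const f = (k * sampleRate) / bufferSize; // frequency of bin k
      if (f <= lo || f >= hi) return 0;
      // Rising edge up to the centre, falling edge after it.
      return f <= mid ? (f - lo) / (mid - lo) : (hi - f) / (hi - mid);
    });
  });
}

// 26 filters is an arbitrary choice for this example.
const bank = buildMelFilterBank(26, 44100, 512);
console.log(bank.length, bank[0].length); // 26 256
```

Calling buildMelFilterBank twice with the same arguments returns the same coefficients, which is why a CLI would print the same filter bank for every chunk.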

Oortone commented 2 years ago

Ah, yes. Mel Scaled Spectrum is what I mean. Isn't that generated by a matrix calculation of the amplitudeSpectrum and the mel filter bank? It seems it might be possible to generate it using TensorFlow; there's tf.signal.linear_to_mel_weight_matrix.

Don't know if it's in the .js version but the matrix can probably be exported. But of course, having it as features directly is the best :-)
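For reference, the matrix calculation I have in mind can be sketched in a few lines. This is only a rough sketch with toy numbers: melSpectrum is a hypothetical helper, not a Meyda function, and it assumes each filter has exactly as many weights as the spectrum has bins. In practice the spectrum would be Meyda's amplitudeSpectrum for one buffer and the filter bank would come from something like createMelFilterBank; whether their lengths line up that way is an assumption I haven't checked. Some implementations apply the filters to the power spectrum instead of the amplitude spectrum.

```javascript
// Hypothetical helper: multiply the filter bank matrix by the spectrum
// vector, giving one output value (band energy) per mel filter.
function melSpectrum(amplitudeSpectrum, filterBank) {
  return filterBank.map((filter) => {
    if (filter.length !== amplitudeSpectrum.length) {
      throw new RangeError("filter and spectrum lengths differ");
    }
    // Weighted sum of the spectrum bins covered by this filter.
    return filter.reduce((sum, w, k) => sum + w * amplitudeSpectrum[k], 0);
  });
}

// Toy data: 2 filters over a 4-bin spectrum (not real audio).
const toyFilterBank = [
  [0, 1, 0.5, 0],
  [0, 0, 0.5, 1],
];
const toySpectrum = [4, 2, 2, 6];

console.log(melSpectrum(toySpectrum, toyFilterBank)); // logs [ 3, 7 ]
```

Unlike the filter bank itself, this output changes with every buffer, so it would behave like any other per-frame feature.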

hughrawlinson commented 2 years ago

Yes, I think having it as a feature would be good! I'll convert this issue to represent a feature request.