For folks (like me) who are new to mel spectrograms in audio signal processing, here's a link that helped me get some context.
Searching for associated torchaudio ops, I found a MelScale op in torchaudio.transforms that converts an STFT to a mel-scaled STFT (which is sort of what the MelWeightMatrix is supposed to do after we create it).
Looking at the source for MelScale, it's a matmul (plus some transposes to align everything) between the input STFT and the melscale_fbanks matrix, which is what we expect from onnx.MelWeightMatrix.
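To make the "it's just a matmul" point concrete, here is a rough pure-Python sketch of a triangular mel filterbank and how it gets applied to a spectrogram. This is my own simplified illustration, not the torchaudio source: the helper names (`mel_filterbank`, `apply_mel_scale`) are made up, and I'm assuming the HTK mel formula with no normalization.

```python
import math

def hz_to_mel(f):
    # HTK-style mel formula (assumed here)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def linspace(start, stop, n):
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]

def mel_filterbank(n_freqs, n_mels, sample_rate, f_min, f_max):
    """Hypothetical helper: returns an n_freqs x n_mels weight matrix."""
    freqs = linspace(0.0, sample_rate / 2, n_freqs)
    # n_mels + 2 band edges, evenly spaced on the mel axis
    mel_pts = linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_mels + 2)
    f_pts = [mel_to_hz(m) for m in mel_pts]
    fb = [[0.0] * n_mels for _ in range(n_freqs)]
    for i, f in enumerate(freqs):
        for j in range(n_mels):
            lo, mid, hi = f_pts[j], f_pts[j + 1], f_pts[j + 2]
            rising = (f - lo) / (mid - lo)    # left edge of the triangle
            falling = (hi - f) / (hi - mid)   # right edge of the triangle
            fb[i][j] = max(0.0, min(rising, falling))
    return fb

def apply_mel_scale(spec, fb):
    """spec is frames x n_freqs, fb is n_freqs x n_mels -> frames x n_mels."""
    n_mels = len(fb[0])
    return [
        [sum(s * fb[i][j] for i, s in enumerate(frame)) for j in range(n_mels)]
        for frame in spec
    ]

# Toy usage: 2 frames of a 9-bin magnitude spectrogram, 4 mel bands
fb = mel_filterbank(n_freqs=9, n_mels=4, sample_rate=16000, f_min=0.0, f_max=8000.0)
spec = [[1.0] * 9, [0.5] * 9]
mel_spec = apply_mel_scale(spec, fb)
print(len(mel_spec), len(mel_spec[0]))
```

The filterbank only depends on the configuration (number of bins, sample rate, frequency range), so it can be built once and reused for every input.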
Onnx Docu
So there is almost a 1:1 mapping from onnx.MelWeightMatrix to melscale_fbanks.
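As a sketch of how I read that mapping, the ONNX inputs could be translated into keyword arguments for `torchaudio.functional.melscale_fbanks` roughly like this. The function name `melscale_fbanks_kwargs` is hypothetical, and `norm=None` with `mel_scale="htk"` is my assumption about the closest match; the two may still differ numerically, which is why the mapping is only "almost" 1:1.

```python
def melscale_fbanks_kwargs(num_mel_bins, dft_length, sample_rate,
                           lower_edge_hertz, upper_edge_hertz):
    """Hypothetical helper: onnx.MelWeightMatrix inputs -> melscale_fbanks kwargs."""
    return {
        # ONNX output has floor(dft_length / 2) + 1 rows, one per frequency bin
        "n_freqs": dft_length // 2 + 1,
        "f_min": float(lower_edge_hertz),
        "f_max": float(upper_edge_hertz),
        "n_mels": num_mel_bins,
        "sample_rate": sample_rate,
        # Assumptions: no area normalization, HTK mel formula
        "norm": None,
        "mel_scale": "htk",
    }

kwargs = melscale_fbanks_kwargs(
    num_mel_bins=40, dft_length=512, sample_rate=16000,
    lower_edge_hertz=0, upper_edge_hertz=8000,
)
print(kwargs)
```

With torchaudio installed, the result could then be passed as `torchaudio.functional.melscale_fbanks(**kwargs)`, which returns a matrix of shape (n_freqs, n_mels).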
One thing to note when actually using it in a model is that onnx.MelWeightMatrix requires:
Whereas melscale_fbanks notes that: