facebookresearch / AudioMAE

This repo hosts the code and models of "Masked Autoencoders that Listen".

kaldi fbank #25

Open lix4 opened 1 year ago

lix4 commented 1 year ago

Hi there,

I am wondering what fbank actually gives us in the dataloader. I checked the torchaudio docs and did not find much information about what it is. Does anyone have a link to an explanation?

Thank you,

XTxiatong commented 6 months ago

I have the same question. The paper says 'we transform audio recordings into Mel spectrograms and divide them into non-overlapped regular grid patches', but the codebase seems to use fbank rather than a Mel spectrogram. Is there a reason for this?
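
For anyone landing here with the same question: as far as I understand, `torchaudio.compliance.kaldi.fbank` is a Kaldi-compatible implementation of log Mel filterbank energies, so its output is essentially a log Mel spectrogram with shape (num_frames, num_mel_bins). That would make the paper and the code consistent, though only the authors can confirm their intent. Below is a minimal NumPy sketch of what such a feature computes. It is not the torchaudio or Kaldi implementation: the function names, the 20 Hz lower edge, and the defaults (16 kHz, 128 bins, 25 ms window, 10 ms shift) are my own assumptions for illustration, and it skips Kaldi details such as pre-emphasis, dither, and DC removal, so the values will not match numerically.

```python
import numpy as np


def hz_to_mel(hz):
    # Mel scale formula used by Kaldi-style filterbanks.
    return 1127.0 * np.log(1.0 + hz / 700.0)


def mel_filterbank(num_mel_bins, n_fft, sample_rate, low_hz=20.0, high_hz=None):
    # Triangular filters spaced evenly on the Mel axis (hypothetical helper).
    high_hz = high_hz or sample_rate / 2
    fft_mels = hz_to_mel(np.linspace(0.0, sample_rate / 2, n_fft // 2 + 1))
    edges = np.linspace(hz_to_mel(low_hz), hz_to_mel(high_hz), num_mel_bins + 2)
    fb = np.zeros((num_mel_bins, n_fft // 2 + 1))
    for m in range(num_mel_bins):
        left, center, right = edges[m:m + 3]
        rising = (fft_mels - left) / (center - left)
        falling = (right - fft_mels) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb


def log_mel_fbank(waveform, sample_rate=16000, num_mel_bins=128,
                  frame_length_ms=25.0, frame_shift_ms=10.0):
    # Illustrative sketch only; assumes a 1-D float waveform.
    frame_len = int(sample_rate * frame_length_ms / 1000)
    frame_shift = int(sample_rate * frame_shift_ms / 1000)
    n_fft = 1 << (frame_len - 1).bit_length()  # next power of two

    # Slice the signal into overlapping frames (no edge padding).
    num_frames = 1 + (len(waveform) - frame_len) // frame_shift
    idx = (np.arange(frame_len)[None, :]
           + frame_shift * np.arange(num_frames)[:, None])
    frames = waveform[idx] * np.hanning(frame_len)

    # Power spectrum -> Mel filterbank energies -> log.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energies = power @ mel_filterbank(num_mel_bins, n_fft, sample_rate).T
    return np.log(np.maximum(mel_energies, np.finfo(np.float32).eps))
```

With these defaults, a 1-second clip at 16 kHz yields 98 frames of 128 log Mel energies each, i.e. a time-by-frequency array that can be cut into a regular grid of patches in the way the paper describes.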