pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

Custom spectrogram filterbanks #943

Open dvisockas opened 4 years ago

dvisockas commented 4 years ago

🚀 Feature

It would be great to be able to create filterbanks that are neither mel-scale nor linear.

Motivation

Even though mel-scale cepstral coefficients are (probably) the most popular, there exist many other ways to filterbank the frequencies of spectra: CQCC, Opus filterbanks, BFCC, etc. I was thinking that it would be nice if the Spectrogram transform accepted a filterbank argument and filtered the spectrogram accordingly.

Pitch

Extend the Spectrogram API to accept a filter_bank argument (falling back to the current implementation when it is omitted). Maybe it could be a tensor of shape (num_banks, filter), where filter would consist of (min_freq, max_freq), with triangular filters to start with?
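
To make the pitch concrete, here is a minimal sketch of how a tensor of (min_freq, max_freq) pairs could be turned into triangular filters. The function name `triangular_filterbank`, its parameters, and the choice of placing each triangle's peak at the midpoint of its band are assumptions made for illustration; none of this is existing torchaudio API.

```python
import torch


def triangular_filterbank(bands, n_freqs, sample_rate):
    """Hypothetical helper: build triangular filters from band edges.

    bands: (num_banks, 2) pairs of (min_freq, max_freq) in Hz, min_freq < max_freq.
    Returns a tensor of shape (n_freqs, num_banks).
    """
    bands = torch.as_tensor(bands, dtype=torch.float32)
    # Centre frequency of every spectrogram bin, from 0 Hz to Nyquist.
    freqs = torch.linspace(0, sample_rate / 2, n_freqs)
    lo, hi = bands[:, 0], bands[:, 1]
    # Assumption: each triangle peaks at the midpoint of its band.
    center = (lo + hi) / 2
    # Rising and falling edges, broadcast to (n_freqs, num_banks).
    up = (freqs[:, None] - lo[None, :]) / (center - lo)[None, :]
    down = (hi[None, :] - freqs[:, None]) / (hi - center)[None, :]
    # Take the lower edge at each bin and zero everything outside the band.
    return torch.clamp(torch.minimum(up, down), min=0.0)


fb = triangular_filterbank([[0.0, 1000.0], [500.0, 2000.0]], n_freqs=201, sample_rate=8000)
```

With these example values the frequency bins are spaced 20 Hz apart, so the first filter rises from 0 Hz, peaks at 500 Hz, and falls back to zero at 1000 Hz.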

Additional context

Somewhat relates to #942

vincentqb commented 4 years ago

Thanks for the suggestion :) This relates to comment. Would #593 meet your needs? Thoughts?

dvisockas commented 4 years ago

#593 concerns mel filterbanks, since create_fb_matrix computes mel filterbanks. I was thinking that it would be nice to have an API that lets users create non-mel spectrograms. Applying a filterbank is a simple operation, spectrogram * filter_bank, but implementing non-mel filterbanks is rather tedious because one has to reimplement F.create_fb_matrix oneself.
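
As a rough illustration of the "spectrogram * filter_bank" step, the sketch below applies an arbitrary filterbank with a matrix product over the frequency axis. It assumes a spectrogram shaped (..., n_freqs, time) and a filterbank shaped (n_freqs, num_banks); the helper name `apply_filterbank` is hypothetical, and random tensors stand in for a real spectrogram and filterbank.

```python
import torch


def apply_filterbank(specgram, fb):
    """Hypothetical helper: project the frequency axis onto filterbank channels.

    specgram: (..., n_freqs, time)
    fb:       (n_freqs, num_banks)
    Returns:  (..., num_banks, time)
    """
    # Move frequency to the last axis, multiply by the filterbank, move it back.
    return torch.matmul(specgram.transpose(-1, -2), fb).transpose(-1, -2)


specgram = torch.rand(2, 201, 10)  # stand-in for a batch of power spectrograms
fb = torch.rand(201, 5)            # stand-in for any custom (non-mel) filterbank
banded = apply_filterbank(specgram, fb)
```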

vincentqb commented 4 years ago

Yes, that would be useful, thanks for suggesting :) How about something similar to AmplitudeToDB then?
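
One way to read this suggestion is a standalone transform, applied after Spectrogram in the same way AmplitudeToDB is, that takes a user-supplied filterbank. The sketch below assumes that reading; the class name `FilterBankScale` and its constructor argument are hypothetical and not part of torchaudio.

```python
import torch
from torch import nn


class FilterBankScale(nn.Module):
    """Hypothetical transform that applies a user-supplied filterbank."""

    def __init__(self, fb):
        super().__init__()
        # fb: (n_freqs, num_banks). A buffer follows the module across devices
        # and is saved in its state_dict, but is not a trainable parameter.
        self.register_buffer("fb", fb)

    def forward(self, specgram):
        # specgram: (..., n_freqs, time) -> (..., num_banks, time)
        return torch.matmul(specgram.transpose(-1, -2), self.fb).transpose(-1, -2)


transform = FilterBankScale(torch.rand(201, 5))
out = transform(torch.rand(201, 7))
```

Keeping the filterbank outside Spectrogram would leave the existing API unchanged and let any tensor of the right shape be plugged in.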