fastaudio / fastai_audio

[DEPRECATED] 🔊️ Audio with fastaiv1
MIT License

Mfcc feature #4

Closed rbracco closed 5 years ago

rbracco commented 5 years ago

Adds MFCC (mel-frequency cepstral coefficient) generation as an alternative to the melspectrogram; turn it on by setting mfcc=True in the config. It already runs on torch tensors thanks to the torchaudio people. We can add further spectral features via librosa, but that means converting to numpy first; caching should help offset that cost.
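For readers new to the feature: an MFCC is usually obtained by applying a discrete cosine transform to each frame of a log-mel spectrogram and keeping the first few coefficients. The sketch below illustrates that step in plain Python. It is not the code in this PR (which relies on torchaudio); the function names, the `n_mfcc=13` default, and the orthonormal DCT-II scaling are assumptions chosen for illustration.

```python
import math


def dct_ii(frame, n_coeffs):
    """Orthonormal DCT-II of one frame, keeping the first n_coeffs terms.

    `frame` is a list of log-mel energies for a single time step.
    (Hypothetical helper, not part of the library.)
    """
    n = len(frame)
    coeffs = []
    for k in range(n_coeffs):
        total = sum(
            x * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, x in enumerate(frame)
        )
        # Orthonormal scaling: the k == 0 term gets a smaller factor.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def mfcc_from_log_mel(log_mel_frames, n_mfcc=13):
    """Map a list of log-mel frames to a list of MFCC frames."""
    return [dct_ii(frame, n_mfcc) for frame in log_mel_frames]


# A flat log-mel frame has all its energy in coefficient 0.
flat = mfcc_from_log_mel([[1.0, 1.0, 1.0, 1.0]], n_mfcc=2)
```

With a flat frame, only the zeroth coefficient is non-zero, which is why MFCCs are often described as a compact summary of the spectral envelope.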

I also added delta/acceleration stacking (turn on by setting delta=True in the config). Instead of copying the spectrogram or MFCC to 3 channels, this uses the 1st and 2nd derivatives of the spectrogram (a fairly common practice in audio ML) as the 2nd and 3rd channels. These are cached as well when the cache option is on, because converting to numpy, generating the deltas, and converting back to torch takes several ms.
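A minimal sketch of the stacking idea, assuming the common regression-style delta with edge replication along the time axis. This is an illustration in plain Python, not the implementation in this PR; `delta`, `stack_with_deltas`, and the window half-width `n=2` are hypothetical names and defaults.

```python
def delta(feature, n=2):
    """Regression-based delta along time.

    `feature` is laid out as [n_bins][n_frames]. Frames past either edge
    are replaced by the nearest valid frame (edge replication).
    """
    denom = 2 * sum(i * i for i in range(1, n + 1))
    out = []
    for row in feature:
        last = len(row) - 1
        d_row = []
        for t in range(len(row)):
            acc = 0.0
            for i in range(1, n + 1):
                ahead = row[min(t + i, last)]
                behind = row[max(t - i, 0)]
                acc += i * (ahead - behind)
            d_row.append(acc / denom)
        out.append(d_row)
    return out


def stack_with_deltas(feature, n=2):
    """Return 3 channels: the feature, its delta, and its delta-delta."""
    d1 = delta(feature, n)
    d2 = delta(d1, n)  # acceleration is the delta of the delta
    return [feature, d1, d2]


# One frequency bin whose energy rises linearly over 7 frames.
stacked = stack_with_deltas([[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]])
```

The result has the same 3-channel shape that copying the spectrogram would give, so it can feed an image-style model unchanged, but channels 2 and 3 now carry information about how the spectrum changes over time.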