Closed (yoshua closed this issue 11 years ago)
George Dahl's preprocessing for speech data is log mel-scale filterbanks, i.e. the MFCC pipeline stopped before the cepstral transform (the final DCT), computed on 25 ms windows shifted by either 5 ms or 10 ms. It would be good to have this preprocessing handy as well.
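For reference, here is a rough NumPy-only sketch of what that preprocessing could look like. It is not George Dahl's actual code or the code on the wiki page. The 16 kHz sample rate, 40 filters, Hamming window, natural log, and the function names are all assumptions made for illustration; only the 25 ms window and the 5 ms or 10 ms shift come from the description above.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = hz_points[i:i + 3]
        rising = (fft_freqs - left) / (centre - left)
        falling = (right - fft_freqs) / (right - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_filterbanks(signal, sample_rate=16000, win_ms=25.0,
                        shift_ms=10.0, n_filters=40, eps=1e-10):
    # Hypothetical helper: returns an array of shape (n_frames, n_filters).
    signal = np.asarray(signal, dtype=float)
    win_len = int(round(sample_rate * win_ms / 1000.0))
    shift = int(round(sample_rate * shift_ms / 1000.0))
    if len(signal) < win_len:
        raise ValueError("signal is shorter than one analysis window")
    # Zero-pad each frame to the next power of two for the FFT.
    n_fft = 1
    while n_fft < win_len:
        n_fft *= 2
    # Slice the signal into overlapping frames; trailing samples that do
    # not fill a whole window are dropped.
    n_frames = 1 + (len(signal) - win_len) // shift
    idx = np.arange(win_len)[None, :] + shift * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(win_len)
    # Power spectrum, then mel-weighted energies, then log.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(mel_energies + eps)
```

Calling `log_mel_filterbanks(samples, shift_ms=5.0)` on a 1-D array of audio samples would give the 5 ms variant. Applying a DCT along the filter axis of the result would yield MFCC-like features, which is the step this preprocessing leaves out.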
Pretty much done, see https://github.com/nouiz/lisa_emotiw/wiki/06.-Emotion-Classification-from-Speech
Please close the ticket when done.
Use existing libraries for extracting spectral features from audio and compute these features on the challenge data.