flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

Add custom feature extraction #218

Closed dambaquyen96 closed 5 years ago

dambaquyen96 commented 5 years ago

I'm currently working on a Vietnamese speech dataset. Besides MFCC, pitch features are also very helpful for ASR in my language, and I want to implement them. How can I add a custom feature extraction to the wav2letter project? Is there an interface that would help me do this? And how would I enable it in training with flags (e.g. -mfcc, -mfsc, ...)?

vineelpratap commented 5 years ago

Hi, the feature extraction library is here. You would have to add a new file for pitch extraction, similar to https://git.io/fhN6G, which takes an input vector and extracts the required feature. Once this is done, you can tweak https://git.io/fhNy8 to make it work end to end.

We would gladly accept a PR on this since this will be helpful for many other tasks.
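
For illustration only, here is a minimal sketch of the kind of function described above: it takes a vector of audio samples and returns one feature value per frame. It uses a plain normalized-autocorrelation pitch estimate, which is an assumption on my part and not something the thread prescribes. `PitchParams` and `extractPitch` are hypothetical names, not part of the wav2letter feature library, and the defaults (16 kHz, 25 ms frames, 10 ms stride, 60-400 Hz search range) are placeholders.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical parameter struct for this sketch; not a wav2letter type.
struct PitchParams {
  int sampleRate = 16000;         // samples per second
  int frameSizeMs = 25;           // analysis window length
  int frameStrideMs = 10;         // hop between windows
  float minHz = 60.0f;            // lowest pitch searched
  float maxHz = 400.0f;           // highest pitch searched
  float voicingThreshold = 0.3f;  // peaks below this count as unvoiced
};

// Returns one pitch estimate (Hz) per frame; 0 marks an unvoiced frame.
// Hypothetical helper: autocorrelation peak picking, no smoothing.
std::vector<float> extractPitch(
    const std::vector<float>& input,
    const PitchParams& p) {
  assert(p.sampleRate > 0 && p.minHz > 0.0f && p.maxHz > p.minHz);
  const std::size_t frameLen =
      static_cast<std::size_t>(p.sampleRate) * p.frameSizeMs / 1000;
  const std::size_t stride =
      static_cast<std::size_t>(p.sampleRate) * p.frameStrideMs / 1000;
  // Lag range (in samples) corresponding to [maxHz, minHz].
  const std::size_t minLag = std::max<std::size_t>(
      1, static_cast<std::size_t>(p.sampleRate / p.maxHz));
  const std::size_t maxLag = static_cast<std::size_t>(p.sampleRate / p.minHz);

  std::vector<float> pitch;
  if (frameLen == 0 || stride == 0 || input.size() < frameLen ||
      maxLag >= frameLen) {
    return pitch;  // not enough samples to analyse
  }

  for (std::size_t start = 0; start + frameLen <= input.size();
       start += stride) {
    const float* x = input.data() + start;

    // Frame energy, used to normalize the autocorrelation.
    double energy = 0.0;
    for (std::size_t i = 0; i < frameLen; ++i) {
      energy += static_cast<double>(x[i]) * x[i];
    }

    double bestCorr = 0.0;
    std::size_t bestLag = 0;
    if (energy > 1e-10) {
      for (std::size_t lag = minLag; lag <= maxLag; ++lag) {
        double corr = 0.0;
        for (std::size_t i = 0; i + lag < frameLen; ++i) {
          corr += static_cast<double>(x[i]) * x[i + lag];
        }
        corr /= energy;
        if (corr > bestCorr) {
          bestCorr = corr;
          bestLag = lag;
        }
      }
    }

    const bool voiced = bestLag > 0 && bestCorr >= p.voicingThreshold;
    pitch.push_back(
        voiced ? static_cast<float>(p.sampleRate) / bestLag : 0.0f);
  }
  return pitch;
}
```

Hooking a function like this into the featurization code and exposing it through a training flag is not shown here, since that depends on the code at the links above.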

xuqiantong commented 5 years ago

@vineelpratap I don't think you want ppl to see something through fburl =p. Here is the valid link: https://github.com/facebookresearch/wav2letter/blob/master/src/data/Featurize.cpp#L83-L97.

dambaquyen96 commented 5 years ago

@vineelpratap @xuqiantong Thanks for your support ^^