flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

Python inference to convert wav to text #944

Closed AlexandrWh closed 3 years ago

AlexandrWh commented 3 years ago

Hi, I've built the Python bindings, and now I'd like to know how to run inference from wav to text. This tutorial doesn't give me useful information. I need a Python script that reads a .wav file and prints the recognized text using the wav2letter library.

tlikhomanenko commented 3 years ago

The Python bindings support only featurization and beam-search decoding, where the predictions from the network are supplied by the caller. They exist so that people who trained acoustic models with TensorFlow/PyTorch can reuse the beam-search decoding from Python.
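To make "predictions from the network are supplied by the caller" concrete, here is a minimal sketch of the decoding step in plain Python. It is not the wav2letter bindings API: it uses greedy CTC decoding as a simple stand-in for beam search, and the function name `ctc_greedy_decode`, the token set, and the emission scores are all hypothetical, chosen only for illustration. In a real setup the emissions would come from your own acoustic model.

```python
from typing import List, Sequence


def ctc_greedy_decode(
    emissions: Sequence[Sequence[float]],
    tokens: Sequence[str],
    blank: int = 0,
) -> str:
    """Greedy CTC decoding: per-frame argmax, collapse repeats, drop blanks.

    `emissions` has one row of token scores per audio frame. This is a
    simplified stand-in for beam-search decoding, not the wav2letter API.
    """
    # Index of the highest-scoring token in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in emissions]

    out: List[str] = []
    prev = None
    for idx in best_path:
        # Emit a token only when it differs from the previous frame's
        # prediction and is not the blank symbol.
        if idx != prev and idx != blank:
            out.append(tokens[idx])
        prev = idx
    return "".join(out)


# Hypothetical token set and hand-written emissions (assumptions for the demo).
TOKENS = ["<blank>", "a", "b", "|"]
EMISSIONS = [
    [0.10, 0.80, 0.05, 0.05],  # a
    [0.10, 0.80, 0.05, 0.05],  # a again, collapsed into the previous frame
    [0.90, 0.03, 0.03, 0.04],  # blank, separates the two a's
    [0.10, 0.70, 0.10, 0.10],  # a, a new symbol because a blank came before
    [0.10, 0.10, 0.70, 0.10],  # b
]

print(ctc_greedy_decode(EMISSIONS, TOKENS))
```

A beam-search decoder consumes the same kind of per-frame score matrix but keeps several candidate paths and can score them with a lexicon and a language model, which is why the acoustic model itself can live in any framework.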

wav2letter is written in C++, and models trained with it can be used for inference with the binary built from Decode.cpp.

Here we have a Colab example (all tutorials are here), based on the recent codebase, that shows how to run inference with a CTC acoustic model and an n-gram language model. It is a call to the binary in interactive mode: you pass the path to an audio file, the transcription is printed on the screen, and you can then pass the next file.