flashlight / wav2letter

Facebook AI Research's Automatic Speech Recognition Toolkit
https://github.com/facebookresearch/wav2letter/wiki

[How can I do live transcription from my microphone?] #938

Closed lematmat closed 3 years ago

lematmat commented 3 years ago

Hi all,

I can't figure out which Python script to use for live transcription from my microphone.

Regards,

lmm

Additional Context: wav2letter 0.2, Python 3.7


tlikhomanenko commented 3 years ago

Did you train the acoustic model with wav2letter? We don't have a Python script for inference, only a wrapper that runs inference by calling the C++ process from Python, and it works only if the model was trained in wav2letter.

Alternatively, for models trained in Python (PyTorch, TensorFlow), the beam-search decoder is available through Python bindings: you pass the network's predictions and the LM to the Python binding of the beam-search decoder.

Let me know what your use case is.

lematmat commented 3 years ago

OK, thank you. I've found the PyTorch bindings, but I can't find the TensorFlow ones?

tlikhomanenko commented 3 years ago

The bindings are not PyTorch bindings, they are Python bindings. Only the ASG criterion is written with PyTorch; the rest, such as audio featurization and the beam-search decoder, doesn't depend on any DNN framework. Please have a look at the example https://github.com/facebookresearch/flashlight/blob/master/bindings/python/example/decoder_example.py#L207 where the beam-search decoder runs on the predictions you pass in from the network, and the network could be TensorFlow, PyTorch, MXNet, or anything else.
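To illustrate the framework-agnostic hand-off described above, here is a minimal sketch in plain Python. It does not use the flashlight bindings and is not the beam-search decoder from the linked example: it swaps in a much simpler greedy best-path decode (CTC-style, with an assumed blank token `_`) purely to show that the decoder only needs per-frame scores, not the framework that produced them. The function name `greedy_decode`, the token set, and the scores are all hypothetical.

```python
def greedy_decode(emissions, tokens, blank="_"):
    """Greedy best-path decode (a stand-in, NOT the flashlight beam search).

    emissions: list of frames, each a list of per-token scores.
    tokens:    list of token strings, indexed like the score columns.
    blank:     assumed CTC-style blank token (hypothetical choice).
    """
    # Pick the highest-scoring token index in every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in emissions]

    decoded = []
    previous = None
    for index in best_path:
        # Collapse consecutive repeats and drop blanks.
        if index != previous and tokens[index] != blank:
            decoded.append(tokens[index])
        previous = index
    return "".join(decoded)


# Hypothetical network output: 5 frames x 4 tokens. With a real model these
# scores would come from the network, converted to nested lists or an array
# (for example via `tensor.tolist()` in PyTorch).
tokens = ["_", "a", "b", "c"]
emissions = [
    [0.10, 0.70, 0.10, 0.10],  # "a"
    [0.10, 0.70, 0.10, 0.10],  # "a" again, collapsed as a repeat
    [0.80, 0.10, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "b"
    [0.10, 0.10, 0.10, 0.70],  # "c"
]

print(greedy_decode(emissions, tokens))  # prints "abc"
```

The real decoder additionally takes a lexicon and a language model and keeps multiple hypotheses per frame; the sketch only mirrors the input contract, a frames-by-tokens score matrix that any framework can produce.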

lematmat commented 3 years ago

OK, thank you, that's what I was looking for!

tlikhomanenko commented 3 years ago

Feel free to open another issue if you have any questions or trouble running it.