NervanaSystems / deepspeech

DeepSpeech neon implementation
Apache License 2.0

Evaluating from a single .wav file #21

Closed: saurabhvyas closed this issue 7 years ago

saurabhvyas commented 7 years ago

Is there a way to predict text with a trained model from a single .wav file? I couldn't find this in the documentation.

tyler-nervana commented 7 years ago

The simplest way is to just create a custom manifest file with any .wav files and comparison transcripts that you have. Take a look at the ones you have generated for librispeech training and evaluation as a reference. If you do not have transcriptions for the file, you'll have to change things a bit more.

A manifest file that only loads audio data should have one filename per line. The dataloader config dictionary should then have type="audio". You can then iterate over your .wav files, use speech.utils.get_outputs to propagate the audio through the network, and use decoder.decode to compute the "arg-max" decoding.
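As a rough illustration of the two pieces above that do not depend on the repo's code, here is a minimal Python sketch. It writes an audio-only manifest (one filename per line) and shows what an "arg-max" decoding typically looks like, assuming CTC-style outputs with a blank symbol. The function names, the `manifest_filename` key, the alphabet, and the blank index are all assumptions made for the example. This is not the repo's `speech.utils.get_outputs` or `decoder.decode`, whose exact signatures should be checked in the source.

```python
def write_audio_manifest(wav_paths, manifest_path):
    """Write an audio-only manifest: one .wav filename per line."""
    with open(manifest_path, "w") as f:
        for path in wav_paths:
            f.write(path + "\n")
    return manifest_path


def make_dataloader_config(manifest_path):
    """Build a config dict with type="audio", as described above.

    The "manifest_filename" key is an assumed name used for illustration;
    check the dataloader documentation for the real key.
    """
    return {"type": "audio", "manifest_filename": manifest_path}


def argmax_decode(frame_probs, alphabet, blank_index=0):
    """Greedy ("arg-max") decoding, assuming CTC-style network outputs.

    frame_probs: one list of per-character probabilities per time step.
    alphabet:    characters indexed like the probabilities (assumed layout).
    Picks the most likely index at each time step, collapses consecutive
    repeats, and drops the blank symbol.
    """
    best = [max(range(len(frame)), key=frame.__getitem__)
            for frame in frame_probs]
    chars = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank_index:
            chars.append(alphabet[idx])
        prev = idx
    return "".join(chars)


if __name__ == "__main__":
    config = make_dataloader_config(
        write_audio_manifest(["my_recording.wav"], "single_file_manifest.txt"))
    print(config)

    # Toy probabilities standing in for real network outputs.
    toy_alphabet = ["_", "a", "b"]  # index 0 is the assumed blank symbol
    toy_probs = [
        [0.1, 0.8, 0.1],
        [0.1, 0.8, 0.1],
        [0.9, 0.05, 0.05],
        [0.1, 0.8, 0.1],
        [0.1, 0.1, 0.8],
    ]
    print(argmax_decode(toy_probs, toy_alphabet))
```

In practice the probabilities would come from propagating each .wav file through the trained network rather than from a hand-written list, and the alphabet and blank index must match what the model was trained with.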

I hope this helps!

saurabhvyas commented 7 years ago

That makes sense and gives me a good starting point, thanks! Hopefully this process will be simpler in the future (for example, a terminal command for it).

saurabhvyas commented 7 years ago

I just wanted to ask whether evaluation of a pre-trained model will work on a CPU-only system. (It works on a GPU machine.)

pankaj2701 commented 7 years ago

Yes, evaluation also works on a CPU-only system. Use the flag -b cpu.