How to infer a single wav file in speech2text?

GabrielLin commented 5 years ago

Could you please tell me how to infer a single wav file in speech2text? Thanks.

blisc commented 5 years ago

Add infer_params to your config file if it doesn't already exist. Inside infer_params, there is a parameter called vocab_file. Make sure it points to an infer CSV file.

Inside the infer CSV file, the first line should be a header line, followed by a line containing the path to your wav file. For example, your infer CSV file should look like this:

wav_filename, wav_filesize, transcript
/path/to/your/sound.wav, UNUSED, UNUSED
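If you would rather generate that file than type it by hand, here is a minimal sketch using only Python's standard csv module. The function name write_infer_csv and the output name infer.csv are made up for illustration; the column names and the UNUSED placeholders come from the example above. The sketch writes the fields without a space after each comma, which is an assumption about what a CSV reader would prefer, not something stated in this thread.

```python
import csv


def write_infer_csv(csv_path, wav_paths):
    """Write a header row followed by one row per wav file.

    The filesize and transcript columns are filled with the
    placeholder "UNUSED", as in the example above.
    """
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["wav_filename", "wav_filesize", "transcript"])
        for wav_path in wav_paths:
            writer.writerow([wav_path, "UNUSED", "UNUSED"])


if __name__ == "__main__":
    # Hypothetical output name; replace the wav path with your own file.
    write_infer_csv("infer.csv", ["/path/to/your/sound.wav"])
```

Passing several paths in the list gives one row per file, so the same helper covers a small batch as well as a single wav.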

Then run the model in infer mode:

python run.py --config_file=YOURCONFIGFILE --mode=infer --infer_output_file=output.txt
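If you want to launch the same command from a script, a rough sketch could look like the following. The helper names build_infer_command and run_inference are hypothetical; the three flags are the ones from the command above, and the sketch assumes it is run from the directory that contains run.py.

```python
import shlex
import subprocess
import sys


def build_infer_command(config_file, output_file="output.txt"):
    """Return the argument list for the infer-mode command shown above."""
    return [
        sys.executable,  # the Python interpreter currently in use
        "run.py",
        f"--config_file={config_file}",
        "--mode=infer",
        f"--infer_output_file={output_file}",
    ]


def run_inference(config_file, output_file="output.txt"):
    """Run the command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_infer_command(config_file, output_file), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_inference() to actually run it.
    print(shlex.join(build_infer_command("YOURCONFIGFILE")))
```

Building the command as a list avoids shell quoting problems if the config path contains spaces.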

GabrielLin commented 5 years ago

Thanks for your answer, @blisc.

ffxz commented 5 years ago

@blisc When I run inference on CPU, I get an error. Do you know how to deal with it? Thanks!

arashdehghani commented 5 years ago

Hi @blisc, were you able to run inference on CPU?

blisc commented 5 years ago

Inference on CPU works fine using Jasper on my machine. Do you have any details on why it fails?

ghost commented 4 years ago

Add infer_params to your config file if it doesn't already exist. Inside infer_params, there is a parameter called vocab_file. Make sure it points to an infer CSV file.

Inside the infer CSV file, the first line should be a header line, followed by a line containing the path to your wav file. For example, your infer CSV file should look like this:

wav_filename, wav_filesize, transcript
/path/to/your/sound.wav, UNUSED, UNUSED

Then run the model in infer mode:

python run.py --config_file=YOURCONFIGFILE --mode=infer --infer_output_file=output.txt

I tried following this, but got this error:

2 root error(s) found.
(0) Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [29]
[[node save_1/Assign_14 (defined at /OpenSeq2Seq/OpenSeq2Seq/open_seq2seq/utils/funcs.py:240) ]]
[[save_1/RestoreV2/_0]]
(1) Invalid argument: Assign requires shapes of both tensors to match. lhs shape= [3] rhs shape= [29]
[[node save_1/Assign_14 (defined at /OpenSeq2Seq/OpenSeq2Seq/open_seq2seq/utils/funcs.py:240) ]]

Could you explain more clearly? Thanks
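For anyone hitting the same traceback: the message reports that a variable in the graph built from the current config has shape [3], while the value stored in the checkpoint for it has shape [29]. One possible cause, offered purely as an assumption that nothing in this thread confirms, is that the size of that variable depends on how many entries are in the file that vocab_file points to. Under that assumption, a quick check is to count the lines of that file and compare the result with the vocabulary used at training time. The function name count_entries and the path below are hypothetical.

```python
def count_entries(path):
    """Count the non-empty lines in a text file.

    Only line endings are stripped, so a line holding a single
    space character still counts as an entry.
    """
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.rstrip("\r\n") != "")


if __name__ == "__main__":
    import sys

    # Usage: python count_entries.py /path/to/the/file/vocab_file/points/to
    if len(sys.argv) > 1:
        print(count_entries(sys.argv[1]))
```

If the count is much smaller than the vocabulary the model was trained with, the config used for inference probably differs from the training config in that setting.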

NVIDIA / OpenSeq2Seq

How to infer a single wav file in speech2text? #288