revdotcom / reverb

Open source inference code for Rev's model
https://www.rev.com
Apache License 2.0
297 stars, 17 forks

Phone call recording produces no output at all; what's going on? #15

Closed. njhouse365 closed this issue 4 days ago

njhouse365 commented 1 week ago

python3 wenet/bin/recognize_wav.py \
    --config /home/house365ai/xxm/reverb/asr/model/config.yaml \
    --checkpoint /home/house365ai/xxm/reverb/asr/model/reverb_asr_v1.pt \
    --audio /home/house365ai/xxm/reverb/asr/data/07.wav \
    --modes ctc_prefix_beam_search attention_rescoring \
    --gpu 7 \
    --verbatimicity 1.0 \
    --result_dir /home/house365ai/xxm/reverb/asr/result

jprobichaud commented 1 week ago

Is it possible for you to share the audio file?

njhouse365 commented 1 week ago

07.zip

njhouse365 commented 1 week ago

It's a phone call recording.

jprobichaud commented 1 week ago

Thanks. Your file is stereo and encoded with something that the current code doesn't handle explicitly. As a result, the extracted features are quite different from what you would get after converting the audio to mono.

That being said, the audio doesn't contain English speech, so even when I downmix to mono and process the file, all I get is a series of tokens. This model can only handle English for now.
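For anyone hitting the same stereo problem with English audio, here is a minimal sketch of the downmix step mentioned above. It is not code from this repository: the function name `downmix_to_mono` is hypothetical, and it assumes the input is an uncompressed 16-bit PCM WAV file, which may not be true of the file in this issue. Anything else would first need to be converted with an external audio tool.

```python
import array
import sys
import wave


def downmix_to_mono(src_path, dst_path):
    """Average all channels of a 16-bit PCM WAV file into a single mono channel.

    Hypothetical helper for illustration only; returns the input channel count.
    """
    with wave.open(src_path, "rb") as src:
        n_channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        raw = src.readframes(src.getnframes())

    # Assumption: 16-bit samples. Other widths are rejected rather than guessed at.
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM samples")

    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian

    # Samples are interleaved per frame (L, R, L, R, ...); take the
    # integer (floor) average of each frame.
    mono = array.array("h", (
        sum(samples[i:i + n_channels]) // n_channels
        for i in range(0, len(samples), n_channels)
    ))
    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(sample_width)
        dst.setframerate(frame_rate)
        dst.writeframes(mono.tobytes())
    return n_channels
```

The resulting mono file can then be passed to `--audio` in place of the original. Averaging the channels is only one way to downmix; for a phone recording with one speaker per channel, processing each channel separately might be the better choice.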

njhouse365 commented 4 days ago

Oh, thanks.