modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Summary of FunASR repeated-character issues #1416

Closed Maclaurin closed 7 months ago

Maclaurin commented 7 months ago

What is your question?

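The zip file below summarizes the repeated-character cases found in the transcriptions of five articles; the repeated characters are marked in bold red: output1.zip. Note: the Arabic numerals in the articles are the result of ITN (inverse text normalization).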

What have you tried?

What's your environment?

LauraGPT commented 7 months ago

Could you please upload the wav?

Maclaurin commented 7 months ago

output5_wav.zip: the wav files corresponding to the repeated characters in output5.html.

LauraGPT commented 7 months ago

I tested this example from your output5.html: "大概走了一个持续震荡上涨涨的一个行情" (wav: "asr_res_20240219_3.wav").

But in my test, the output is:

'啊,因为年初从年初走到四月底,大概就是走了一个呃持续震荡上涨的一个行情,从一千八百二走到了最高二零八零,对吧?'

Maybe you could update funasr and try it again.
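
Before retrying, it may help to confirm which versions are actually installed in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of funasr or modelscope.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the versions of the two packages discussed in this thread.
for name in ("funasr", "modelscope"):
    print(name, installed_version(name))
```

If the reported versions are older than the current releases, upgrading with `pip install -U funasr modelscope` and rerunning the same audio is one way to check whether the repeated characters persist.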

Maclaurin commented 7 months ago

We use modelscope together with funasr.
Versions: funasr 1.0.0, modelscope 1.12.0
Code:

    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks

    inference_pipeline = pipeline(
        task=Tasks.auto_speech_recognition,
        model='iic/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404',
        model_revision="v2.0.4",
        vad_model='iic/speech_fsmn_vad_zh-cn-16k-common-pytorch',
        vad_model_revision="v2.0.4",
        punc_model='iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
        punc_model_revision="v2.0.4")

    rec_result = inference_pipeline('/home/ubuntu/test_data_convert/RS000000000205806.mp3', hotword='')
    print(rec_result)

Examples of repeated characters:
‘大概就是一千九到一千九百之之间这么一价价位。然然后之之后的一段时间呢’
‘其实从呃这个五月月份触底之后呢’
See the Word document output.docx for details. Recording: RS000000000205806.zip
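
To count such cases across many transcripts instead of marking them by hand, a simple check for adjacent repeated characters could be used. This is a rough sketch under stated assumptions: the function name `find_repeated_chars` is hypothetical, only immediately repeated single CJK characters are detected, and legitimate reduplication in Chinese (for example 谢谢 or 看看) would be flagged as well, so the results still need manual review.

```python
import re

# Matches a CJK character that is immediately followed by the same character, e.g. "之之".
# Assumption: only adjacent single-character repeats are of interest.
_REPEAT = re.compile(r"([\u4e00-\u9fff])\1")


def find_repeated_chars(text):
    """Return a list of (index, char) pairs, one for each directly repeated CJK character."""
    return [(m.start(), m.group(1)) for m in _REPEAT.finditer(text)]


# Example strings copied from the comment above.
samples = [
    "大概就是一千九到一千九百之之间这么一价价位。然然后之之后的一段时间呢",
    "其实从呃这个五月月份触底之后呢",
]
for sample in samples:
    print(find_repeated_chars(sample))
```

Running the same function over the output of two funasr versions would give a quick, if crude, comparison of how often the problem occurs.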

Maclaurin commented 7 months ago

We are using the hotword version of the model.

Maclaurin commented 7 months ago

@LauraGPT