modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com
Other
6.8k stars, 720 forks

assert num_peak == len(char_list) + 1 # number of peaks is supposed to be number of tokens + 1 #731

Closed gz-d closed 1 year ago

gz-d commented 1 year ago

OS: Linux
Python version: 3.9.16
Torch version: 2.0.1
Torchaudio version: 2.0.2
Modelscope version: 1.7.0
Funasr version: 0.6.7
Model: speech_timestamp_prediction-v1-16k-offline
Command: bash infer.sh

Details: I am trying to use speech_timestamp_prediction-v1-16k-offline to predict timestamps for input audio, but I ran into the following error:

Traceback (most recent call last):
  File "/mnt/sgnfsmeta/guanzhideng/huiren_project/data_processing/paraformer/timestamp/infer.py", line 29, in <module>
    modelscope_infer(args)
  File "/mnt/sgnfsmeta/guanzhideng/huiren_project/data_processing/paraformer/timestamp/infer.py", line 16, in modelscope_infer
    inference_pipeline(audio_in=args.audio_in, text_in=args.text_in)
  File "/home/guanzdeng2/.conda/envs/paraformer/lib/python3.9/site-packages/modelscope/pipelines/audio/timestamp_pipeline.py", line 184, in __call__
    output = self.forward(self.audio_in, self.text_in, **kwargs)
  File "/home/guanzdeng2/.conda/envs/paraformer/lib/python3.9/site-packages/modelscope/pipelines/audio/timestamp_pipeline.py", line 300, in forward
    tp_result = self.run_inference(self.cmd, **kwargs)
  File "/home/guanzdeng2/.conda/envs/paraformer/lib/python3.9/site-packages/modelscope/pipelines/audio/timestamp_pipeline.py", line 307, in run_inference
    tp_result = self.funasr_infer_modelscope(
  File "/home/guanzdeng2/.conda/envs/paraformer/lib/python3.9/site-packages/funasr/bin/tp_inference_launch.py", line 148, in _forward
    ts_str, ts_list = ts_prediction_lfr6_standard(us_alphas[batch_id], us_cif_peak[batch_id], token,
  File "/home/guanzdeng2/.conda/envs/paraformer/lib/python3.9/site-packages/funasr/utils/timestamp_tools.py", line 39, in ts_prediction_lfr6_standard
    assert num_peak == len(char_list) + 1 # number of peaks is supposed to be number of tokens + 1
AssertionError

I tried printing num_peak and len(char_list) and found that some samples satisfy num_peak == len(char_list) + 1, while for others num_peak == len(char_list). What do I need to change in the data or the code?
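For anyone who wants to find the failing samples before inference aborts, here is a minimal diagnostic sketch. It is not FunASR's code: the helper names `count_peaks` and `check_alignment` are made up for illustration, and the peak threshold is an assumption. It only restates the condition from the assertion, that the number of peaks should equal the number of tokens plus one.

```python
from typing import Sequence

# Assumption: a frame counts as a peak when its value reaches this threshold.
PEAK_THRESHOLD = 1.0 - 1e-4


def count_peaks(cif_peak: Sequence[float], threshold: float = PEAK_THRESHOLD) -> int:
    """Count the frames whose value is at or above the threshold."""
    return sum(1 for value in cif_peak if value >= threshold)


def check_alignment(cif_peak: Sequence[float], char_list: Sequence[str]) -> str:
    """Return "ok" if num_peak == len(char_list) + 1, else a short report."""
    num_peak = count_peaks(cif_peak)
    expected = len(char_list) + 1
    if num_peak == expected:
        return "ok"
    return f"mismatch: {num_peak} peaks, expected {expected}"


if __name__ == "__main__":
    # Hypothetical peak values, not real model output.
    peaks = [0.1, 1.0, 0.2, 1.0, 0.3, 1.0]
    print(check_alignment(peaks, ["a", "b"]))       # 3 peaks, 2 tokens
    print(check_alignment(peaks, ["a", "b", "c"]))  # 3 peaks, 3 tokens
```

The second call reproduces the case described above, where num_peak equals len(char_list) and the assertion would fail.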

hnluo commented 1 year ago

Bug fixed; please update funasr. The demo for using the tp model with long audio has also been updated.
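After upgrading (typically with `pip install -U funasr`), a quick way to confirm that the environment actually picked up a newer build is to compare the installed version against the one from the report. The sketch below assumes the fix shipped in some release after 0.6.7, which the thread does not state explicitly; the helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

REPORTED_VERSION = "0.6.7"  # the funasr version from the report above


def release_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric part of a version string, e.g. "0.6.7" -> (0, 6, 7)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_newer(installed: str, reported: str = REPORTED_VERSION) -> bool:
    """True if the installed release is strictly newer than the reported one."""
    return release_tuple(installed) > release_tuple(reported)


def installed_funasr_version() -> Optional[str]:
    """Return the installed funasr version, or None if it is not installed."""
    try:
        return metadata.version("funasr")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_funasr_version()
    if version is None:
        print("funasr is not installed in this environment")
    elif is_newer(version):
        print(f"funasr {version} is newer than {REPORTED_VERSION}")
    else:
        print(f"funasr {version} is not newer than {REPORTED_VERSION}; try upgrading")
```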