FunAudioLLM / SenseVoice

Multilingual Voice Understanding Model
https://funaudiollm.github.io/
Other
2.61k stars 249 forks

Recognition language is set to Chinese, but Japanese and English often appear in the output. How can this be fixed? #72

Closed crystone123 closed 1 month ago

crystone123 commented 1 month ago

Notice: To resolve issues more efficiently, please raise issues following the template and include the relevant details.

❓ Questions and Help

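Following the official example code, I call the model.generate interface with the Chinese language parameter, language="zn". I am using FunASR's real-time (streaming) speech recognition approach, and Japanese and English text appears in the output.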

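The code is as follows:

    # Imports added for completeness; the original post did not show them.
    import os
    import time

    import soundfile
    from funasr import AutoModel
    from funasr.utils.postprocess_utils import rich_transcription_postprocess

    # `path`, `chunk_size`, and `post_processing` are defined elsewhere in the
    # poster's script and were not included in the post.
    model_dir = "iic/SenseVoiceSmall"
    model = AutoModel(
        model=model_dir,
        trust_remote_code=True,
    )
    files = os.listdir(path)
    store = []
    run_time = []
    for file in files:
        try:
            file_name = os.path.basename(file)
            st = time.time()
            chunk_stride = chunk_size[1] * 960  # 600ms
            speech, sample_rate = soundfile.read("voice_test/" + file_name)
            cache = {}
            total_chunk_num = int((len(speech) - 1) / chunk_stride + 1)
            res_str = ""
            for i in range(total_chunk_num):
                # Each 600 ms chunk is decoded on its own, with a fresh cache.
                speech_chunk = speech[i * chunk_stride:(i + 1) * chunk_stride]
                is_final = i == total_chunk_num - 1
                res = model.generate(
                    input=speech_chunk,
                    cache={},
                    language="zn",  # "zn", "en", "yue", "ja", "ko", "nospeech"
                    use_itn=False,
                    temperature=0.01,
                )
                final_res = post_processing(rich_transcription_postprocess(res[0]["text"]))
                run_time.append(time.time() - st)
                print(final_res)
                res_str = res_str + final_res
            print("Elapsed time: {}".format(time.time() - st))
        except Exception:
            print(f"An exception occurred: {file_name}")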

Does anyone know why this happens? Or does SenseVoice simply not support FunASR's real-time speech recognition? Thanks.

LauraGPT commented 1 month ago

SenseVoice is a non-streaming model. We will release a new streaming model in the future.
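
Given that reply, one possible workaround is to buffer the audio and decode each utterance in a single call, rather than decoding 600 ms chunks independently as the code in the question does. The following is only a minimal sketch of that buffering idea. The names `transcribe_whole_utterance` and `fake_recognize` are hypothetical and not part of FunASR or SenseVoice, and `recognize` stands for any callable that wraps the recognizer.

```python
from typing import Callable, Iterable, List, Sequence


def transcribe_whole_utterance(
    chunks: Iterable[Sequence[float]],
    recognize: Callable[[List[float]], str],
) -> str:
    """Buffer all chunks of one utterance, then run the recognizer once."""
    buffered: List[float] = []
    for chunk in chunks:
        # Accumulate samples instead of decoding each short chunk separately.
        buffered.extend(chunk)
    if not buffered:
        return ""  # nothing to decode
    return recognize(buffered)


# Stand-in recognizer for illustration only (hypothetical); a real one would
# wrap the model call from the question and return the decoded text.
def fake_recognize(samples: List[float]) -> str:
    return f"decoded {len(samples)} samples"


# Three chunks arriving over time, as a streaming front end might deliver them.
chunks = [[0.0] * 9600, [0.0] * 9600, [0.0] * 4800]
result = transcribe_whole_utterance(chunks, fake_recognize)
print(result)
```

With FunASR, `recognize` could be a thin wrapper around the `model.generate` call from the question, using the same keyword arguments. For audio that is already on disk, the simpler route is probably to pass the full `speech` array from `soundfile.read` to `model.generate` in one call and skip the chunk loop entirely.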