Open jlljill opened 3 months ago
Notice: In order to resolve issues more efficiently, please raise the issue following the template.
For the Chinese offline transcription service (CPU), how do I configure the speaker identification module? I don't see a speaker identification model option in the configuration.
```shell
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vocab272727-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --certfile 0 \
  --decoder-thread-num 32 \
  --io-thread-num 8 \
  --hotword /workspace/models/hotwords.txt > log.out 2>&1 &
```
Calling the model directly from Python already works:
```python
from funasr import AutoModel

# paraformer-zh is a multi-functional ASR model;
# use vad, punc, spk or not as you need
model = AutoModel(
    model="paraformer-zh", model_revision="v2.0.4",
    vad_model="fsmn-vad", vad_model_revision="v2.0.4",
    punc_model="ct-punc-c", punc_model_revision="v2.0.4",
    spk_model="cam++", spk_model_revision="v2.0.2",
)
# Use a raw string for the Windows path: "\U" in a plain string
# literal is an invalid unicode escape and raises a SyntaxError.
res = model.generate(
    input=r"E:\360MoveData\Users\jll\Desktop\test.wav",
    batch_size_s=300,
    hotword="",
)
print(res)
```
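Until the offline server exposes a speaker model option, one workaround is to post-process the Python API result. The sketch below assumes a result schema in which `res[0]["sentence_info"]` is a list of segments carrying `spk` (speaker index) and `text` fields; the exact keys may differ between FunASR versions, so treat this as illustrative rather than authoritative:

```python
# Minimal sketch: merge consecutive segments spoken by the same speaker
# into a readable, speaker-labeled transcript. The input schema
# ({"spk": ..., "text": ...}) is an assumption about the spk_model output.

def group_by_speaker(sentence_info):
    """Merge consecutive segments that share a speaker index."""
    merged = []  # list of (speaker_index, accumulated_text)
    for seg in sentence_info:
        spk, text = seg["spk"], seg["text"]
        if merged and merged[-1][0] == spk:
            # Same speaker as the previous segment: append the text.
            merged[-1] = (spk, merged[-1][1] + text)
        else:
            merged.append((spk, text))
    return [f"Speaker {spk}: {text}" for spk, text in merged]

# Mocked data in place of a real model result:
sample = [
    {"spk": 0, "text": "你好,"},
    {"spk": 0, "text": "请问现在几点?"},
    {"spk": 1, "text": "三点整。"},
]
print(group_by_speaker(sample))
# → ['Speaker 0: 你好,请问现在几点?', 'Speaker 1: 三点整。']
```

With a real result, the call would be `group_by_speaker(res[0]["sentence_info"])`, adjusted to whatever field names your FunASR version actually emits.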
We are working on it now.
Same question.
Same question. Roughly when will this be available?
When will the GPU version be released?
Is this usable now?
Is speaker identification not usable in the offline service? Can it only be done by calling the model directly?
@LauraGPT Is there any progress on this?