2023-06-09 11:20:34,255 - modelscope - INFO - PyTorch version 1.11.0+cu113 Found.
2023-06-09 11:20:34,260 - modelscope - INFO - Loading ast index from /mnt/workspace/.cache/modelscope/ast_indexer
2023-06-09 11:20:34,297 - modelscope - INFO - Loading done! Current index file version is 1.6.1, with md5 c661f1c586a773fd9e04a6031d0d6d1e and a total number of 849 components indexed
2023-06-09 11:20:35,745 - modelscope - INFO - Use user-specified model revision: v1.0.5
2023-06-09 11:20:36,044 - modelscope - INFO - initiate model from /mnt/workspace/.cache/modelscope/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch
2023-06-09 11:20:36,044 - modelscope - INFO - initiate model from location /mnt/workspace/.cache/modelscope/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch.
2023-06-09 11:20:36,045 - modelscope - INFO - initialize model from /mnt/workspace/.cache/modelscope/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch
2023-06-09 11:20:36,048 - modelscope - WARNING - No preprocessor field found in cfg.
2023-06-09 11:20:36,048 - modelscope - WARNING - No val key and type key found in preprocessor domain of configuration.json file.
2023-06-09 11:20:36,048 - modelscope - WARNING - Cannot find available config to build preprocessor at mode inference, current config: {'model_dir': '/mnt/workspace/.cache/modelscope/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch'}. trying to build by task and model information.
2023-06-09 11:20:36,048 - modelscope - WARNING - No preprocessor key ('generic-sv', 'speaker-diarization') found in PREPROCESSOR_MAP, skip building preprocessor.
2023-06-09 11:20:36,490 - modelscope - INFO - Use user-specified model revision: v1.2.2
2023-06-09 11:20:36,786 - modelscope - INFO - loading speaker verification model from /mnt/workspace/.cache/modelscope/damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch ...
2023-06-09 11:20:44,023 - modelscope - INFO - Speaker Diarization Processing: ['./2-16000.wav', './2-person-16000.wav'] ...
2023-06-09 11:20:44,023 (speaker_diarization_pipeline:234) INFO: Speaker Diarization Processing: ['./2-16000.wav', './2-person-16000.wav'] ...
/root/FunASR/funasr/models/encoder/resnet34_encoder.py:56: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
ilens = (ilens + 1) // self.stride
/opt/conda/lib/python3.8/site-packages/torch/nn/functional.py:3704: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
{'text': 'spk1 [(0.0, 70.32)]'}
runtime
funasr version
model
code
output
question
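After running the official example, I did not get the separated audio. I then explicitly set output_dir, but that had no effect either. I'm not sure what the cause is, and no error was reported.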
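
For reference, the only result in the log above is a text field of speaker labels with time ranges ({'text': 'spk1 [(0.0, 70.32)]'}). If the pipeline only returns these segments and does not write audio itself (an assumption, not confirmed here), the per-speaker audio could be cut from the input file afterwards. The sketch below uses only the Python standard library. The helper names parse_diarization_text and write_speaker_audio are hypothetical and not part of FunASR or ModelScope. The multi-speaker layout (one "spkN [(start, end), ...]" entry per line, times in seconds) is guessed from the single line shown above.

```python
import re
import wave

# Matches one "(start, end)" pair of non-negative decimal seconds.
SEGMENT_RE = re.compile(r"\(\s*([0-9.]+)\s*,\s*([0-9.]+)\s*\)")


def parse_diarization_text(text):
    """Parse lines like 'spk1 [(0.0, 70.32)]' into {speaker: [(start, end), ...]}.

    Assumed format: one speaker per line, label first, then a list of
    (start, end) tuples in seconds.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        speaker, _, rest = line.partition(" ")
        segments = [(float(s), float(e)) for s, e in SEGMENT_RE.findall(rest)]
        result.setdefault(speaker, []).extend(segments)
    return result


def write_speaker_audio(wav_path, segments, out_path):
    """Concatenate the (start, end) ranges of wav_path, in seconds, into out_path."""
    with wave.open(wav_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        with wave.open(out_path, "wb") as dst:
            # Copy channel count, sample width and rate; the frame count
            # in the header is corrected when dst is closed.
            dst.setparams(src.getparams())
            for start, end in segments:
                first = min(int(start * rate), total)
                count = max(0, int((end - start) * rate))
                src.setpos(first)
                # readframes returns fewer frames if the range runs past the end.
                dst.writeframes(src.readframes(count))
```

With the result above, parse_diarization_text(result['text']) gives {'spk1': [(0.0, 70.32)]}, and each speaker's list can be passed to write_speaker_audio together with the diarized input wav and an output path such as spk1.wav.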