modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Fine-tuning the damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch model: running finetune.py raises an error #1110

Open sunneam opened 8 months ago

sunneam commented 8 months ago

[image] Environment: Linux, python=3.9.0, torch=2.1.1, funasr=0.8.4, modelscope=1.9.1

hnluo commented 8 months ago

Please check the training data against the format reference (https://alibaba-damo-academy.github.io/FunASR/en/egs_modelscope/asr/TEMPLATE/README.html#finetune-with-your-data).

sunneam commented 8 months ago

[image] [image] The data looks fine.

LauraGPT commented 8 months ago

Please check the wav files carefully. First, confirm that every wav path exists. Second, confirm that every wav is longer than 25 ms.
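The two checks suggested above can be automated. The following is a minimal sketch, not part of FunASR: it assumes a Kaldi-style `wav.scp` file in which each line is `utt_id /path/to/file.wav`, and it uses only Python's standard `wave` module, so it handles uncompressed PCM wav files only. The function name `find_bad_wavs` and the `min_ms` parameter are hypothetical.

```python
import os
import wave


def find_bad_wavs(scp_path, min_ms=25.0):
    """Return a list of (utt_id, reason) for entries that fail the checks.

    Assumes each non-empty line of scp_path is "<utt_id> <wav_path>".
    """
    problems = []
    with open(scp_path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(maxsplit=1)
            if not parts:
                continue  # skip blank lines
            if len(parts) != 2:
                problems.append((parts[0], "malformed line"))
                continue
            utt_id, path = parts
            # Check 1: the wav path must exist.
            if not os.path.isfile(path):
                problems.append((utt_id, "missing file"))
                continue
            # Check 2: the duration must be longer than min_ms.
            try:
                with wave.open(path, "rb") as w:
                    duration_ms = 1000.0 * w.getnframes() / w.getframerate()
            except (wave.Error, EOFError, ZeroDivisionError) as exc:
                problems.append((utt_id, f"unreadable: {exc}"))
                continue
            if duration_ms <= min_ms:
                problems.append((utt_id, f"too short: {duration_ms:.1f} ms"))
    return problems
```

Calling `find_bad_wavs("wav.scp")` returns an empty list when every entry passes; otherwise each tuple names the utterance ID and the reason it was flagged.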

C-rawler commented 4 months ago

Please check the training data against the format reference (https://alibaba-damo-academy.github.io/FunASR/en/egs_modelscope/asr/TEMPLATE/README.html#finetune-with-your-data).

Hello, single-GPU training works fine for me, but multi-GPU training fails. My launch command is:

    CUDA_VISIBLE_DEVICES=2,3 python -m torch.distributed.launch --nproc_per_node 2 finetune.py

The error is:

    Task related config: error: unrecognized arguments: --local-rank=0
    ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 185479) of binary: /opt/conda/envs/modelscope/bin/python
    Traceback (most recent call last):
      File "/opt/conda/envs/modelscope/lib/python3.8/runpy.py", line 194, in _run_module_as_main
        return _run_code(code, main_globals, None,
      File "/opt/conda/envs/modelscope/lib/python3.8/runpy.py", line 87, in _run_code
        exec(code, run_globals)
      File "/opt/conda/envs/modelscope/lib/python3.8/site-packages/torch/distributed/launch.py", line 196, in <module>
        main()
      File "/opt/conda/envs/modelscope/lib/python3.8/site-packages/torch/distributed/launch.py", line 192, in main
        launch(args)
      File "/opt/conda/envs/modelscope/lib/python3.8/site-packages/torch/distributed/launch.py", line 177, in launch
        run(args)
      File "/opt/conda/envs/modelscope/lib/python3.8/site-packages/torch/distributed/run.py", line 785, in run
        elastic_launch(
      File "/opt/conda/envs/modelscope/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
        return launch_agent(self._config, self._entrypoint, list(args))
      File "/opt/conda/envs/modelscope/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
        raise ChildFailedError(
    torch.distributed.elastic.multiprocessing.errors.ChildFailedError

Could you explain what is causing this?
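One plausible reading of the message `unrecognized arguments: --local-rank=0`: starting with PyTorch 2.0, `python -m torch.distributed.launch` passes the rank to each worker as `--local-rank` (with a hyphen) rather than `--local_rank`, so an argparse parser that only declares the underscore form rejects it. The thread does not show which PyTorch version this commenter has or where the parser is defined, so treat this as an assumption. The sketch below is a hypothetical parser, not FunASR's actual code, that accepts both spellings and falls back to the `LOCAL_RANK` environment variable, which `torchrun` sets instead of passing a flag.

```python
import argparse
import os


def build_parser():
    # Hypothetical parser; the real training script's arguments are not shown in the thread.
    parser = argparse.ArgumentParser(description="hypothetical finetune script")
    parser.add_argument(
        "--local-rank", "--local_rank",  # accept both spellings
        dest="local_rank",
        type=int,
        # Fall back to the LOCAL_RANK environment variable, else 0.
        default=int(os.environ.get("LOCAL_RANK", 0)),
    )
    return parser


def parse_local_rank(argv=None):
    # parse_known_args ignores unrelated arguments instead of exiting with an error.
    args, _unknown = build_parser().parse_known_args(argv)
    return args.local_rank
```

If the parser cannot be edited, launching with `torchrun --nproc_per_node 2 finetune.py` avoids the extra command-line flag altogether, though whether finetune.py then picks up the rank from `LOCAL_RANK` is not confirmed by the thread.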