Error when fine-tuning the seaco_paraformer model on a single machine with multiple GPUs

modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

https://www.funasr.com

Error when fine-tuning the seaco_paraformer model on a single machine with multiple GPUs #1584

Closed zyjcsf closed 6 months ago

zyjcsf commented 6 months ago

Hello, I have a question: when fine-tuning the hotword model with funasr 1.0, single-machine multi-GPU training fails with the error described here. What could be the cause? Single-machine single-GPU training runs fine.

Environment: python==3.8.5 torch==2.0.1 funasr==1.0.19 GPU: 3090

Model used: https://www.modelscope.cn/models/damo/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary Reference example: https://github.com/alibaba-damo-academy/FunASR/blob/main/examples/industrial_data_pretraining/seaco_paraformer/finetune.sh

Code change: set the GPUs to two cards (a single card runs fine): export CUDA_VISIBLE_DEVICES="0,1"

Error message: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one

zyjcsf commented 6 months ago

For now, training runs after I set find_unused_parameters to True, but I don't know whether this affects the results.

R1ckShi commented 6 months ago

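> For now, training runs after I set find_unused_parameters to True, but I don't know whether this affects the results.

Thanks for the feedback. It will not affect the results; find_unused_parameters should be set to True.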
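For readers unfamiliar with the flag, here is a minimal sketch of what find_unused_parameters does in plain PyTorch DistributedDataParallel. It is not FunASR code: `TwoBranchModel` and `train_steps` are hypothetical names made up for illustration, and the single-process CPU "gloo" group is an assumption chosen so the sketch runs without GPUs. PyTorch raises the error quoted in this thread when some parameters receive no gradient in an iteration; with find_unused_parameters=True, DDP marks those parameters as ready instead of waiting for them.

```python
import tempfile

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP


class TwoBranchModel(nn.Module):
    """Hypothetical toy model: `extra` is skipped unless use_extra=True."""

    def __init__(self):
        super().__init__()
        self.main = nn.Linear(4, 2)
        self.extra = nn.Linear(4, 2)  # receives no gradient when skipped

    def forward(self, x, use_extra=False):
        out = self.main(x)
        if use_extra:
            out = out + self.extra(x)
        return out


def train_steps(find_unused_parameters, steps=3):
    """Run a few forward/backward passes under DDP; return the step count."""
    # Assumption: a single-process "gloo" group on CPU, using a file-based
    # rendezvous in a fresh temporary directory.
    store_dir = tempfile.mkdtemp()
    dist.init_process_group(
        backend="gloo",
        init_method=f"file://{store_dir}/ddp_store",
        rank=0,
        world_size=1,
    )
    try:
        model = DDP(
            TwoBranchModel(),
            find_unused_parameters=find_unused_parameters,
        )
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        done = 0
        for _ in range(steps):
            optimizer.zero_grad()
            # `extra` is never used here, so it gets no gradient.
            loss = model(torch.randn(8, 4)).sum()
            loss.backward()
            optimizer.step()
            done += 1
        return done
    finally:
        dist.destroy_process_group()


if __name__ == "__main__":
    print(train_steps(find_unused_parameters=True))
```

The flag only changes how DDP tracks which gradients to wait for, at the cost of an extra traversal of the autograd graph per iteration; it does not change the gradients of the parameters that are used.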

zihaozhu93 commented 3 months ago

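> For now, training runs after I set find_unused_parameters to True, but I don't know whether this affects the results.

Where do I change this parameter? I tried setting ++train_conf.find_unused_parameters=true in the finetune.sh script, but it still fails with a message saying find_unused_parameters=True needs to be set. The log does print ++train_conf.find_unused_parameters=true, and I went through the source code but still could not figure out where the problem is.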

SeniorGlassMaster commented 1 month ago

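> For now, training runs after I set find_unused_parameters to True, but I don't know whether this affects the results.

> Where do I change this parameter? I tried setting ++train_conf.find_unused_parameters=true in the finetune.sh script, but it still fails with a message saying find_unused_parameters=True needs to be set. The log does print ++train_conf.find_unused_parameters=true, and I went through the source code but still could not figure out where the problem is.

It is in the file FunASR/funasr/train_utils/trainer_ds.py (see the image below). The quick and dirty fix is to edit the code directly, or you can add it to the config file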