wenet-e2e / wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit
https://wenet-e2e.github.io/wenet/
Apache License 2.0
3.87k stars 1.03k forks

Error when training the paraformer model #2542

Closed didi222-lqq closed 1 month ago

didi222-lqq commented 1 month ago

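After running the following command (image), I replaced the dictionary with my own and changed the parameters in the training config file to match my dictionary (image).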

Running stage 0 of run.sh fails with the following error:

rank5: Traceback (most recent call last):
rank5:   File "/data/run01/scz0arv/lindaodi/wenet2/wenet/examples/aishell/paraformer/wenet/bin/train.py", line 178, in
rank5:   File "/HOME/scz0arv/.conda/envs/wenet2/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
rank5:     return f(*args, **kwargs)
rank5:   File "/data/run01/scz0arv/lindaodi/wenet2/wenet/examples/aishell/paraformer/wenet/bin/train.py", line 94, in main
rank5:     model, configs = init_model(args, configs)
rank5:   File "/data/run01/scz0arv/lindaodi/wenet2/wenet/wenet/utils/init_model.py", line 174, in init_model
rank5:     infos = load_checkpoint(model, args.checkpoint)
rank5:   File "/data/run01/scz0arv/lindaodi/wenet2/wenet/wenet/utils/checkpoint.py", line 31, in load_checkpoint
rank5:     missing_keys, unexpected_keys = model.load_state_dict(checkpoint,
rank5:   File "/HOME/scz0arv/.conda/envs/wenet2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2189, in load_state_dict
rank5:     raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
rank5: RuntimeError: Error(s) in loading state_dict for Paraformer:
rank5:   size mismatch for decoder.output_layer.weight: copying a param with shape torch.Size([8404, 512]) from checkpoint, the shape in current model is torch.Size([5077, 512]).
rank5:   size mismatch for decoder.output_layer.bias: copying a param with shape torch.Size([8404]) from checkpoint, the shape in current model is torch.Size([5077]).
rank5:   size mismatch for ctc.ctc_lo.weight: copying a param with shape torch.Size([8404, 512]) from checkpoint, the shape in current model is torch.Size([5077, 512]).
rank5:   size mismatch for embed.weight: copying a param with shape torch.Size([8404, 512]) from checkpoint, the shape in current model is torch.Size([5077, 512]).

Is there something else I still need to change?
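For readers hitting the same error: every parameter named in the traceback has a first dimension of 8404 in the checkpoint and 5077 in the current model, which suggests that only the vocabulary-sized layers disagree. The sketch below lists such mismatches from two plain dictionaries of shapes. It is an illustration, not part of wenet: `find_shape_mismatches` is a hypothetical helper, and `some.shared.layer.weight` is a made-up name standing in for a parameter that matches. With PyTorch, the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape], model_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both dictionaries whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and model_shape != ckpt_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes copied from the traceback, plus one hypothetical matching entry.
ckpt_shapes = {
    "decoder.output_layer.weight": (8404, 512),
    "decoder.output_layer.bias": (8404,),
    "ctc.ctc_lo.weight": (8404, 512),
    "embed.weight": (8404, 512),
    "some.shared.layer.weight": (512, 512),  # hypothetical name
}
model_shapes = {
    "decoder.output_layer.weight": (5077, 512),
    "decoder.output_layer.bias": (5077,),
    "ctc.ctc_lo.weight": (5077, 512),
    "embed.weight": (5077, 512),
    "some.shared.layer.weight": (512, 512),  # hypothetical name
}

mismatches = find_shape_mismatches(ckpt_shapes, model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If every mismatched entry differs only in the vocabulary-sized dimension, as here, the dictionary size is the likely cause.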

Mddct commented 1 month ago

The dictionary needs to be consistent with the model's dimensions.
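As a rough way to check this before training, the sketch below counts the entries in a dictionary and compares the count with the vocabulary size the checkpoint expects (8404 according to the traceback, against 5077 for the custom dictionary). It assumes the dictionary holds one `token id` pair per line, which the thread does not confirm. `count_dict_entries` and `vocab_matches` are hypothetical helpers, not wenet functions, and the toy tokens are placeholders.

```python
from typing import Iterable


def count_dict_entries(lines: Iterable[str]) -> int:
    """Count non-empty lines, assuming one `token id` pair per line."""
    return sum(1 for line in lines if line.strip())


def vocab_matches(dict_size: int, checkpoint_vocab_size: int) -> bool:
    """True when the dictionary size equals the checkpoint's vocabulary size."""
    return dict_size == checkpoint_vocab_size


# Toy dictionary for illustration only; a real file could be passed as
# an open file object: open(path, encoding="utf-8").
toy_dict = ["<blank> 0", "<unk> 1", "你 2", "好 3", ""]
toy_size = count_dict_entries(toy_dict)
print(f"toy dictionary entries: {toy_size}")

# Sizes reported in the traceback: 5077 (current model) vs 8404 (checkpoint).
print(f"sizes match: {vocab_matches(5077, 8404)}")
```

When the sizes differ, loading the checkpoint into the model fails exactly as in the traceback, because the output, CTC, and embedding layers are all sized by the vocabulary.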

didi222-lqq @.***> wrote on Wednesday, May 29, 2024 at 11:57:

Closed #2542 https://github.com/wenet-e2e/wenet/issues/2542 as completed.
