yeyupiaoling / MASR

A PyTorch implementation of a streaming and non-streaming automatic speech recognition framework, supporting both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models, along with multiple data augmentation methods.
Apache License 2.0
563 stars 100 forks

Export fails after downloading the squeezeformer best model #70

Closed DannyWang920 closed 5 months ago

DannyWang920 commented 7 months ago

[2023-11-29 16:12:01 INFO ] utils:print_arguments:15 - ----------- Extra configuration parameters -----------
[2023-11-29 16:12:01 INFO ] utils:print_arguments:17 - configs: configs/squeezeformer.yml
[2023-11-29 16:12:01 INFO ] utils:print_arguments:17 - resume_model: models/squeezeformer_streaming_fbank/best_model/
[2023-11-29 16:12:01 INFO ] utils:print_arguments:17 - save_model: models/
[2023-11-29 16:12:01 INFO ] utils:print_arguments:17 - save_quant: False
[2023-11-29 16:12:01 INFO ] utils:print_arguments:17 - use_gpu: True
[2023-11-29 16:12:01 INFO ] utils:print_arguments:18 - ------------------------------------------------
[2023-11-29 16:12:01 INFO ] utils:print_arguments:20 - ----------- Configuration file parameters -----------
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - ctc_beam_search_decoder_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - alpha: 2.2
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - beam_size: 300
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - beta: 4.3
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - cutoff_prob: 0.99
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - cutoff_top_n: 40
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - language_model_path: lm/zh_giga.no_cna_cmn.prune01244.klm
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - num_processes: 10
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - dataset_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - batch_size: 16
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - dataset_vocab: dataset/vocabulary.txt
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - manifest_type: txt
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - max_duration: 20
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - mean_istd_path: dataset/mean_istd.json
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - min_duration: 0.5
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - noise_manifest_path: dataset/manifest.noise
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - num_workers: 4
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - prefetch_factor: 2
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - test_manifest: dataset/manifest.test
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - train_manifest: dataset/manifest.train
[2023-11-29 16:12:01 INFO ] utils:print_arguments:32 - decoder: ctc_beam_search
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - decoder_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - attention_heads: 4
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - linear_units: 1024
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - num_blocks: 3
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - positional_dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - r_num_blocks: 3
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - self_attention_dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - src_attention_dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - encoder_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - activation_type: swish
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - adaptive_scale: True
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - attention_dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - attention_heads: 4
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - cnn_module_kernel: 31
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - encoder_dim: 256
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - feed_forward_dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - feed_forward_expansion_factor: 8
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - input_dropout_rate: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - normalize_before: False
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - num_blocks: 12
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - output_size: 256
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - pos_enc_layer_type: rel_pos
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - recover_idx: 11
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - reduce_idx: 5
[2023-11-29 16:12:01 INFO ] utils:print_arguments:32 - metrics_type: cer
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - model_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - ctc_weight: 0.3
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - length_normalized_loss: False
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - lsm_weight: 0.1
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - reverse_weight: 0.3
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - optimizer_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - learning_rate: 0.001
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - optimizer: AdamW
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - scheduler: NoamHoldAnnealing
[2023-11-29 16:12:01 INFO ] utils:print_arguments:26 - scheduler_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:28 - decay_rate: 1.0
[2023-11-29 16:12:01 INFO ] utils:print_arguments:28 - hold_ratio: 0.3
[2023-11-29 16:12:01 INFO ] utils:print_arguments:28 - max_steps: 87840
[2023-11-29 16:12:01 INFO ] utils:print_arguments:28 - min_lr: 1e-05
[2023-11-29 16:12:01 INFO ] utils:print_arguments:28 - warmup_ratio: 0.2
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - weight_decay: 4e-05
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - preprocess_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - feature_method: fbank
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - n_mels: 80
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - n_mfcc: 40
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - sample_rate: 16000
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - target_dB: -20
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - use_dB_normalization: True
[2023-11-29 16:12:01 INFO ] utils:print_arguments:32 - streaming: True
[2023-11-29 16:12:01 INFO ] utils:print_arguments:23 - train_conf:
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - accum_grad: 8
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - enable_amp: False
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - grad_clip: 5.0
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - log_interval: 100
[2023-11-29 16:12:01 INFO ] utils:print_arguments:30 - max_epoch: 200
[2023-11-29 16:12:01 INFO ] utils:print_arguments:32 - use_model: squeezeformer
[2023-11-29 16:12:01 INFO ] utils:print_arguments:33 - ------------------------------------------------
[2023-11-29 16:12:01 WARNING] trainer:init:65 - Windows does not support multi-threaded data loading; it has been disabled automatically!
Traceback (most recent call last):
  File "E:/ASR/MASR-develop/export_model.py", line 22, in <module>
    trainer.export(save_model_path=args.save_model,
  File "E:\ASR\MASR-develop\masr\trainer.py", line 618, in export
    self.model.load_state_dict(model_state_dict)
  File "D:\conda\envs\MASR\lib\site-packages\torch\nn\modules\module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SqueezeformerModel:
    size mismatch for encoder.encoders.0.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.0.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.0.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.0.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.0.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.0.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.1.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.1.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.1.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.1.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.1.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.1.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.2.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.2.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.2.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.2.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.2.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.2.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.3.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.3.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.3.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.3.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.3.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.3.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.4.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.4.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.4.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.4.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.4.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.4.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.5.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.5.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.5.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.5.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.5.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.5.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.6.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.6.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.6.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.6.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.6.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.6.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.7.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.7.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.7.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.7.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.7.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.7.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.8.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.8.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.8.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.8.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.8.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.8.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.9.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.9.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.9.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.9.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.9.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.9.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.10.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.10.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.10.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.10.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.10.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.10.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.11.ffn1.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.11.ffn1.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.11.ffn1.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.encoders.11.ffn2.w_1.weight: copying a param with shape torch.Size([1024, 256]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
    size mismatch for encoder.encoders.11.ffn2.w_1.bias: copying a param with shape torch.Size([1024]) from checkpoint, the shape in current model is torch.Size([2048]).
    size mismatch for encoder.encoders.11.ffn2.w_2.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
    size mismatch for encoder.time_reduction_layer.dw_conv.weight: copying a param with shape torch.Size([256, 256, 5, 1]) from checkpoint, the shape in current model is torch.Size([256, 1, 1]).
    size mismatch for encoder.time_reduction_layer.pw_conv.weight: copying a param with shape torch.Size([256, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([256, 256, 1]).

Process finished with exit code 1
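
Every mismatch above follows the same pattern: the checkpoint's FFN layers have hidden size 1024, while the model built from configs/squeezeformer.yml expects 2048. A minimal sketch of how such mismatches can be listed before calling load_state_dict; the helper function and the toy modules below are illustrative, not part of MASR:

import torch.nn as nn

def report_shape_mismatches(checkpoint_sd, model):
    """Print every parameter whose shape differs between a checkpoint
    state_dict and the model that is about to load it."""
    model_sd = model.state_dict()
    for name, tensor in checkpoint_sd.items():
        if name not in model_sd:
            print(f'{name}: present in checkpoint, missing in model')
        elif model_sd[name].shape != tensor.shape:
            print(f'{name}: checkpoint {tuple(tensor.shape)} '
                  f'vs model {tuple(model_sd[name].shape)}')

# Toy reproduction of the 1024-vs-2048 mismatch in the log: an FFN trained
# with hidden size 1024 cannot load into one built with hidden size 2048.
trained = nn.Sequential(nn.Linear(256, 1024), nn.Linear(1024, 256))
current = nn.Sequential(nn.Linear(256, 2048), nn.Linear(2048, 256))
report_shape_mismatches(trained.state_dict(), current)

Run against the real checkpoint and the model trainer.py builds, this would print the same mismatches the traceback reports, without raising.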

yeyupiaoling commented 6 months ago

This happens because the model structure used for the earlier training run is different, so loading the parameter weights fails. I have already provided the exported model, which you can use directly. If you want to export it yourself, the only option is to retrain the model.
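
The shapes in the log bear this out: with encoder_dim 256, the current config's feed_forward_expansion_factor of 8 builds FFN layers of hidden size 256 × 8 = 2048, while the downloaded checkpoint was evidently trained with hidden size 1024, i.e. an expansion factor of 4; the time_reduction_layer mismatches likewise point to a structural change in the encoder code, so adjusting the config alone may not be enough. A hedged sketch for reading the trained factor straight out of a checkpoint (the file name and dict layout are assumptions; MASR's actual best_model layout may differ):

import torch

# Assumed checkpoint path and layout -- adjust to however MASR saves best_model.
ckpt = torch.load('models/squeezeformer_streaming_fbank/best_model/model.pth',
                  map_location='cpu')
state_dict = ckpt.get('state_dict', ckpt)  # unwrap if the weights are nested

# Key taken from the error message above.
w = state_dict['encoder.encoders.0.ffn1.w_1.weight']  # shape: (hidden, encoder_dim)
hidden, encoder_dim = w.shape
print(f'encoder_dim={encoder_dim}, ffn hidden={hidden}, '
      f'implied feed_forward_expansion_factor={hidden // encoder_dim}')
# For this checkpoint: encoder_dim=256, hidden=1024 -> factor 4,
# not the 8 in configs/squeezeformer.yml.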