PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

[S2T] FatalError: Erroneous arithmetic operation is detected by the operating system #2711

Closed navy7913 closed 1 year ago

navy7913 commented 1 year ago

Hi, I followed the official s2t installation tutorial and set up the medium training environment. When I run bash run.sh --stage 1 --stop_stage 1 under examples/aishell/asr1, the error below appears. Could you advise how to resolve it?

Training environment: Ubuntu 18.04.6, CUDA 10.2, cuDNN 7.6.5, paddlepaddle 2.3.1, paddlepaddle-gpu 2.4.0rc0

LAUNCH INFO 2022-11-30 21:07:29,721 ------------------------- ERROR LOG DETAIL -------------------------
INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 0x7f3ebb882950>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}.
2022-11-30 21:07:28.175 | INFO | paddlespeech.s2t.training.optimizer:from_args:120 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)
2022-11-30 21:07:28.175 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler!
2022-11-30 21:07:28.176 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch!
2022-11-30 21:07:28.590 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams
2022-11-30 21:07:28.590 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt
2022-11-30 21:07:28.591 | INFO | paddlespeech.s2t.exps.u2.model:do_train:161 - Train Total Examples: 15013
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)


C++ Traceback (most recent call last):

0 arange_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place)
1 paddle::experimental::arange(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place const&)
2 void phi::ArangeKernel<long, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor*)


Error Message Summary:

FatalError: Erroneous arithmetic operation is detected by the operating system.
[TimeInfo: Aborted at 1669813648 (unix time) try "date -d @1669813648" if you are using GNU date]
[SignalInfo: SIGFPE (@0x7f3f596b116d) received by PID 11056 (TID 0x7f3fd123c200) from PID 1500189037]
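The traceback above ends in phi::ArangeKernel with a SIGFPE, which on x86 usually means an integer division by zero (or INT_MIN / -1 overflow) rather than a floating-point error. The following is not PaddlePaddle's actual kernel code — just a hypothetical sketch of the length computation an arange-style kernel performs, showing why a zero step would hit a hardware divide:

```python
def arange_length(start: int, stop: int, step: int) -> int:
    """Number of elements an arange-style kernel would produce.

    The kernel computes ceil((stop - start) / step) with integer math;
    a zero step means a division by zero, which the OS reports as SIGFPE
    when it happens in native (C++/CUDA) code.
    """
    if step == 0:
        # Python raises here; native code would trap with SIGFPE instead.
        raise ZeroDivisionError("step must be nonzero")
    # Ceiling division via floor division: ceil(a/b) == -(-a // b)
    return max(0, -(-(stop - start) // step))

print(arange_length(0, 10, 2))  # → 5
```

So a corrupted shape or length tensor reaching arange as a zero step is one plausible way to get this exact crash signature.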

zxcd commented 1 year ago

Could you provide the complete log? So far we can only see warnings, which makes it hard to pinpoint where the problem occurs.

zh794390558 commented 1 year ago

Why are two versions of Paddle installed? paddlepaddle 2.3.1 and paddlepaddle-gpu 2.4.0rc0
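For context: the CPU wheel (paddlepaddle) and the GPU wheel (paddlepaddle-gpu) both install the same paddle package, so having both lets one overwrite the other's files. A typical cleanup is sketched below; the exact wheel version to reinstall is an assumption here and should be taken from the official install guide for your CUDA 10.2 setup:

```shell
# Show every installed paddle wheel; seeing both core packages is a red flag.
pip list | grep -i paddle

# Remove both conflicting wheels, then install a single GPU build.
pip uninstall -y paddlepaddle paddlepaddle-gpu
pip install paddlepaddle-gpu==2.4.0rc0
```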

navy7913 commented 1 year ago

> Could you provide the complete log? So far we can only see warnings, which makes it hard to pinpoint where the problem occurs.

2022-12-01 22:53:04.711 | DEBUG | paddlespeech.s2t::41 - register user softmax to paddle, remove this when fixed!
2022-12-01 22:53:04.711 | DEBUG | paddlespeech.s2t::45 - register user log_softmax to paddle, remove this when fixed!
2022-12-01 22:53:04.711 | DEBUG | paddlespeech.s2t::49 - register user sigmoid to paddle, remove this when fixed!
2022-12-01 22:53:04.711 | DEBUG | paddlespeech.s2t::53 - register user log_sigmoid to paddle, remove this when fixed!
2022-12-01 22:53:04.711 | DEBUG | paddlespeech.s2t::57 - register user relu to paddle, remove this when fixed!
2022-12-01 22:53:04.711 | DEBUG | paddlespeech.s2t::67 - override cat of paddle if exists or register, remove this when fixed!
2022-12-01 22:53:04.712 | DEBUG | paddlespeech.s2t::89 - override long of paddle.Tensor if exists or register, remove this when fixed!
2022-12-01 22:53:04.712 | DEBUG | paddlespeech.s2t::111 - override new_full of paddle.Tensor if exists or register, remove this when fixed!
2022-12-01 22:53:04.712 | DEBUG | paddlespeech.s2t::123 - override contiguous of paddle.Tensor if exists or register, remove this when fixed!
2022-12-01 22:53:04.712 | DEBUG | paddlespeech.s2t::134 - register user view to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.712 | DEBUG | paddlespeech.s2t::145 - register user view_as to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.712 | DEBUG | paddlespeech.s2t::186 - register user masked_fill to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::205 - register user maskedfill to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::229 - register user repeat to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::235 - register user softmax to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::240 - register user sigmoid to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::244 - register user relu to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::254 - register user type_as to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::270 - register user to to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.713 | DEBUG | paddlespeech.s2t::281 - register user float to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:04.714 | DEBUG | paddlespeech.s2t::291 - register user int to paddle.Tensor, remove this when fixed!
2022-12-01 22:53:07.252 | INFO | paddlespeech.s2t.utils.utility:all_version:45 - Deps Module Version:[('python', '3.7.15 (default, Nov 7 2022, 22:00:21) \n[GCC 11.2.0]'), ('paddle', '2.4.0-rc0'), ('paddle_commit', '083853cd4e4a9bdad22c70fa48eb9a036d2def27'), ('soundfile', '0.11.0')]
2022-12-01 22:53:07.252 | INFO | paddlespeech.s2t.training.trainer:init:116 - Rank: 0/1
2022-12-01 22:53:08.981 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq
2022-12-01 22:53:09.271 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 120098
2022-12-01 22:53:09.316 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 15013
2022-12-01 22:53:09.487 | WARNING | paddlespeech.s2t.io.reader:init:76 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True))
2022-12-01 22:53:09.641 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq
2022-12-01 22:53:09.658 | INFO |
paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 14326 2022-12-01 22:53:09.662 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 1791 2022-12-01 22:53:09.667 | WARNING | paddlespeech.s2t.io.reader:init:76 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-01 22:53:09.667 | INFO | paddlespeech.s2t.exps.u2.model:setup_dataloader:233 - Setup train/valid Dataloader! 2022-12-01 22:53:09.668 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:901 - U2 Encoder type: transformer 2022-12-01 22:53:09.801 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:913 - U2 Decoder type: transformer 2022-12-01 22:53:09.910 | DEBUG | paddlespeech.s2t.modules.loss:init:41 - CTCLoss Loss reduction: sum, div-bs: True 2022-12-01 22:53:09.910 | DEBUG | paddlespeech.s2t.modules.loss:init:42 - CTCLoss Grad Norm Type: None 2022-12-01 22:53:09.911 | DEBUG | paddlespeech.s2t.modules.loss:init:74 - CTCLoss() kwargs:{'norm_by_times': False}, not support: {'norm_by_batchsize': False, 'norm_by_total_logits_len': False} 2022-12-01 22:53:09.914 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:259 - U2Model( (encoder): TransformerEncoder( (embed): Conv2dSubsampling4( (pos_enc): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) (conv): Sequential( (0): Conv2D(1, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (1): ReLU() (2): Conv2D(256, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (3): ReLU() ) (out): Sequential( (0): Linear(in_features=4864, out_features=256, dtype=float32) ) ) 
(after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12) (encoders): LayerList( (0): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (1): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (2): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): 
Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (3): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (4): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, 
out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (5): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (6): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) 
(feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (7): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (8): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, 
mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (9): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (10): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): 
LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (11): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) ) ) (decoder): TransformerDecoder( (embed): Sequential( (0): Embedding(4233, 256, sparse=False) (1): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) ) (after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12) (output_layer): Linear(in_features=256, out_features=4233, dtype=float32) (decoders): LayerList( (0): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, 
dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (1): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) 
(norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (2): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (3): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, 
dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (4): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): 
Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (5): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) ) ) (ctc): CTCDecoderBase( (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) 
(ctc_lo): Linear(in_features=256, out_features=4233, dtype=float32) (criterion): CTCLoss( (loss): CTCLoss() ) ) (criterion_att): LabelSmoothingLoss( (criterion): KLDivLoss() ) ) 2022-12-01 22:53:09.915 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.weight | [256, 1, 3, 3] | 2304 | True 2022-12-01 22:53:09.915 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.bias | [256] | 256 | True 2022-12-01 22:53:09.915 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.weight | [256, 256, 3, 3] | 589824 | True 2022-12-01 22:53:09.916 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.bias | [256] | 256 | True 2022-12-01 22:53:09.916 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.weight | [4864, 256] | 1245184 | True 2022-12-01 22:53:09.916 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.bias | [256] | 256 | True 2022-12-01 22:53:09.916 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.weight | [256] | 256 | True 2022-12-01 22:53:09.917 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.bias | [256] | 256 | True 2022-12-01 22:53:09.917 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:09.918 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:09.918 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:09.918 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:09.919 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
2022-12-01 22:53:09 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - (parameter listing, condensed; each log line reports name | shape | size | trainable)

Every encoder layer (encoder.encoders.0 through encoder.encoders.11) logs the same parameter shapes:

self_attn.linear_q.weight | [256, 256] | 65536 | True
self_attn.linear_q.bias | [256] | 256 | True
self_attn.linear_k.weight | [256, 256] | 65536 | True
self_attn.linear_k.bias | [256] | 256 | True
self_attn.linear_v.weight | [256, 256] | 65536 | True
self_attn.linear_v.bias | [256] | 256 | True
self_attn.linear_out.weight | [256, 256] | 65536 | True
self_attn.linear_out.bias | [256] | 256 | True
feed_forward.w_1.weight | [256, 2048] | 524288 | True
feed_forward.w_1.bias | [2048] | 2048 | True
feed_forward.w_2.weight | [2048, 256] | 524288 | True
feed_forward.w_2.bias | [256] | 256 | True
norm1.weight | [256] | 256 | True
norm1.bias | [256] | 256 | True
norm2.weight | [256] | 256 | True
norm2.bias | [256] | 256 | True
concat_linear.weight | [512, 256] | 131072 | True
concat_linear.bias | [256] | 256 | True

Decoder embedding and output parameters:

decoder.embed.0.weight | [4233, 256] | 1083648 | True
decoder.after_norm.weight | [256] | 256 | True
decoder.after_norm.bias | [256] | 256 | True
decoder.output_layer.weight | [256, 4233] | 1083648 | True
decoder.output_layer.bias | [4233] | 4233 | True

Each decoder layer (decoder.decoders.0, decoder.decoders.1, …) logs self_attn.linear_{q,k,v,out} and src_attn.linear_{q,k,v,out} (weights [256, 256], biases [256]), feed_forward.w_1 (weight [256, 2048], bias [2048]) and feed_forward.w_2 (weight [2048, 256], bias [256]), norm1/norm2/norm3 (weight and bias [256] each), and concat_linear1/concat_linear2 (weights [512, 256], biases [256]). The pasted listing is truncated after decoder.decoders.1.feed_forward.w_1.bias.
2022-12-01 22:53:10.009 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-01 22:53:10.009 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_2.bias | [256] | 256 | True 2022-12-01 22:53:10.010 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm1.weight | [256] | 256 | True 2022-12-01 22:53:10.010 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm1.bias | [256] | 256 | True 2022-12-01 22:53:10.010 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm2.weight | [256] | 256 | True 2022-12-01 22:53:10.011 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm2.bias | [256] | 256 | True 2022-12-01 22:53:10.011 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm3.weight | [256] | 256 | True 2022-12-01 22:53:10.011 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm3.bias | [256] | 256 | True 2022-12-01 22:53:10.012 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.012 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear1.bias | [256] | 256 | True 2022-12-01 22:53:10.013 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.013 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear2.bias | [256] | 256 | True 2022-12-01 22:53:10.013 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.014 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.2.self_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.014 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.014 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.015 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.015 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.015 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.016 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.016 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.016 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.017 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.017 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.017 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.018 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.018 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 
- decoder.decoders.2.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.018 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.019 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-01 22:53:10.019 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-01 22:53:10.019 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-01 22:53:10.020 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.bias | [256] | 256 | True 2022-12-01 22:53:10.020 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.weight | [256] | 256 | True 2022-12-01 22:53:10.020 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.bias | [256] | 256 | True 2022-12-01 22:53:10.021 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.weight | [256] | 256 | True 2022-12-01 22:53:10.021 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.bias | [256] | 256 | True 2022-12-01 22:53:10.021 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.weight | [256] | 256 | True 2022-12-01 22:53:10.022 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.bias | [256] | 256 | True 2022-12-01 22:53:10.022 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.022 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.bias | [256] | 256 | True 2022-12-01 22:53:10.023 | 
INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.023 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.bias | [256] | 256 | True 2022-12-01 22:53:10.023 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.024 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.024 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.025 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.025 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.025 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.026 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.026 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.026 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.027 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.027 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 
22:53:10.027 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.028 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.028 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.028 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.029 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.029 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-01 22:53:10.029 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-01 22:53:10.030 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-01 22:53:10.030 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.bias | [256] | 256 | True 2022-12-01 22:53:10.030 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.weight | [256] | 256 | True 2022-12-01 22:53:10.031 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.bias | [256] | 256 | True 2022-12-01 22:53:10.031 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.weight | [256] | 256 | True 2022-12-01 22:53:10.031 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.bias | [256] | 256 | True 2022-12-01 22:53:10.032 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.weight | [256] | 256 | True 2022-12-01 22:53:10.032 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.bias | [256] | 256 | True 2022-12-01 22:53:10.032 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.033 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.bias | [256] | 256 | True 2022-12-01 22:53:10.033 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.033 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.bias | [256] | 256 | True 2022-12-01 22:53:10.034 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.034 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.034 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.035 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.035 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.035 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.036 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.036 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.036 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.037 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.037 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.037 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.038 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.038 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.038 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.039 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.039 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-01 22:53:10.039 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-01 22:53:10.040 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-01 22:53:10.040 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.bias | [256] | 256 | True 2022-12-01 22:53:10.040 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.weight | [256] | 256 | True 2022-12-01 22:53:10.041 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.bias | [256] | 256 | True 2022-12-01 22:53:10.041 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.weight | [256] | 256 | True 2022-12-01 22:53:10.041 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.bias | [256] | 256 | True 2022-12-01 22:53:10.042 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.weight | [256] | 256 | True 2022-12-01 22:53:10.042 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.bias | [256] | 256 | True 2022-12-01 22:53:10.042 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.043 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.bias | [256] | 256 | True 2022-12-01 22:53:10.043 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.043 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.bias | [256] | 256 | True 2022-12-01 22:53:10.044 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.044 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.044 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.045 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.bias | 
[256] | 256 | True 2022-12-01 22:53:10.045 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.045 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.046 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.046 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.046 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.047 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.bias | [256] | 256 | True 2022-12-01 22:53:10.047 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.047 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.bias | [256] | 256 | True 2022-12-01 22:53:10.048 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.048 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.bias | [256] | 256 | True 2022-12-01 22:53:10.049 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-01 22:53:10.049 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.bias | [256] | 256 | True 2022-12-01 22:53:10.049 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.5.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-01 22:53:10.050 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-01 22:53:10.050 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-01 22:53:10.050 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.bias | [256] | 256 | True 2022-12-01 22:53:10.050 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.weight | [256] | 256 | True 2022-12-01 22:53:10.051 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.bias | [256] | 256 | True 2022-12-01 22:53:10.051 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.weight | [256] | 256 | True 2022-12-01 22:53:10.051 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.bias | [256] | 256 | True 2022-12-01 22:53:10.052 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.weight | [256] | 256 | True 2022-12-01 22:53:10.052 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.bias | [256] | 256 | True 2022-12-01 22:53:10.052 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.053 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.bias | [256] | 256 | True 2022-12-01 22:53:10.053 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-01 22:53:10.054 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.bias | [256] | 256 | True 2022-12-01 22:53:10.054 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.weight | [256, 4233] | 1083648 | True 2022-12-01 22:53:10.054 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.bias | [4233] | 4233 | True 2022-12-01 22:53:10.055 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:60 - Total parameters: 411.0, 31.95M elements. 2022-12-01 22:53:10.055 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:262 - Setup model! 2022-12-01 22:53:10.056 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: WarmupLR {'learning_rate': 0.002, 'verbose': False, 'warmup_steps': 25000}. 2022-12-01 22:53:10.084 | INFO | paddlespeech.s2t.training.optimizer:from_args:109 - <WeightDecay - L2Decay, regularization_coeff=0.000001> 2022-12-01 22:53:10.084 | INFO | paddlespeech.s2t.training.optimizer:from_args:111 - <GradClip - Gradient Clip By GlobalNorm, global_norm=5.000000> 2022-12-01 22:53:10.085 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 0x7ff489104c10>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}. 2022-12-01 22:53:10.085 | INFO | paddlespeech.s2t.training.optimizer:from_args:120 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0) 2022-12-01 22:53:10.085 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler! 2022-12-01 22:53:10.086 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch! 2022-12-01 22:53:10.516 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams 2022-12-01 22:53:10.519 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt 2022-12-01 22:53:10.522 | INFO | paddlespeech.s2t.exps.u2.model:do_train:161 - Train Total Examples: 15013

navy7913 commented 1 year ago

Why are two versions of paddle installed? paddlepaddle 2.3.1 and paddlepaddle-gpu 2.4.0rc0

May I ask: is it enough to install just one of these two paddle packages?

zxcd commented 1 year ago

According to your log, the model loaded successfully and no error appears. Also, you only need to install one version of paddle.
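A quick way to confirm which Paddle builds are currently installed (a minimal sketch, not part of the original thread; it only assumes the two package names mentioned above, and `importlib.metadata` needs Python 3.8+ — on older interpreters `pip list | grep paddle` shows the same thing):

```python
# Sketch: detect whether both the CPU build (paddlepaddle) and the
# GPU build (paddlepaddle-gpu) are installed in the same environment.
from importlib import metadata

installed = []
for name in ("paddlepaddle", "paddlepaddle-gpu"):
    try:
        installed.append((name, metadata.version(name)))
    except metadata.PackageNotFoundError:
        pass  # this build is not installed

if len(installed) > 1:
    # Two builds at once can shadow each other; keep exactly one,
    # e.g. run `pip uninstall paddlepaddle` to keep only the GPU build.
    print("multiple Paddle builds found:", installed)
else:
    print("Paddle builds installed:", installed)
```

If both packages show up, uninstall one and re-run the recipe in a clean state.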

navy7913 commented 1 year ago

According to your log, the model loaded successfully and no error appears. Also, you only need to install one version of paddle.

Hello, I have a question: if stage 1 completes without errors, should training start right away? Below is the result of running bash run.sh --stage 1 --stop_stage 1 again after uninstalling paddlepaddle and keeping only paddlepaddle-gpu 2.4.0rc0. It still does not appear to be training, so I would like to ask how to resolve this — I really want to train it successfully. The log below was found under examples/aishell/asr1/exp/log; I am not sure whether this is the information you need.

2022-12-04 15:46:36.196 | DEBUG | paddlespeech.s2t::41 - register user softmax to paddle, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::45 - register user log_softmax to paddle, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::49 - register user sigmoid to paddle, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::53 - register user log_sigmoid to paddle, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::57 - register user relu to paddle, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::67 - override cat of paddle if exists or register, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::89 - override long of paddle.Tensor if exists or register, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::111 - override new_full of paddle.Tensor if exists or register, remove this when fixed! 2022-12-04 15:46:36.197 | DEBUG | paddlespeech.s2t::123 - override contiguous of paddle.Tensor if exists or register, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::134 - register user view to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::145 - register user view_as to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::186 - register user masked_fill to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::205 - register user maskedfill to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::229 - register user repeat to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::235 - register user softmax to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.198 | DEBUG | paddlespeech.s2t::240 - register user sigmoid to paddle.Tensor, remove this when fixed! 
2022-12-04 15:46:36.199 | DEBUG | paddlespeech.s2t::244 - register user relu to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.199 | DEBUG | paddlespeech.s2t::254 - register user type_as to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.199 | DEBUG | paddlespeech.s2t::270 - register user to to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.199 | DEBUG | paddlespeech.s2t::281 - register user float to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.199 | DEBUG | paddlespeech.s2t::291 - register user int to paddle.Tensor, remove this when fixed! 2022-12-04 15:46:36.797 | INFO | paddlespeech.s2t.utils.utility:all_version:45 - Deps Module Version:[('python', '3.7.15 (default, Nov 7 2022, 22:00:21) \n[GCC 11.2.0]'), ('paddle', '2.4.0-rc0'), ('paddle_commit', '083853cd4e4a9bdad22c70fa48eb9a036d2def27'), ('soundfile', '0.11.0')] 2022-12-04 15:46:36.797 | INFO | paddlespeech.s2t.training.trainer:init:116 - Rank: 0/1 2022-12-04 15:46:38.617 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq 2022-12-04 15:46:38.915 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 120098 2022-12-04 15:46:38.962 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 15013 2022-12-04 15:46:39.031 | WARNING | paddlespeech.s2t.io.reader:init:76 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-04 15:46:39.199 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq 2022-12-04 15:46:39.218 | INFO | 
paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 14326 2022-12-04 15:46:39.223 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 1791 2022-12-04 15:46:39.229 | WARNING | paddlespeech.s2t.io.reader:init:76 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-04 15:46:39.229 | INFO | paddlespeech.s2t.exps.u2.model:setup_dataloader:233 - Setup train/valid Dataloader! 2022-12-04 15:46:39.229 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:901 - U2 Encoder type: transformer 2022-12-04 15:46:39.358 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:913 - U2 Decoder type: transformer 2022-12-04 15:46:39.469 | DEBUG | paddlespeech.s2t.modules.loss:init:41 - CTCLoss Loss reduction: sum, div-bs: True 2022-12-04 15:46:39.469 | DEBUG | paddlespeech.s2t.modules.loss:init:42 - CTCLoss Grad Norm Type: None 2022-12-04 15:46:39.470 | DEBUG | paddlespeech.s2t.modules.loss:init:74 - CTCLoss() kwargs:{'norm_by_times': False}, not support: {'norm_by_batchsize': False, 'norm_by_total_logits_len': False} 2022-12-04 15:46:39.473 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:259 - U2Model( (encoder): TransformerEncoder( (embed): Conv2dSubsampling4( (pos_enc): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) (conv): Sequential( (0): Conv2D(1, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (1): ReLU() (2): Conv2D(256, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (3): ReLU() ) (out): Sequential( (0): Linear(in_features=4864, out_features=256, dtype=float32) ) ) 
    (after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12)
    (encoders): LayerList(
      (0): TransformerEncoderLayer(
        (self_attn): MultiHeadedAttention(
          (linear_q): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_k): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_v): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_out): Linear(in_features=256, out_features=256, dtype=float32)
          (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
        )
        (feed_forward): PositionwiseFeedForward(
          (w_1): Linear(in_features=256, out_features=2048, dtype=float32)
          (activation): ReLU()
          (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train)
          (w_2): Linear(in_features=2048, out_features=256, dtype=float32)
        )
        (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12)
        (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12)
        (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train)
        (concat_linear): Linear(in_features=512, out_features=256, dtype=float32)
      )
      [... (1) through (11): TransformerEncoderLayer, identical to (0) ...]
    )
  )
  (decoder): TransformerDecoder(
    (embed): Sequential(
      (0): Embedding(4233, 256, sparse=False)
      (1): PositionalEncoding(
        (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train)
      )
    )
    (after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12)
    (output_layer): Linear(in_features=256, out_features=4233, dtype=float32)
    (decoders): LayerList(
      (0): DecoderLayer(
        (self_attn): MultiHeadedAttention(
          (linear_q): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_k): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_v): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_out): Linear(in_features=256, out_features=256, dtype=float32)
          (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
        )
        (src_attn): MultiHeadedAttention(
          (linear_q): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_k): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_v): Linear(in_features=256, out_features=256, dtype=float32)
          (linear_out): Linear(in_features=256, out_features=256, dtype=float32)
          (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
        )
        (feed_forward): PositionwiseFeedForward(
          (w_1): Linear(in_features=256, out_features=2048, dtype=float32)
          (activation): ReLU()
          (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train)
          (w_2): Linear(in_features=2048, out_features=256, dtype=float32)
        )
        (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12)
        (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12)
        (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12)
        (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train)
        (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32)
        (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32)
      )
      [... (1) through (5): DecoderLayer, identical to (0) ...]
    )
  )
  (ctc): CTCDecoderBase(
    (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train)
    (ctc_lo): Linear(in_features=256, out_features=4233, dtype=float32)
    (criterion): CTCLoss(
      (loss): CTCLoss()
    )
  )
  (criterion_att): LabelSmoothingLoss(
    (criterion): KLDivLoss()
  )
)
2022-12-04 15:46:39.473 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.weight | [256, 1, 3, 3] | 2304 | True
2022-12-04 15:46:39.473 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.bias | [256] | 256 | True
2022-12-04 15:46:39.473 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.weight | [256, 256, 3, 3] | 589824 | True
2022-12-04 15:46:39.474 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.bias | [256] | 256 | True
2022-12-04 15:46:39.474 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.weight | [4864, 256] | 1245184 | True
2022-12-04 15:46:39.474 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.bias | [256] | 256 | True
2022-12-04 15:46:39.474 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.weight | [256] | 256 | True
2022-12-04 15:46:39.475 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.bias | [256] | 256 | True
2022-12-04 15:46:39.475 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.weight | [256, 256] | 65536 | True
2022-12-04 15:46:39.475 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.bias | [256] | 256 | True
2022-12-04 15:46:39.476 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.weight | [256, 256] | 65536 | True
2022-12-04 15:46:39.476 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.bias | [256] | 256 | True
2022-12-04 15:46:39.477 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 -
encoder.encoders.0.self_attn.linear_v.weight | [256, 256] | 65536 | True
2022-12-04 15:46:39.477 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_v.bias | [256] | 256 | True
2022-12-04 15:46:39.478 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_out.weight | [256, 256] | 65536 | True
2022-12-04 15:46:39.478 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_out.bias | [256] | 256 | True
2022-12-04 15:46:39.478 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_1.weight | [256, 2048] | 524288 | True
2022-12-04 15:46:39.478 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_1.bias | [2048] | 2048 | True
2022-12-04 15:46:39.479 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_2.weight | [2048, 256] | 524288 | True
2022-12-04 15:46:39.479 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_2.bias | [256] | 256 | True
2022-12-04 15:46:39.480 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm1.weight | [256] | 256 | True
2022-12-04 15:46:39.480 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm1.bias | [256] | 256 | True
2022-12-04 15:46:39.480 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm2.weight | [256] | 256 | True
2022-12-04 15:46:39.481 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm2.bias | [256] | 256 | True
2022-12-04 15:46:39.481 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.concat_linear.weight | [512, 256] | 131072 | True
2022-12-04 15:46:39.481 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.concat_linear.bias | [256] | 256 | True
[... parameter rows for encoder.encoders.1 through encoder.encoders.5 omitted: each layer repeats the same 18 rows as encoder.encoders.0 (self_attn.linear_q/k/v/out, feed_forward.w_1/w_2, norm1, norm2, concat_linear weights and biases) with identical shapes ...]
2022-12-04 15:46:39.512 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_q.weight | [256, 256] | 65536 | True
2022-12-04 15:46:39.512 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_q.bias | [256] | 256 | True
2022-12-04 15:46:39.513 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 -
encoder.encoders.6.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.513 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.513 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.514 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.514 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.514 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.515 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.515 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.515 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.516 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.516 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.517 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.517 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.517 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.norm2.bias | [256] 
| 256 | True 2022-12-04 15:46:39.518 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.concat_linear.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.518 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.6.concat_linear.bias | [256] | 256 | True 2022-12-04 15:46:39.518 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.519 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.519 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.519 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.520 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.520 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.520 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.521 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.521 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.521 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.feed_forward.w_2.weight | [2048, 
256] | 524288 | True 2022-12-04 15:46:39.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.523 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.523 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.524 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.concat_linear.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.524 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.7.concat_linear.bias | [256] | 256 | True 2022-12-04 15:46:39.524 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.526 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.526 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.526 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.526 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.527 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.527 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.527 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.529 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.529 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.529 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.concat_linear.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.530 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.8.concat_linear.bias | [256] | 256 | True 2022-12-04 15:46:39.530 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.531 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
encoder.encoders.9.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.531 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.531 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.532 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.532 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.532 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.533 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.533 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.533 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.534 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.534 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.534 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.535 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.535 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
encoder.encoders.9.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.535 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.536 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.concat_linear.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.536 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.9.concat_linear.bias | [256] | 256 | True 2022-12-04 15:46:39.536 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.537 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.537 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.537 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.538 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.538 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.538 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.539 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.539 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.539 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
encoder.encoders.10.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.540 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.540 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.540 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.541 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.541 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.541 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.542 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.concat_linear.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.542 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.10.concat_linear.bias | [256] | 256 | True 2022-12-04 15:46:39.542 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.543 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.543 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.543 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.544 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_v.weight | [256, 256] 
| 65536 | True 2022-12-04 15:46:39.544 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.544 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.546 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.546 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.concat_linear.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.548 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.concat_linear.bias | [256] | 256 | True 2022-12-04 15:46:39.548 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.embed.0.weight | [4233, 256] | 1083648 | True 2022-12-04 15:46:39.548 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.after_norm.weight | [256] | 256 | True 2022-12-04 15:46:39.549 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.after_norm.bias | [256] | 256 | True 2022-12-04 15:46:39.549 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.output_layer.weight | [256, 4233] | 1083648 | True 2022-12-04 15:46:39.549 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.output_layer.bias | [4233] | 4233 | True 2022-12-04 15:46:39.550 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.550 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.550 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.552 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.552 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.0.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.553 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.553 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.553 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.554 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.554 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.554 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.555 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.555 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.556 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.556 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.556 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.557 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.557 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.0.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.557 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.558 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.558 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm3.weight | [256] | 256 | True 2022-12-04 15:46:39.558 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm3.bias | [256] | 256 | True 2022-12-04 15:46:39.559 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.559 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear1.bias | [256] | 256 | True 2022-12-04 15:46:39.559 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.560 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear2.bias | [256] | 256 | True 2022-12-04 15:46:39.560 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.560 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.561 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.561 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.561 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 
15:46:39.562 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.562 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.563 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.563 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.563 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.564 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.564 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.564 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.565 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.565 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.565 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.566 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.566 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_1.bias | [2048] | 2048 | True 
2022-12-04 15:46:39.566 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.567 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.567 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.567 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.567 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.568 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.568 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm3.weight | [256] | 256 | True 2022-12-04 15:46:39.568 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm3.bias | [256] | 256 | True 2022-12-04 15:46:39.569 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.569 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear1.bias | [256] | 256 | True 2022-12-04 15:46:39.569 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.570 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear2.bias | [256] | 256 | True 2022-12-04 15:46:39.570 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.570 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.2.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.571 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.571 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.571 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.572 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.572 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.572 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.572 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.573 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.573 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.573 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.574 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.574 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.574 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 
- decoder.decoders.2.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.575 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.575 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.575 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.575 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.576 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.576 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.576 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.577 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.577 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.577 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.weight | [256] | 256 | True 2022-12-04 15:46:39.578 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.bias | [256] | 256 | True 2022-12-04 15:46:39.578 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.579 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.bias | [256] | 256 | True 2022-12-04 15:46:39.579 | 
INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.579 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.bias | [256] | 256 | True 2022-12-04 15:46:39.580 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.580 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.580 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.580 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.581 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.581 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.581 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.582 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.582 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.582 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.583 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 
15:46:39.583 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.583 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.584 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.584 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.584 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.585 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.585 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.585 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.586 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.586 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.586 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.587 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.587 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.587 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.weight | [256] | 256 | True 2022-12-04 15:46:39.588 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.bias | [256] | 256 | True 2022-12-04 15:46:39.588 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.588 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.bias | [256] | 256 | True 2022-12-04 15:46:39.588 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.589 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.bias | [256] | 256 | True 2022-12-04 15:46:39.589 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.589 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.590 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.590 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.590 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.591 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.591 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.591 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.592 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.592 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.592 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.592 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.593 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.593 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.593 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.594 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.594 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.594 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.595 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.595 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.595 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.596 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.596 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.596 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.597 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.weight | [256] | 256 | True 2022-12-04 15:46:39.597 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.bias | [256] | 256 | True 2022-12-04 15:46:39.597 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.598 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.bias | [256] | 256 | True 2022-12-04 15:46:39.598 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.598 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.bias | [256] | 256 | True 2022-12-04 15:46:39.599 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.599 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.599 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.600 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.bias | 
[256] | 256 | True 2022-12-04 15:46:39.600 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.600 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.601 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.601 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.602 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.602 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.bias | [256] | 256 | True 2022-12-04 15:46:39.602 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.603 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.bias | [256] | 256 | True 2022-12-04 15:46:39.603 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.603 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.bias | [256] | 256 | True 2022-12-04 15:46:39.604 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-04 15:46:39.604 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.bias | [256] | 256 | True 2022-12-04 15:46:39.604 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.5.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-04 15:46:39.605 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-04 15:46:39.605 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-04 15:46:39.605 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.bias | [256] | 256 | True 2022-12-04 15:46:39.606 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.weight | [256] | 256 | True 2022-12-04 15:46:39.606 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.bias | [256] | 256 | True 2022-12-04 15:46:39.606 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.weight | [256] | 256 | True 2022-12-04 15:46:39.606 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.bias | [256] | 256 | True 2022-12-04 15:46:39.607 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.weight | [256] | 256 | True 2022-12-04 15:46:39.607 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.bias | [256] | 256 | True 2022-12-04 15:46:39.608 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.608 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.bias | [256] | 256 | True 2022-12-04 15:46:39.608 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-04 15:46:39.608 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.bias | [256] | 256 | True 2022-12-04 15:46:39.609 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.weight | [256, 4233] | 1083648 | True 2022-12-04 15:46:39.609 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.bias | [4233] | 4233 | True 2022-12-04 15:46:39.609 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:60 - Total parameters: 411.0, 31.95M elements. 2022-12-04 15:46:39.610 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:262 - Setup model! 2022-12-04 15:46:39.610 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: WarmupLR {'learning_rate': 0.002, 'verbose': False, 'warmup_steps': 25000}. 2022-12-04 15:46:39.638 | INFO | paddlespeech.s2t.training.optimizer:from_args:109 - <WeightDecay - L2Decay, regularization_coeff=0.000001> 2022-12-04 15:46:39.639 | INFO | paddlespeech.s2t.training.optimizer:from_args:111 - <GradClip - Gradient Clip By GlobalNorm, global_norm=5.000000> 2022-12-04 15:46:39.639 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 0x7f849418c550>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}. 2022-12-04 15:46:39.639 | INFO | paddlespeech.s2t.training.optimizer:from_args:120 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0) 2022-12-04 15:46:39.640 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler! 2022-12-04 15:46:39.640 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch! 2022-12-04 15:46:40.094 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams 2022-12-04 15:46:40.097 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt 2022-12-04 15:46:40.100 | INFO | paddlespeech.s2t.exps.u2.model:do_train:161 - Train Total Examples: 15013
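As a side note on the scheduler shown in the log (`WarmupLR(warmup_steps=25000, lr=0.002)`): this is the usual Noam-style warmup schedule. The sketch below uses the common ESPnet-style formula and is illustrative only; the exact PaddleSpeech implementation may differ in detail.

```python
def warmup_lr(step: int, base_lr: float = 0.002, warmup_steps: int = 25000) -> float:
    """Noam-style warmup sketch: LR rises roughly linearly for
    warmup_steps, peaks at base_lr, then decays as step ** -0.5."""
    step = max(step, 1)  # avoid division by zero at step 0
    return base_lr * warmup_steps**0.5 * min(step**-0.5, step * warmup_steps**-1.5)
```

At `step == warmup_steps` the two terms of the `min` coincide and the schedule returns exactly `base_lr`, which matches the peak LR logged above.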

------------------------------------------------------ The above is the content of the log file; below is the error output that appeared in the terminal ---------------------------------------------

/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)

C++ Traceback (most recent call last):
0   arange_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place)
1   paddle::experimental::arange(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place const&)
2   void phi::ArangeKernel&lt;long, phi::GPUContext&gt;(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor*)

Error Message Summary:
FatalError: Erroneous arithmetic operation is detected by the operating system.
[TimeInfo: Aborted at 1670140000 (unix time) try "date -d @1670140000" if you are using GNU date ]
[SignalInfo: SIGFPE (@0x7f8531fbd16d) received by PID 3825 (TID 0x7f85a9b48200) from PID 838586733 ]

LAUNCH INFO 2022-12-04 15:46:41,366 Pod failed LAUNCH ERROR 2022-12-04 15:46:41,366 Container failed !!! Container rank 0 status failed cmd ['/home/navy/PaddleSpeech/tools/venv/bin/python3', '-u', '/home/navy/PaddleSpeech/paddlespeech/s2t/exps/u2/bin/train.py', '--ngpu', '1', '--seed', '0', '--config', 'conf/transformer.yaml', '--output', 'exp/transformer', '--profiler-options', '', '--benchmark-batch-size', '0', '--benchmark-max-step', '0'] code -8 log log/workerlog.0 env {'CLUTTER_IM_MODULE': 'xim', 'CONDA_SHLVL': '2', 'LC_ALL': 'C', 'LS_COLORS': 'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.zst=01;31:.tzst=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.wim=01;31:.swm=01;31:.dwm=01;31:.esd=01;31:.jpg=01;35:.jpeg=01;35:.mjpg=01;35:.mjpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:', 'LD_LIBRARY_PATH': 
'/usr/local/cuda-10.2/lib64:/usr/local/cuda-10.2/lib64::/usr/local/lib/:/home/navy/PaddleSpeech/tools/liblbfgs-1.10/lib/.libs', 'CONDA_EXE': '/home/navy/miniconda3/bin/conda', 'LC_MEASUREMENT': 'en_ZW.UTF-8', 'LESSCLOSE': '/usr/bin/lesspipe %s %s', 'LC_PAPER': 'en_ZW.UTF-8', 'LC_MONETARY': 'en_ZW.UTF-8', 'XDG_MENU_PREFIX': 'gnome-', 'LANG': 'en_US.UTF-8', 'DISPLAY': ':1', 'ORIGINAL_XDG_CURRENT_DESKTOP': 'ubuntu:GNOME', 'FLAGS_allocator_strategy': 'naive_best_fit', 'GNOME_SHELL_SESSION_MODE': 'ubuntu', 'COLORTERM': 'truecolor', 'USERNAME': 'navy', 'CONDA_PREFIX': '/home/navy/PaddleSpeech/tools/venv', 'VSCODE_GIT_ASKPASS_EXTRA_ARGS': '--ms-enable-electron-run-as-node', 'CHROME_DESKTOP': 'code-url-handler.desktop', 'XDG_VTNR': '2', 'GIO_LAUNCHED_DESKTOP_FILE_PID': '2043', 'PYTHONIOENCODING': 'UTF-8', 'SSH_AUTH_SOCK': '/run/user/1000/keyring/ssh', 'MAIN_ROOT': '/home/navy/PaddleSpeech', 'MANDATORY_PATH': '/usr/share/gconf/ubuntu.mandatory.path', '_CE_M': '', 'LC_NAME': 'en_ZW.UTF-8', 'XDG_SESSION_ID': '3', 'USER': 'navy', 'CONDA_PREFIX_1': '/home/navy/miniconda3', 'DESKTOP_SESSION': 'ubuntu', 'QT4_IM_MODULE': 'xim', 'TEXTDOMAINDIR': '/usr/share/locale/', 'DEFAULTS_PATH': '/usr/share/gconf/ubuntu.default.path', 'PWD': '/home/navy/PaddleSpeech/examples/aishell/asr1', 'HOME': '/home/navy', 'CONDA_PYTHON_EXE': '/home/navy/miniconda3/bin/python', 'VSCODE_GIT_ASKPASS_NODE': '/usr/share/code/code', 'TEXTDOMAIN': 'im-config', 'SSH_AGENT_PID': '1683', 'TERM_PROGRAM': 'vscode', 'TERM_PROGRAM_VERSION': '1.73.1', 'QT_ACCESSIBILITY': '1', 'XDG_SESSION_TYPE': 'x11', 'XDG_DATA_DIRS': '/usr/share/ubuntu:/usr/local/share/:/usr/share/:/var/lib/snapd/desktop', 'CE_CONDA': '', 'XDG_SESSION_DESKTOP': 'ubuntu', 'LC_ADDRESS': 'en_ZW.UTF-8', 'GJS_DEBUG_OUTPUT': 'stderr', 'LC_NUMERIC': 'en_ZW.UTF-8', 'SRILM': '/home/navy/PaddleSpeech/tools/srilm', 'CONDA_PROMPT_MODIFIER': '(/home/navy/PaddleSpeech/tools/venv) ', 'GTK_MODULES': 'gail:atk-bridge', 'LIBLBFGS': 
'/home/navy/PaddleSpeech/tools/liblbfgs-1.10', 'PAPERSIZE': 'a4', 'VSCODE_GIT_ASKPASS_MAIN': '/usr/share/code/resources/app/extensions/git/dist/askpass-main.js', 'KALDI_ROOT': '/home/navy/PaddleSpeech/tools/kaldi', 'WINDOWPATH': '2', 'TERM': 'xterm-256color', 'SHELL': '/bin/bash', 'QT_IM_MODULE': 'xim', 'XMODIFIERS': '@im=ibus', 'IM_CONFIG_PHASE': '2', 'XDG_CURRENT_DESKTOP': 'Unity', 'GPG_AGENT_INFO': '/run/user/1000/gnupg/S.gpg-agent:0:1', 'BIN_DIR': '/home/navy/PaddleSpeech/paddlespeech/s2t/exps/u2/bin', 'GIO_LAUNCHED_DESKTOP_FILE': '/usr/share/applications/code.desktop', 'CUDA_VISIBLE_DEVICES': '0', 'PYTHONDONTWRITEBYTECODE': '1', 'SHLVL': '4', 'XDG_SEAT': 'seat0', 'PYTHONPATH': '/home/navy/PaddleSpeech:', 'VSCODE_GIT_IPC_HANDLE': '/run/user/1000/vscode-git-9cff25a19a.sock', 'LC_TELEPHONE': 'en_ZW.UTF-8', 'GDK_BACKEND': 'x11', 'GDMSESSION': 'ubuntu', 'GNOME_DESKTOP_SESSION_ID': 'this-is-deprecated', 'LOGNAME': 'navy', 'DBUS_SESSION_BUS_ADDRESS': 'unix:path=/run/user/1000/bus', 'GIT_ASKPASS': '/usr/share/code/resources/app/extensions/git/dist/askpass.sh', 'XDG_RUNTIME_DIR': '/run/user/1000', 'XAUTHORITY': '/run/user/1000/gdm/Xauthority', 'XDG_CONFIG_DIRS': '/etc/xdg/xdg-ubuntu:/etc/xdg', 'PATH': '/home/navy/PaddleSpeech/examples/aishell/asr1/utils/:/home/navy/PaddleSpeech/tools/kaldi/tools/openfst/bin:/home/navy/PaddleSpeech/examples/aishell/asr1:/home/navy/PaddleSpeech:/home/navy/PaddleSpeech/utils:/usr/local/cuda-10.2/bin:/usr/local/cuda-10.2/bin:/home/navy/PaddleSpeech/tools/venv/bin:/home/navy/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/navy/PaddleSpeech/tools/srilm/bin:/home/navy/PaddleSpeech/tools/srilm/bin/i686-m64', 'LC_IDENTIFICATION': 'en_ZW.UTF-8', 'CONDA_DEFAULT_ENV': '/home/navy/PaddleSpeech/tools/venv', 'GJS_DEBUG_TOPICS': 'JS ERROR;JS LOG', 'SESSION_MANAGER': 'local/navy:@/tmp/.ICE-unix/1605,unix/navy:/tmp/.ICE-unix/1605', 'LESSOPEN': '| /usr/bin/lesspipe %s', 
'GTK_IM_MODULE': 'ibus', 'LC_TIME': 'en_ZW.UTF-8', '': '/home/navy/PaddleSpeech/tools/venv/bin/python3', 'CUSTOM_DEVICE_ROOT': '', 'OMP_NUM_THREADS': '1', 'POD_NAME': 'defgpl', 'PADDLE_MASTER': '127.0.1.1:49414', 'PADDLE_GLOBAL_SIZE': '1', 'PADDLE_LOCAL_SIZE': '1', 'PADDLE_GLOBAL_RANK': '0', 'PADDLE_LOCAL_RANK': '0', 'PADDLE_NNODES': '1', 'PADDLE_TRAINER_ENDPOINTS': '127.0.1.1:49415', 'PADDLE_CURRENT_ENDPOINT': '127.0.1.1:49415', 'PADDLE_TRAINER_ID': '0', 'PADDLE_TRAINERS_NUM': '1', 'PADDLE_RANK_IN_NODE': '0', 'FLAGS_selected_gpus': '0'} LAUNCH INFO 2022-12-04 15:46:41,366 ------------------------- ERROR LOG DETAIL ------------------------- | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 0x7f849418c550>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}. 2022-12-04 15:46:39.639 | INFO | paddlespeech.s2t.training.optimizer:from_args:120 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0) 2022-12-04 15:46:39.640 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler! 2022-12-04 15:46:39.640 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch! 2022-12-04 15:46:40.094 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams 2022-12-04 15:46:40.097 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt 2022-12-04 15:46:40.100 | INFO | paddlespeech.s2t.exps.u2.model:do_train:161 - Train Total Examples: 15013 /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. 
Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)

C++ Traceback (most recent call last):
0   arange_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place)
1   paddle::experimental::arange(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place const&)
2   void phi::ArangeKernel&lt;long, phi::GPUContext&gt;(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor*)

Error Message Summary:
FatalError: Erroneous arithmetic operation is detected by the operating system.
[TimeInfo: Aborted at 1670140000 (unix time) try "date -d @1670140000" if you are using GNU date ]
[SignalInfo: SIGFPE (@0x7f8531fbd16d) received by PID 3825 (TID 0x7f85a9b48200) from PID 838586733 ]

LAUNCH INFO 2022-12-04 15:46:41,367 Exit code -8
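Context for readers hitting the same crash: the faulting frame is `phi::ArangeKernel`, and a SIGFPE in integer code usually means a division by zero inside the kernel, since an arange kernel computes its output length as roughly `(end - start) / step`. This is a speculative diagnosis; the pure-Python sketch below (with an illustrative function name, not a real Paddle API) only shows why a zero step would fault.

```python
import math

def arange_size(start: int, end: int, step: int) -> int:
    """Hypothetical sketch of the size computation an arange kernel
    performs. A step of 0 (e.g. produced by a corrupted length tensor)
    makes the division fault -- in C++ integer code this surfaces as
    SIGFPE rather than a Python-style exception."""
    return max(0, math.ceil((end - start) / step))
```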

zxcd commented 1 year ago

Train Total Examples: 15013

Your sample count doesn't match AISHELL's. Could you verify with the original AISHELL dataset whether training starts successfully?

navy7913 commented 1 year ago

Train Total Examples: 15013 — Your sample count doesn't match AISHELL's. Could you verify with the original AISHELL dataset whether training starts successfully?

Hi, the dataset I'm using was all downloaded by `bash run.sh --stage 1 --stop_stage 1`; I didn't use any other dataset, so I'm not sure why the sample counts differ.

zxcd commented 1 year ago

Perhaps you need to first run `bash run.sh --stage 0 --stop_stage 0` to download the AISHELL training data?

navy7913 commented 1 year ago

Perhaps you need to first run `bash run.sh --stage 0 --stop_stage 0` to download the AISHELL training data?

Hi, sorry, I made a typo in my previous comment. What I meant is that my dataset all comes from what `bash run.sh --stage 0 --stop_stage 0` downloaded.

navy7913 commented 1 year ago

Hi, here is the result of re-running `bash run.sh --stage 0 --stop_stage 0`. A warning appeared partway through; could you advise how to resolve it? The standard file /home/navy/PaddleSpeech/tools/kaldi/tools/config/common_path.sh is not present, can not using Kaldi! checkpoint name transformer Skip downloading and unpacking. Data already exists in /home/navy/PaddleSpeech/dataset/aishell. Creating manifest data/manifest ... Skip downloading and unpacking. Data already exists in /home/navy/PaddleSpeech/dataset/aishell. Data download and manifest prepare done! ----------- compute_mean_std.py Arguments ----------- delta_delta: 0 feat_dim: 80 manifest_path: data/manifest.train.raw num_samples: -1 num_workers: 12 output_path: data/mean_std.json sample_rate: 16000 spectrum_type: fbank stride_ms: 10 target_dB: -20 use_dB_normalization: 0 window_ms: 25

2022-12-09 16:44:05.123 | INFO | paddlespeech.s2t.frontend.augmentor.augmentation:init:123 - Augmentation: [] 2022-12-09 16:44:22.770 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 8000 wavs,3688073 frames. 2022-12-09 16:44:39.484 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 16000 wavs,7199968 frames. 2022-12-09 16:44:54.766 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 24000 wavs,10688301 frames. 2022-12-09 16:45:11.791 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 32000 wavs,14157741 frames. 2022-12-09 16:45:27.646 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 40000 wavs,17775539 frames. 2022-12-09 16:45:43.243 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 48000 wavs,21292305 frames. 2022-12-09 16:46:00.529 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 56000 wavs,24833602 frames. 2022-12-09 16:46:17.122 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 64000 wavs,28489880 frames. 2022-12-09 16:46:33.774 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 72000 wavs,32069993 frames. 2022-12-09 16:46:49.076 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 80000 wavs,35628306 frames. 2022-12-09 16:47:06.010 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 88000 wavs,39292439 frames. 2022-12-09 16:47:23.016 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 96000 wavs,42968389 frames. 2022-12-09 16:47:40.020 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 104000 wavs,46688753 frames. 2022-12-09 16:47:57.130 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 112000 wavs,50370938 frames. 
2022-12-09 16:48:12.578 | INFO | paddlespeech.s2t.frontend.normalizer:_compute_mean_std:192 - process 120000 wavs,54011373 frames. ----------- build_vocab.py Arguments ----------- count_threshold: 0 manifest_paths: ['data/manifest.train.raw'] spm_character_coverage: 0.9995 spm_mode: unigram spm_model_prefix: spm_vocab_size: 0 text_keys: text unit_type: char vocab_path: data/lang_char/vocab.txt

2022-12-09 16:48:14.812 | WARNING | paddlespeech.s2t.frontend.featurizer.text_featurizer:init:57 - TextFeaturizer: not have vocab file or vocab list. ----------- format_data.py Arguments ----------- cmvn_path: data/mean_std.json manifest_paths: ['data/manifest.dev.raw'] output_path: data/manifest.dev spm_model_prefix: None unit_type: char vocab_path: data/lang_char/vocab.txt

Feature dim: 80 ----------- format_data.py Arguments ----------- cmvn_path: data/mean_std.json manifest_paths: ['data/manifest.train.raw'] output_path: data/manifest.train spm_model_prefix: None unit_type: char vocab_path: data/lang_char/vocab.txt

Feature dim: 80 Vocab size: 4233 Vocab size: 4233 ----------- format_data.py Arguments ----------- cmvn_path: data/mean_std.json manifest_paths: ['data/manifest.test.raw'] output_path: data/manifest.test spm_model_prefix: None unit_type: char vocab_path: data/lang_char/vocab.txt

Feature dim: 80 Vocab size: 4233 ['data/manifest.test.raw'] Examples number: 7176 ['data/manifest.dev.raw'] Examples number: 14326 ['data/manifest.train.raw'] Examples number: 120098 Aishell data preparation done.
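For reference, the `_compute_mean_std` progress lines above ("process N wavs, M frames") correspond to accumulating per-dimension CMVN statistics over all training frames. The sketch below is illustrative only (the function name and data layout are assumptions, not the actual PaddleSpeech API): a streaming sum / sum-of-squares pass that never holds all frames in memory.

```python
import math

def accumulate_mean_std(feature_batches):
    """Streaming per-dimension mean/std over feature frames.
    Each batch is a list of frames; each frame is a list of feat_dim
    floats. Uses running sum and sum-of-squares, so memory stays
    constant regardless of corpus size."""
    total, total_sq, n_frames = None, None, 0
    for frames in feature_batches:
        for frame in frames:
            if total is None:
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for d, x in enumerate(frame):
                total[d] += x
                total_sq[d] += x * x
            n_frames += 1
    mean = [t / n_frames for t in total]
    # population std via E[x^2] - mean^2
    std = [math.sqrt(sq / n_frames - m * m) for sq, m in zip(total_sq, mean)]
    return mean, std, n_frames
```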

zxcd commented 1 year ago

I'm currently on Ubuntu 16.04.7, CUDA 10.2.89, Paddle 2.4.0-rc0, and have not reproduced the same issue so far. Here are my steps: --- Paddle installation ---

  1. python3 -m pip install paddlepaddle-gpu==2.4.0rc0 -i https://mirror.baidu.com/pypi/simple

  2. git clone https://github.com/PaddlePaddle/PaddleSpeech.git ; cd PaddleSpeech

  3. pip install pytest-runner -i https://pypi.tuna.tsinghua.edu.cn/simple

  4. pip install . -i https://pypi.tuna.tsinghua.edu.cn/simple

--- run training ---

  1. cd examples/aishell/asr1
  2. bash run.sh --stage 0 --stop_stage 1 --conf_path conf/transformer.yaml

Perhaps you could try training with a similar environment and the same steps?
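Before re-running, it may also be worth confirming that only one paddle wheel is installed: the environment in the issue head lists both paddlepaddle 2.3.1 and paddlepaddle-gpu 2.4.0rc0, and mixing CPU and GPU wheels can cause hard crashes. A minimal sketch of such a check (the helper name is hypothetical; the input is the kind of list produced by `pip list --format=freeze`):

```python
def find_paddle_conflict(installed):
    """Return the conflicting paddle wheel names if more than one is
    installed, otherwise an empty list.

    `installed` is a list of "name==version" strings, e.g. the lines of
    `pip list --format=freeze`.
    """
    names = {pkg.split("==")[0].lower() for pkg in installed}
    paddle_wheels = {n for n in names if n in ("paddlepaddle", "paddlepaddle-gpu")}
    return sorted(paddle_wheels) if len(paddle_wheels) > 1 else []

# The environment from the issue head would be flagged:
print(find_paddle_conflict(["paddlepaddle==2.3.1", "paddlepaddle-gpu==2.4.0rc0"]))
# -> ['paddlepaddle', 'paddlepaddle-gpu']
```

If both wheels show up, uninstalling both and reinstalling only paddlepaddle-gpu is a reasonable first step.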

zxcd commented 1 year ago

The warnings you are seeing during data processing should be harmless.

navy7913 commented 1 year ago

Hello, I set up training with the environment you described, but for some reason it still fails and training cannot proceed. Sorry to trouble you again — could you advise how to resolve this? My logs and the error are below. Environment: Ubuntu 16.04.7, CUDA 10.2.89, paddle 2.4.0-rc0, cudnn 7.6.5, Python 3.9.12

2022-12-12 18:37:29.749 | DEBUG | paddlespeech.s2t::41 - register user softmax to paddle, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::45 - register user log_softmax to paddle, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::49 - register user sigmoid to paddle, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::53 - register user log_sigmoid to paddle, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::57 - register user relu to paddle, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::66 - override cat of paddle if exists or register, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::88 - override long of paddle.Tensor if exists or register, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::110 - override new_full of paddle.Tensor if exists or register, remove this when fixed!
2022-12-12 18:37:29.750 | DEBUG | paddlespeech.s2t::122 - override contiguous of paddle.Tensor if exists or register, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::134 - register user view to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::144 - register user view_as to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::185 - register user masked_fill to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::204 - register user maskedfill to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::228 - register user repeat to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::234 - register user softmax to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.751 | DEBUG | paddlespeech.s2t::239 - register user sigmoid to paddle.Tensor, remove this when fixed!
2022-12-12 18:37:29.752 | DEBUG | paddlespeech.s2t::244 - register user relu to paddle.Tensor, remove this when fixed! 2022-12-12 18:37:29.752 | DEBUG | paddlespeech.s2t::253 - register user type_as to paddle.Tensor, remove this when fixed! 2022-12-12 18:37:29.752 | DEBUG | paddlespeech.s2t::270 - register user to to paddle.Tensor, remove this when fixed! 2022-12-12 18:37:29.752 | DEBUG | paddlespeech.s2t::280 - register user float to paddle.Tensor, remove this when fixed! 2022-12-12 18:37:29.752 | DEBUG | paddlespeech.s2t::291 - register user int to paddle.Tensor, remove this when fixed! 2022-12-12 18:37:30.428 | INFO | paddlespeech.s2t.utils.utility:all_version:45 - Deps Module Version:[('python', '3.9.12 (main, Apr 5 2022, 06:56:58) \n[GCC 7.5.0]'), ('paddle', '2.4.0-rc0'), ('paddle_commit', '083853cd4e4a9bdad22c70fa48eb9a036d2def27'), ('soundfile', '0.11.0')] 2022-12-12 18:37:30.428 | INFO | paddlespeech.s2t.training.trainer:init:116 - Rank: 0/1 2022-12-12 18:37:32.559 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq 2022-12-12 18:37:32.716 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 120098 2022-12-12 18:37:32.724 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 1877 2022-12-12 18:37:32.825 | WARNING | paddlespeech.s2t.io.reader:init:73 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-12 18:37:33.391 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq 2022-12-12 18:37:33.407 | INFO | 
paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 14326 2022-12-12 18:37:33.408 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 224 2022-12-12 18:37:33.413 | WARNING | paddlespeech.s2t.io.reader:init:73 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-12 18:37:33.414 | INFO | paddlespeech.s2t.exps.u2.model:setup_dataloader:233 - Setup train/valid Dataloader! 2022-12-12 18:37:33.414 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:901 - U2 Encoder type: transformer 2022-12-12 18:37:33.630 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:913 - U2 Decoder type: transformer 2022-12-12 18:37:33.826 | DEBUG | paddlespeech.s2t.modules.loss:init:40 - CTCLoss Loss reduction: sum, div-bs: True 2022-12-12 18:37:33.826 | DEBUG | paddlespeech.s2t.modules.loss:init:42 - CTCLoss Grad Norm Type: None 2022-12-12 18:37:33.826 | DEBUG | paddlespeech.s2t.modules.loss:init:73 - CTCLoss() kwargs:{'norm_by_times': False}, not support: {'norm_by_batchsize': False, 'norm_by_total_logits_len': False} 2022-12-12 18:37:33.829 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:259 - U2Model( (encoder): TransformerEncoder( (embed): Conv2dSubsampling4( (pos_enc): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) (conv): Sequential( (0): Conv2D(1, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (1): ReLU() (2): Conv2D(256, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (3): ReLU() ) (out): Sequential( (0): Linear(in_features=4864, out_features=256, dtype=float32) ) ) 
(after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12) (encoders): LayerList( (0): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (1): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (2): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): 
Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (3): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (4): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, 
out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (5): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (6): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) 
(feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (7): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (8): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, 
mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (9): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (10): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): 
LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (11): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) ) ) (decoder): TransformerDecoder( (embed): Sequential( (0): Embedding(4233, 256, sparse=False) (1): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) ) (after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12) (output_layer): Linear(in_features=256, out_features=4233, dtype=float32) (decoders): LayerList( (0): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, 
dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (1): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) 
(norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (2): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (3): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, 
dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (4): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): 
Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (5): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) ) ) (ctc): CTCDecoderBase( (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) 
(ctc_lo): Linear(in_features=256, out_features=4233, dtype=float32) (criterion): CTCLoss( (loss): CTCLoss() ) ) (criterion_att): LabelSmoothingLoss( (criterion): KLDivLoss() ) ) 2022-12-12 18:37:33.830 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.weight | [256, 1, 3, 3] | 2304 | True 2022-12-12 18:37:33.830 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.bias | [256] | 256 | True 2022-12-12 18:37:33.830 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.weight | [256, 256, 3, 3] | 589824 | True 2022-12-12 18:37:33.831 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.bias | [256] | 256 | True 2022-12-12 18:37:33.831 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.weight | [4864, 256] | 1245184 | True 2022-12-12 18:37:33.831 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.bias | [256] | 256 | True 2022-12-12 18:37:33.831 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.weight | [256] | 256 | True 2022-12-12 18:37:33.832 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.bias | [256] | 256 | True 2022-12-12 18:37:33.832 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.832 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.833 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.833 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.833 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
encoder.encoders.0.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.834 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.834 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.834 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.835 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-12 18:37:33.835 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-12 18:37:33.835 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-12 18:37:33.835 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_2.bias | [256] | 256 | True 2022-12-12 18:37:33.836 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm1.weight | [256] | 256 | True 2022-12-12 18:37:33.836 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm1.bias | [256] | 256 | True 2022-12-12 18:37:33.837 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm2.weight | [256] | 256 | True 2022-12-12 18:37:33.837 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm2.bias | [256] | 256 | True 2022-12-12 18:37:33.837 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.concat_linear.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.837 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.concat_linear.bias | [256] | 256 | 
True
2022-12-12 18:37:33.838 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_q.weight | [256, 256] | 65536 | True
2022-12-12 18:37:33.838 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_q.bias | [256] | 256 | True
2022-12-12 18:37:33.838 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_k.weight | [256, 256] | 65536 | True
2022-12-12 18:37:33.839 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_k.bias | [256] | 256 | True
2022-12-12 18:37:33.839 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_v.weight | [256, 256] | 65536 | True
2022-12-12 18:37:33.839 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_v.bias | [256] | 256 | True
2022-12-12 18:37:33.840 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_out.weight | [256, 256] | 65536 | True
2022-12-12 18:37:33.840 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.self_attn.linear_out.bias | [256] | 256 | True
2022-12-12 18:37:33.840 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.feed_forward.w_1.weight | [256, 2048] | 524288 | True
2022-12-12 18:37:33.841 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.feed_forward.w_1.bias | [2048] | 2048 | True
2022-12-12 18:37:33.841 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.feed_forward.w_2.weight | [2048, 256] | 524288 | True
2022-12-12 18:37:33.841 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.feed_forward.w_2.bias | [256] | 256 | True
2022-12-12 18:37:33.841 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.norm1.weight | [256] | 256 | True
2022-12-12 18:37:33.842 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.norm1.bias | [256] | 256 | True
2022-12-12 18:37:33.842 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.norm2.weight | [256] | 256 | True
2022-12-12 18:37:33.842 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.norm2.bias | [256] | 256 | True
2022-12-12 18:37:33.843 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.concat_linear.weight | [512, 256] | 131072 | True
2022-12-12 18:37:33.843 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.1.concat_linear.bias | [256] | 256 | True
[... the same 18-line parameter block (self_attn linear_q/k/v/out, feed_forward w_1/w_2, norm1, norm2, concat_linear; identical shapes) repeats for encoder.encoders.2 through encoder.encoders.11 ...]
2022-12-12 18:37:33.906 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.embed.0.weight | [4233, 256] | 1083648 | True
2022-12-12 18:37:33.907 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.after_norm.weight | [256] | 256 | True
2022-12-12 18:37:33.907 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.after_norm.bias | [256] | 256 | True
2022-12-12 18:37:33.908 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.output_layer.weight | [256, 4233] | 1083648 | True
2022-12-12 18:37:33.908 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.output_layer.bias | [4233] | 4233 | True
[... decoder.decoders.0 and decoder.decoders.1 each print the analogous block: self_attn and src_attn linear_q/k/v/out ([256, 256] weights, [256] biases), feed_forward.w_1 ([256, 2048]) and w_2 ([2048, 256]), norm1/norm2/norm3 ([256]), concat_linear1 and concat_linear2 ([512, 256] weights, [256] biases) ...]
2022-12-12 18:37:33.930 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_q.weight | [256, 256] | 65536 | True
2022-12-12 18:37:33.930 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.2.self_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.930 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.931 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.931 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.931 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.932 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.932 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.933 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.933 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.933 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.933 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.934 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.934 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.935 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 
- decoder.decoders.2.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.935 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.935 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-12 18:37:33.936 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-12 18:37:33.936 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-12 18:37:33.936 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.bias | [256] | 256 | True 2022-12-12 18:37:33.937 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.weight | [256] | 256 | True 2022-12-12 18:37:33.937 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.bias | [256] | 256 | True 2022-12-12 18:37:33.938 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.weight | [256] | 256 | True 2022-12-12 18:37:33.938 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.bias | [256] | 256 | True 2022-12-12 18:37:33.938 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.weight | [256] | 256 | True 2022-12-12 18:37:33.938 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.bias | [256] | 256 | True 2022-12-12 18:37:33.939 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.939 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.bias | [256] | 256 | True 2022-12-12 18:37:33.940 | 
INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.940 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.bias | [256] | 256 | True 2022-12-12 18:37:33.940 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.940 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.941 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.941 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.942 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.942 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.942 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.943 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.943 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.943 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.944 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 
18:37:33.944 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.944 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.945 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.945 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.945 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.946 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-12 18:37:33.946 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-12 18:37:33.947 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-12 18:37:33.947 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.bias | [256] | 256 | True 2022-12-12 18:37:33.947 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.weight | [256] | 256 | True 2022-12-12 18:37:33.947 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.bias | [256] | 256 | True 2022-12-12 18:37:33.948 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.weight | [256] | 256 | True 2022-12-12 18:37:33.948 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.bias | [256] | 256 | True 2022-12-12 18:37:33.949 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.weight | [256] | 256 | True 2022-12-12 18:37:33.949 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.bias | [256] | 256 | True 2022-12-12 18:37:33.949 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.950 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.bias | [256] | 256 | True 2022-12-12 18:37:33.950 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.950 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.bias | [256] | 256 | True 2022-12-12 18:37:33.951 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.951 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.951 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.952 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.952 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.952 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.953 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.953 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.954 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.954 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.954 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.954 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.955 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.955 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.956 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.956 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.956 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-12 18:37:33.957 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-12 18:37:33.957 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-12 18:37:33.957 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.bias | [256] | 256 | True 2022-12-12 18:37:33.958 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.weight | [256] | 256 | True 2022-12-12 18:37:33.959 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.bias | [256] | 256 | True 2022-12-12 18:37:33.959 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.weight | [256] | 256 | True 2022-12-12 18:37:33.959 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.bias | [256] | 256 | True 2022-12-12 18:37:33.960 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.weight | [256] | 256 | True 2022-12-12 18:37:33.960 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.bias | [256] | 256 | True 2022-12-12 18:37:33.960 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.961 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.bias | [256] | 256 | True 2022-12-12 18:37:33.961 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.961 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.bias | [256] | 256 | True 2022-12-12 18:37:33.961 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.962 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.962 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.963 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.bias | 
[256] | 256 | True 2022-12-12 18:37:33.963 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.963 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.964 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.964 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.964 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.965 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.bias | [256] | 256 | True 2022-12-12 18:37:33.965 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.965 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.bias | [256] | 256 | True 2022-12-12 18:37:33.966 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.966 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.bias | [256] | 256 | True 2022-12-12 18:37:33.966 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-12 18:37:33.967 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.bias | [256] | 256 | True 2022-12-12 18:37:33.967 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.5.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-12 18:37:33.967 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-12 18:37:33.968 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-12 18:37:33.968 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.bias | [256] | 256 | True 2022-12-12 18:37:33.968 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.weight | [256] | 256 | True 2022-12-12 18:37:33.969 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.bias | [256] | 256 | True 2022-12-12 18:37:33.969 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.weight | [256] | 256 | True 2022-12-12 18:37:33.969 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.bias | [256] | 256 | True 2022-12-12 18:37:33.970 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.weight | [256] | 256 | True 2022-12-12 18:37:33.970 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.bias | [256] | 256 | True 2022-12-12 18:37:33.970 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.971 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.bias | [256] | 256 | True 2022-12-12 18:37:33.971 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-12 18:37:33.971 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.bias | [256] | 256 | True 2022-12-12 18:37:33.972 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.weight | [256, 4233] | 1083648 | True 2022-12-12 18:37:33.972 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.bias | [4233] | 4233 | True 2022-12-12 18:37:33.972 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:60 - Total parameters: 411.0, 31.95M elements. 2022-12-12 18:37:33.973 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:262 - Setup model! 2022-12-12 18:37:33.974 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:67 - Instance: WarmupLR {'learning_rate': 0.002, 'verbose': False, 'warmup_steps': 25000}. 2022-12-12 18:37:34.050 | INFO | paddlespeech.s2t.training.optimizer:from_args:109 - <WeightDecay - L2Decay, regularization_coeff=0.000001> 2022-12-12 18:37:34.050 | INFO | paddlespeech.s2t.training.optimizer:from_args:111 - <GradClip - Gradient Clip By GlobalNorm, global_norm=5.000000> 2022-12-12 18:37:34.050 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:67 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 0x7f9ebf701f70>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}. 2022-12-12 18:37:34.052 | INFO | paddlespeech.s2t.training.optimizer:from_args:119 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0) 2022-12-12 18:37:34.052 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler! 2022-12-12 18:37:34.052 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch! 
2022-12-12 18:37:34.287 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams
2022-12-12 18:37:34.288 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt
2022-12-12 18:37:34.288 | INFO | paddlespeech.s2t.exps.u2.model:do_train:160 - Train Total Examples: 1877
============================= The error output starts below ==================================
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)
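The repeated Pillow DeprecationWarnings above come from `spec_augment.py` and are unrelated to the crash; they only clutter the training log. A minimal sketch, assuming nothing else in the pipeline relies on these warnings, of silencing them with the standard-library `warnings` filter (it would need to run before the paddlespeech imports):

```python
import warnings

# Suppress the Pillow BICUBIC deprecation notice emitted by
# paddlespeech/audio/transform/spec_augment.py. The message pattern is
# matched against the start of the warning text.
warnings.filterwarnings(
    "ignore",
    message=r"BICUBIC is deprecated",
    category=DeprecationWarning,
)

# Quick self-check that a matching warning is indeed swallowed:
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.filterwarnings("ignore", message=r"BICUBIC is deprecated",
                            category=DeprecationWarning)
    warnings.warn("BICUBIC is deprecated and will be removed in Pillow 10",
                  DeprecationWarning)
print(len(caught))  # 0
```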


C++ Traceback (most recent call last):

0 arange_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place)
1 paddle::experimental::arange(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place const&)
2 void phi::ArangeKernel&lt;long, phi::GPUContext&gt;(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor*)
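The frames above end in `phi::ArangeKernel`. One hedged hypothesis (not confirmed by this trace alone, and possibly a side effect of having both paddlepaddle 2.3.1 and paddlepaddle-gpu 2.4.0rc0 installed): an arange-style kernel derives its output length by dividing `(end - start)` by `step`, and an integer division by a zero step traps at the CPU as exactly this kind of SIGFPE. A pure-Python model of that length computation, with an explicit guard on the division:

```python
import math

def arange_size(start: int, end: int, step: int) -> int:
    """Model of how an arange kernel computes its element count.
    A zero step would make the division below trap (SIGFPE) if it
    ran unchecked as native integer code."""
    if step == 0:
        raise ZeroDivisionError("arange step must be non-zero")
    return max(0, math.ceil((end - start) / step))

print(arange_size(0, 10, 2))   # 5
print(arange_size(10, 0, -3))  # 4
```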


Error Message Summary:

FatalError: Erroneous arithmetic operation is detected by the operating system. [TimeInfo: Aborted at 1670841456 (unix time) try "date -d @1670841456" if you are using GNU date ] [SignalInfo: SIGFPE (@0x7f9f5575decd) received by PID 5162 (TID 0x7f9fcb8b4700) from PID 1433788109 ]
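The launcher output below reports the failed worker with `code -8`. As a small stdlib-only sketch of how that decodes: in the convention used by Python's process-management modules, a negative return code means the child was killed by signal `abs(code)`, and signal 8 is SIGFPE — the same erroneous-arithmetic signal named in the summary above.

```python
import signal

worker_exit_code = -8  # value reported by the Paddle launcher for rank 0

# Negative exit codes from subprocess/multiprocessing mean
# "terminated by signal -code"; decode it to a readable name.
sig = signal.Signals(-worker_exit_code)
print(sig.name)  # SIGFPE
```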

LAUNCH INFO 2022-12-12 18:37:36,860 Pod failed INFO 2022-12-12 18:37:36,860 controller.py:109] Pod failed LAUNCH ERROR 2022-12-12 18:37:36,860 Container failed !!! Container rank 0 status failed cmd ['/home/navy/miniconda3/bin/python3', '-u', '/home/navy/PaddleSpeech/paddlespeech/s2t/exps/u2/bin/train.py', '--ngpu', '1', '--seed', '0', '--config', 'conf/transformer.yaml', '--output', 'exp/transformer', '--profiler-options', '', '--benchmark-batch-size', '0', '--benchmark-max-step', '0'] code -8 log log/workerlog.0 env {'LC_PAPER': 'lzh_TW.UTF-8', 'XDG_VTNR': '7', 'LC_ADDRESS': 'lzh_TW.UTF-8', 'XDG_SESSION_ID': 'c2', 'TERM_PROGRAM': 'vscode', 'LC_MONETARY': 'lzh_TW.UTF-8', 'XDG_GREETER_DATA_DIR': '/var/lib/lightdm-data/navy', 'CLUTTER_IM_MODULE': 'xim', 'GIO_LAUNCHED_DESKTOP_FILE_PID': '4148', 'SESSION': 'ubuntu', 'GPG_AGENT_INFO': '/home/navy/.gnupg/S.gpg-agent:0:1', 'TERM': 'xterm-256color', 'XDG_MENU_PREFIX': 'gnome-', 'SHELL': '/bin/bash', 'CONDA_SHLVL': '1', 'QT_LINUX_ACCESSIBILITY_ALWAYS_ON': '1', 'TERM_PROGRAM_VERSION': '1.74.0', 'CONDA_PROMPT_MODIFIER': '(base) ', 'LC_NUMERIC': 'lzh_TW.UTF-8', 'ORIGINAL_XDG_CURRENT_DESKTOP': 'Unity', 'UPSTART_SESSION': 'unix:abstract=/com/ubuntu/upstart-session/1000/1570', 'GTK_MODULES': 'gail:atk-bridge:unity-gtk-module', 'LC_ALL': 'C', 'USER': 'navy', 'PYTHONIOENCODING': 'UTF-8', 'SRILM': '/home/navy/PaddleSpeech/tools/srilm', 'LS_COLORS': 
'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:.tar=01;31:.tgz=01;31:.arc=01;31:.arj=01;31:.taz=01;31:.lha=01;31:.lz4=01;31:.lzh=01;31:.lzma=01;31:.tlz=01;31:.txz=01;31:.tzo=01;31:.t7z=01;31:.zip=01;31:.z=01;31:.Z=01;31:.dz=01;31:.gz=01;31:.lrz=01;31:.lz=01;31:.lzo=01;31:.xz=01;31:.bz2=01;31:.bz=01;31:.tbz=01;31:.tbz2=01;31:.tz=01;31:.deb=01;31:.rpm=01;31:.jar=01;31:.war=01;31:.ear=01;31:.sar=01;31:.rar=01;31:.alz=01;31:.ace=01;31:.zoo=01;31:.cpio=01;31:.7z=01;31:.rz=01;31:.cab=01;31:.jpg=01;35:.jpeg=01;35:.gif=01;35:.bmp=01;35:.pbm=01;35:.pgm=01;35:.ppm=01;35:.tga=01;35:.xbm=01;35:.xpm=01;35:.tif=01;35:.tiff=01;35:.png=01;35:.svg=01;35:.svgz=01;35:.mng=01;35:.pcx=01;35:.mov=01;35:.mpg=01;35:.mpeg=01;35:.m2v=01;35:.mkv=01;35:.webm=01;35:.ogm=01;35:.mp4=01;35:.m4v=01;35:.mp4v=01;35:.vob=01;35:.qt=01;35:.nuv=01;35:.wmv=01;35:.asf=01;35:.rm=01;35:.rmvb=01;35:.flc=01;35:.avi=01;35:.fli=01;35:.flv=01;35:.gl=01;35:.dl=01;35:.xcf=01;35:.xwd=01;35:.yuv=01;35:.cgm=01;35:.emf=01;35:.ogv=01;35:.ogx=01;35:.aac=00;36:.au=00;36:.flac=00;36:.m4a=00;36:.mid=00;36:.midi=00;36:.mka=00;36:.mp3=00;36:.mpc=00;36:.ogg=00;36:.ra=00;36:.wav=00;36:.oga=00;36:.opus=00;36:.spx=00;36:.xspf=00;36:', 'LD_LIBRARY_PATH': '/usr/local/cuda-10.2/lib64:/usr/local/cuda-10.2/lib64::/usr/local/lib/:/home/navy/PaddleSpeech/tools/liblbfgs-1.10/lib/.libs', 'LC_TELEPHONE': 'lzh_TW.UTF-8', 'QT_ACCESSIBILITY': '1', 'BIN_DIR': '/home/navy/PaddleSpeech/paddlespeech/s2t/exps/u2/bin', 'CONDA_EXE': '/home/navy/miniconda3/bin/conda', 'UNITY_HAS_3D_SUPPORT': 'false', 'XDG_SESSION_PATH': '/org/freedesktop/DisplayManager/Session0', 'XDG_SEAT_PATH': '/org/freedesktop/DisplayManager/Seat0', 'LIBLBFGS': '/home/navy/PaddleSpeech/tools/liblbfgs-1.10', 'SSH_AUTH_SOCK': '/run/user/1000/keyring/ssh', 'MAIN_ROOT': '/home/navy/PaddleSpeech', 'SESSION_MANAGER': 
'local/navy:@/tmp/.ICE-unix/1817,unix/navy:/tmp/.ICE-unix/1817', 'DEFAULTS_PATH': '/usr/share/gconf/ubuntu.default.path', 'FLAGS_allocator_strategy': 'naive_best_fit', 'GIO_LAUNCHED_DESKTOP_FILE': '/usr/share/applications/code.desktop', '_CE_CONDA': '', 'UNITY_DEFAULT_PROFILE': 'unity-lowgfx', 'XDG_CONFIG_DIRS': '/etc/xdg/xdg-ubuntu:/usr/share/upstart/xdg:/etc/xdg', 'DESKTOP_SESSION': 'ubuntu', 'PATH': '/home/navy/PaddleSpeech/examples/aishell/asr1/utils/:/home/navy/PaddleSpeech/tools/kaldi/tools/openfst/bin:/home/navy/PaddleSpeech/examples/aishell/asr1:/home/navy/PaddleSpeech:/home/navy/PaddleSpeech/utils:/usr/local/cuda-10.2/bin:/home/navy/bin:/home/navy/.local/bin:/usr/local/cuda-10.2/bin:/home/navy/miniconda3/bin:/home/navy/miniconda3/condabin:/home/navy/bin:/home/navy/.local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/home/navy/PaddleSpeech/tools/srilm/bin:/home/navy/PaddleSpeech/tools/srilm/bin/i686-m64', 'QT_QPA_PLATFORMTHEME': 'appmenu-qt5', 'QT_IM_MODULE': 'fcitx', 'CONDA_PREFIX': '/home/navy/miniconda3', 'LC_IDENTIFICATION': 'lzh_TW.UTF-8', 'JOB': 'unity-settings-daemon', 'XDG_SESSION_TYPE': 'x11', 'PWD': '/home/navy/PaddleSpeech/examples/aishell/asr1', 'XMODIFIERS': '@im=fcitx', 'CUDA_VISIBLE_DEVICES': '0', 'LANG': 'en_US.UTF-8', 'GDK_BACKEND': 'x11', 'GDM_LANG': 'en_US', 'MANDATORY_PATH': '/usr/share/gconf/ubuntu.mandatory.path', 'LC_MEASUREMENT': 'lzh_TW.UTF-8', 'VSCODE_GIT_ASKPASS_EXTRA_ARGS': '--ms-enable-electron-run-as-node', 'CHROME_DESKTOP': 'code-url-handler.desktop', 'COMPIZ_CONFIG_PROFILE': 'ubuntu', 'IM_CONFIG_PHASE': '1', 'PAPERSIZE': 'a4', 'PYTHONDONTWRITEBYTECODE': '1', 'GDMSESSION': 'ubuntu', '_CE_M': '', 'GTK2_MODULES': 'overlay-scrollbar', 'SESSIONTYPE': 'gnome-session', 'HOME': '/home/navy', 'XDG_SEAT': 'seat0', 'SHLVL': '4', 'VSCODE_GIT_ASKPASS_MAIN': '/usr/share/code/resources/app/extensions/git/dist/askpass-main.js', 'LANGUAGE': 'en_US', 'GNOME_DESKTOP_SESSION_ID': 
'this-is-deprecated', 'LIBGL_ALWAYS_SOFTWARE': '1', 'CONDA_PYTHON_EXE': '/home/navy/miniconda3/bin/python', 'UPSTART_EVENTS': 'xsession started', 'LOGNAME': 'navy', 'XDG_SESSION_DESKTOP': 'ubuntu', 'PYTHONPATH': '/home/navy/PaddleSpeech:', 'COMPIZ_BIN_PATH': '/usr/bin/', 'KALDI_ROOT': '/home/navy/PaddleSpeech/tools/kaldi', 'VSCODE_GIT_IPC_HANDLE': '/run/user/1000/vscode-git-f991a59250.sock', 'DBUS_SESSION_BUS_ADDRESS': 'unix:abstract=/tmp/dbus-u75uREnUpU', 'XDG_DATA_DIRS': '/usr/share/ubuntu:/usr/share/gnome:/usr/local/share:/usr/share:/var/lib/snapd/desktop', 'QT4_IM_MODULE': 'fcitx', 'LESSOPEN': '| /usr/bin/lesspipe %s', 'CONDA_DEFAULT_ENV': 'base', 'VSCODE_GIT_ASKPASS_NODE': '/usr/share/code/code', 'GIT_ASKPASS': '/usr/share/code/resources/app/extensions/git/dist/askpass.sh', 'UPSTART_JOB': 'unity7', 'DISPLAY': ':0', 'XDG_RUNTIME_DIR': '/run/user/1000', 'XDG_CURRENT_DESKTOP': 'Unity', 'GTK_IM_MODULE': 'fcitx', 'LESSCLOSE': '/usr/bin/lesspipe %s %s', 'LC_TIME': 'lzh_TW.UTF-8', 'COLORTERM': 'truecolor', 'LC_NAME': 'lzhTW.UTF-8', 'XAUTHORITY': '/home/navy/.Xauthority', '': '/home/navy/miniconda3/bin/python3', 'CUSTOM_DEVICE_ROOT': '', 'OMP_NUM_THREADS': '1', 'POD_NAME': 'gyojcc', 'PADDLE_MASTER': '127.0.1.1:44854', 'PADDLE_GLOBAL_SIZE': '1', 'PADDLE_LOCAL_SIZE': '1', 'PADDLE_GLOBAL_RANK': '0', 'PADDLE_LOCAL_RANK': '0', 'PADDLE_NNODES': '1', 'PADDLE_TRAINER_ENDPOINTS': '127.0.1.1:44855', 'PADDLE_CURRENT_ENDPOINT': '127.0.1.1:44855', 'PADDLE_TRAINER_ID': '0', 'PADDLE_TRAINERS_NUM': '1', 'PADDLE_RANK_IN_NODE': '0', 'FLAGS_selected_gpus': '0'} ERROR 2022-12-12 18:37:36,860 controller.py:110] Container failed !!! 
Container rank 0 status failed cmd ['/home/navy/miniconda3/bin/python3', '-u', '/home/navy/PaddleSpeech/paddlespeech/s2t/exps/u2/bin/train.py', '--ngpu', '1', '--seed', '0', '--config', 'conf/transformer.yaml', '--output', 'exp/transformer', '--profiler-options', '', '--benchmark-batch-size', '0', '--benchmark-max-step', '0'] code -8 log log/workerlog.0 env {...} LAUNCH INFO 2022-12-12 18:37:36,861 ------------------------- ERROR LOG DETAIL ------------------------- INFO 2022-12-12 18:37:36,861 controller.py:111] ------------------------- ERROR LOG DETAIL ------------------------- | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:67 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 
0x7f9ebf701f70>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}. 2022-12-12 18:37:34.052 | INFO | paddlespeech.s2t.training.optimizer:from_args:119 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0) 2022-12-12 18:37:34.052 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler! 2022-12-12 18:37:34.052 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch! 2022-12-12 18:37:34.287 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams 2022-12-12 18:37:34.288 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt 2022-12-12 18:37:34.288 | INFO | paddlespeech.s2t.exps.u2.model:do_train:160 - Train Total Examples: 1877 /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC) /home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead. Image.BICUBIC)


C++ Traceback (most recent call last):

0 arange_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place)
1 paddle::experimental::arange(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place const&)
2 void phi::ArangeKernel<long, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor*)


Error Message Summary:

FatalError: Erroneous arithmetic operation is detected by the operating system. [TimeInfo: Aborted at 1670841456 (unix time) try "date -d @1670841456" if you are using GNU date ] [SignalInfo: SIGFPE (@0x7f9f5575decd) received by PID 5162 (TID 0x7f9fcb8b4700) from PID 1433788109 ]

LAUNCH INFO 2022-12-12 18:37:36,861 Exit code -8 INFO 2022-12-12 18:37:36,861 controller.py:141] Exit code -8
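The traceback shows the SIGFPE (exit code -8) being raised inside phi::ArangeKernel, Paddle's arange implementation. A common way an integer kernel like this hits SIGFPE is an integer division by zero while computing the output length from (stop - start) / step. The sketch below is purely illustrative and is not PaddlePaddle's actual code; the function name `arange_size` is invented for this example. Python raises ZeroDivisionError where C++ integer division would deliver SIGFPE, which is what the guard below demonstrates.

```python
# Illustrative sketch only (hypothetical, not Paddle's implementation): how an
# arange-style kernel derives its output length, and the zero-step guard a
# C++ kernel would need to avoid SIGFPE on integer division.
import math

def arange_size(start: int, stop: int, step: int) -> int:
    """Number of elements an integer arange(start, stop, step) would produce."""
    if step == 0:
        # In C++, (stop - start) / 0 on integers is the classic SIGFPE
        # ("erroneous arithmetic operation"); here we raise instead.
        raise ValueError("arange step must be non-zero")
    return max(0, math.ceil((stop - start) / step))

print(arange_size(0, 10, 2))  # -> 5
```

If the inputs to arange are themselves derived from corrupted feature lengths or an incompatible paddlepaddle/CUDA build, a zero or nonsensical step can reach the kernel, which is consistent with the mixed paddlepaddle 2.3.1 / paddlepaddle-gpu 2.4.0rc0 install reported above.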

zxcd commented 1 year ago

The Python version I am using is 3.7.0; perhaps you could try matching that and see whether it helps.

navy7913 commented 1 year ago

2. bash run.sh --stage 0 --stop_stage 1 --conf_path conf/transformer.yaml

Hello, after switching the Python version to 3.7.0 the problem is the same and training still fails. 2022-12-17 13:39:31.508 | DEBUG | paddlespeech.s2t::41 - register user softmax to paddle, remove this when fixed! 2022-12-17 13:39:31.508 | DEBUG | paddlespeech.s2t::45 - register user log_softmax to paddle, remove this when fixed! 2022-12-17 13:39:31.508 | DEBUG | paddlespeech.s2t::49 - register user sigmoid to paddle, remove this when fixed! 2022-12-17 13:39:31.508 | DEBUG | paddlespeech.s2t::53 - register user log_sigmoid to paddle, remove this when fixed! 2022-12-17 13:39:31.509 | DEBUG | paddlespeech.s2t::57 - register user relu to paddle, remove this when fixed! 2022-12-17 13:39:31.509 | DEBUG | paddlespeech.s2t::67 - override cat of paddle if exists or register, remove this when fixed! 2022-12-17 13:39:31.509 | DEBUG | paddlespeech.s2t::89 - override long of paddle.Tensor if exists or register, remove this when fixed! 2022-12-17 13:39:31.509 | DEBUG | paddlespeech.s2t::111 - override new_full of paddle.Tensor if exists or register, remove this when fixed! 2022-12-17 13:39:31.509 | DEBUG | paddlespeech.s2t::123 - override contiguous of paddle.Tensor if exists or register, remove this when fixed! 2022-12-17 13:39:31.509 | DEBUG | paddlespeech.s2t::134 - register user view to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::145 - register user view_as to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::186 - register user masked_fill to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::205 - register user maskedfill to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::229 - register user repeat to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::235 - register user softmax to paddle.Tensor, remove this when fixed! 
2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::240 - register user sigmoid to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.510 | DEBUG | paddlespeech.s2t::244 - register user relu to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.511 | DEBUG | paddlespeech.s2t::254 - register user type_as to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.511 | DEBUG | paddlespeech.s2t::270 - register user to to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.511 | DEBUG | paddlespeech.s2t::281 - register user float to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:31.511 | DEBUG | paddlespeech.s2t::291 - register user int to paddle.Tensor, remove this when fixed! 2022-12-17 13:39:34.443 | INFO | paddlespeech.s2t.utils.utility:all_version:45 - Deps Module Version:[('python', '3.7.0 (default, Oct 9 2018, 10:31:47) \n[GCC 7.3.0]'), ('paddle', '2.4.0-rc0'), ('paddle_commit', '083853cd4e4a9bdad22c70fa48eb9a036d2def27'), ('soundfile', '0.11.0')] 2022-12-17 13:39:34.443 | INFO | paddlespeech.s2t.training.trainer:init:116 - Rank: 0/1 2022-12-17 13:39:36.385 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq 2022-12-17 13:39:36.687 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 120098 2022-12-17 13:39:36.773 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 30025 2022-12-17 13:39:36.947 | WARNING | paddlespeech.s2t.io.reader:init:76 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-17 13:39:37.122 | INFO | 
paddlespeech.s2t.io.batchfy:make_batchset:400 - count is auto detected as seq 2022-12-17 13:39:37.141 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:424 - # utts: 14326 2022-12-17 13:39:37.153 | INFO | paddlespeech.s2t.io.batchfy:make_batchset:467 - # minibatches: 3582 2022-12-17 13:39:37.158 | WARNING | paddlespeech.s2t.io.reader:init:76 - [Experimental feature] Some preprocessing will be done for the mini-batch creation using Transformation( 0: LogMelSpectrogramKaldi(fs=16000, n_mels=80, n_frame_shift=10.0, n_frame_length=25.0, dither=0.1)) 1: GlobalCMVN( cmvn_path=data/mean_std.json, norm_means=True, norm_vars=True,) 2: TimeWarp(max_time_warp=5, inplace=True, mode=PIL) 3: FreqMask(F=30, n_mask=2, replace_with_zero=False, inplace=True) 4: TimeMask(T=40, n_mask=2, replace_with_zero=False, inplace=True)) 2022-12-17 13:39:37.158 | INFO | paddlespeech.s2t.exps.u2.model:setup_dataloader:233 - Setup train/valid Dataloader! 2022-12-17 13:39:37.159 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:901 - U2 Encoder type: transformer 2022-12-17 13:39:37.300 | DEBUG | paddlespeech.s2t.models.u2.u2:_init_from_config:913 - U2 Decoder type: transformer 2022-12-17 13:39:37.416 | DEBUG | paddlespeech.s2t.modules.loss:init:41 - CTCLoss Loss reduction: sum, div-bs: True 2022-12-17 13:39:37.417 | DEBUG | paddlespeech.s2t.modules.loss:init:42 - CTCLoss Grad Norm Type: None 2022-12-17 13:39:37.417 | DEBUG | paddlespeech.s2t.modules.loss:init:74 - CTCLoss() kwargs:{'norm_by_times': False}, not support: {'norm_by_batchsize': False, 'norm_by_total_logits_len': False} 2022-12-17 13:39:37.421 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:259 - U2Model( (encoder): TransformerEncoder( (embed): Conv2dSubsampling4( (pos_enc): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) (conv): Sequential( (0): Conv2D(1, 256, kernel_size=[3, 3], stride=[2, 2], data_format=NCHW) (1): ReLU() (2): Conv2D(256, 256, kernel_size=[3, 3], stride=[2, 2], 
data_format=NCHW) (3): ReLU() ) (out): Sequential( (0): Linear(in_features=4864, out_features=256, dtype=float32) ) ) (after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12) (encoders): LayerList( (0): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (1): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, 
out_features=256, dtype=float32) ) (2): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (3): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (4): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) 
(linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (5): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (6): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, 
out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (7): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (8): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, 
out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (9): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (10): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, 
dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) (11): TransformerEncoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear): Linear(in_features=512, out_features=256, dtype=float32) ) ) ) (decoder): TransformerDecoder( (embed): Sequential( (0): Embedding(4233, 256, sparse=False) (1): PositionalEncoding( (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) ) ) (after_norm): LayerNorm(normalized_shape=[256], epsilon=1e-12) (output_layer): Linear(in_features=256, out_features=4233, dtype=float32) (decoders): LayerList( (0): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, 
out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12) (norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12) (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (concat_linear1): Linear(in_features=512, out_features=256, dtype=float32) (concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ) (1): DecoderLayer( (self_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (src_attn): MultiHeadedAttention( (linear_q): Linear(in_features=256, out_features=256, dtype=float32) (linear_k): Linear(in_features=256, out_features=256, dtype=float32) (linear_v): Linear(in_features=256, out_features=256, dtype=float32) (linear_out): Linear(in_features=256, out_features=256, dtype=float32) (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) ) (feed_forward): PositionwiseFeedForward( (w_1): Linear(in_features=256, out_features=2048, dtype=float32) (activation): ReLU() (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train) (w_2): Linear(in_features=2048, out_features=256, dtype=float32) ) (norm1): 
LayerNorm(normalized_shape=[256], epsilon=1e-12)
(norm2): LayerNorm(normalized_shape=[256], epsilon=1e-12)
(norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12)
(dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train)
(concat_linear1): Linear(in_features=512, out_features=256, dtype=float32)
(concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) )
[... (2)-(5): DecoderLayer blocks identical to the one above: (self_attn)/(src_attn): MultiHeadedAttention with (linear_q)/(linear_k)/(linear_v)/(linear_out): Linear(in_features=256, out_features=256, dtype=float32) and (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train); (feed_forward): PositionwiseFeedForward with (w_1): Linear(in_features=256, out_features=2048, dtype=float32), (activation): ReLU(), (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train), (w_2): Linear(in_features=2048, out_features=256, dtype=float32); (norm1)/(norm2)/(norm3): LayerNorm(normalized_shape=[256], epsilon=1e-12); (dropout): Dropout(p=0.1, axis=None, mode=upscale_in_train); (concat_linear1)/(concat_linear2): Linear(in_features=512, out_features=256, dtype=float32) ...] ) )
(ctc): CTCDecoderBase( (dropout): Dropout(p=0.0, axis=None, mode=upscale_in_train) (ctc_lo): Linear(in_features=256, out_features=4233, dtype=float32) (criterion): CTCLoss( (loss): CTCLoss() ) )
(criterion_att): LabelSmoothingLoss( (criterion): KLDivLoss() ) )
2022-12-17 13:39:37.421 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.weight | [256, 1, 3, 3] | 2304 | True
2022-12-17 13:39:37.422 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.0.bias | [256] | 256 | True
2022-12-17 13:39:37.422 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.weight | [256, 256, 3, 3] | 589824 | True
2022-12-17 13:39:37.422 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.conv.2.bias | [256] | 256 | True
2022-12-17 13:39:37.423 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.weight | [4864, 256] | 1245184 | True
2022-12-17 13:39:37.423 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.embed.out.0.bias | [256] | 256 | True
2022-12-17 13:39:37.423 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.weight | [256] | 256 | True
2022-12-17 13:39:37.423 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.after_norm.bias | [256] | 256 | True
2022-12-17 13:39:37.424 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.424 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_q.bias | [256] | 256 | True
2022-12-17 13:39:37.425 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.425 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_k.bias | [256] | 256 | True
2022-12-17 13:39:37.425 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_v.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.425 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_v.bias | [256] | 256 | True
2022-12-17 13:39:37.426 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_out.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.426 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.self_attn.linear_out.bias | [256] | 256 | True
2022-12-17 13:39:37.426 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_1.weight | [256, 2048] | 524288 | True
2022-12-17 13:39:37.427 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_1.bias | [2048] | 2048 | True
2022-12-17 13:39:37.427 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_2.weight | [2048, 256] | 524288 | True
2022-12-17 13:39:37.427 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.feed_forward.w_2.bias | [256] | 256 | True
2022-12-17 13:39:37.428 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm1.weight | [256] | 256 | True
2022-12-17 13:39:37.428 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm1.bias | [256] | 256 | True
2022-12-17 13:39:37.429 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm2.weight | [256] | 256 | True
2022-12-17 13:39:37.429 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.norm2.bias | [256] | 256 | True
2022-12-17 13:39:37.429 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.concat_linear.weight | [512, 256] | 131072 | True
2022-12-17 13:39:37.430 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.0.concat_linear.bias | [256] | 256 | True
[... the same 18 parameter lines repeat with identical shapes and counts for encoder.encoders.1 through encoder.encoders.10 (2022-12-17 13:39:37.430 - 13:39:37.491) ...]
2022-12-17 13:39:37.491 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_q.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.492 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_q.bias | [256] | 256 | True
2022-12-17 13:39:37.492 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_k.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.492 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_k.bias | [256] | 256 | True
2022-12-17 13:39:37.493 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_v.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.493 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_v.bias | [256] | 256 | True
2022-12-17 13:39:37.493 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_out.weight | [256, 256] | 65536 | True
2022-12-17 13:39:37.494 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.self_attn.linear_out.bias | [256] | 256 | True
2022-12-17 13:39:37.494 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_1.weight | [256, 2048] | 524288 | True
2022-12-17 13:39:37.494 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_1.bias | [2048] | 2048 | True
2022-12-17 13:39:37.495 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_2.weight | [2048, 256] | 524288 | True
2022-12-17 13:39:37.495 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.feed_forward.w_2.bias | [256] | 256 | True
2022-12-17 13:39:37.495 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm1.weight | [256] | 256 | True
2022-12-17 13:39:37.496 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm1.bias | [256] | 256 | True
2022-12-17 13:39:37.496 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm2.weight | [256] | 256 | True
2022-12-17 13:39:37.496 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.norm2.bias | [256] | 256 | True
2022-12-17 13:39:37.497 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.concat_linear.weight | [512, 256] | 131072 | True
2022-12-17 13:39:37.497 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - encoder.encoders.11.concat_linear.bias |
[256] | 256 | True 2022-12-17 13:39:37.497 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.embed.0.weight | [4233, 256] | 1083648 | True 2022-12-17 13:39:37.498 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.after_norm.weight | [256] | 256 | True 2022-12-17 13:39:37.498 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.after_norm.bias | [256] | 256 | True 2022-12-17 13:39:37.499 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.output_layer.weight | [256, 4233] | 1083648 | True 2022-12-17 13:39:37.499 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.output_layer.bias | [4233] | 4233 | True 2022-12-17 13:39:37.499 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.500 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.500 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.500 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.501 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.501 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.501 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.502 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.self_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.502 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.502 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.503 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.503 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.504 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.504 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.504 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.505 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.src_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.505 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-17 13:39:37.505 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-17 13:39:37.506 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-17 13:39:37.506 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.feed_forward.w_2.bias | [256] | 256 | True 2022-12-17 13:39:37.506 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm1.weight | [256] | 256 | True 2022-12-17 13:39:37.507 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm1.bias | [256] | 256 | True 2022-12-17 13:39:37.507 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm2.weight | [256] | 256 | True 2022-12-17 13:39:37.507 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm2.bias | [256] | 256 | True 2022-12-17 13:39:37.508 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm3.weight | [256] | 256 | True 2022-12-17 13:39:37.508 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.norm3.bias | [256] | 256 | True 2022-12-17 13:39:37.508 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.509 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear1.bias | [256] | 256 | True 2022-12-17 13:39:37.509 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.510 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.0.concat_linear2.bias | [256] | 256 | True 2022-12-17 13:39:37.510 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.510 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.511 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.511 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.511 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.1.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.512 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.512 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.512 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.self_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.513 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.513 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.513 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.514 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.514 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.514 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.515 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.515 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.src_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.515 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-17 13:39:37.516 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-17 13:39:37.516 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-17 13:39:37.516 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.feed_forward.w_2.bias | [256] | 256 | True 2022-12-17 13:39:37.517 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm1.weight | [256] | 256 | True 2022-12-17 13:39:37.518 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm1.bias | [256] | 256 | True 2022-12-17 13:39:37.518 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm2.weight | [256] | 256 | True 2022-12-17 13:39:37.518 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm2.bias | [256] | 256 | True 2022-12-17 13:39:37.519 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm3.weight | [256] | 256 | True 2022-12-17 13:39:37.519 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.norm3.bias | [256] | 256 | True 2022-12-17 13:39:37.519 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.520 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear1.bias | [256] | 256 | True 2022-12-17 13:39:37.520 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.520 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.1.concat_linear2.bias | [256] | 256 | True 2022-12-17 13:39:37.521 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_q.weight | [256, 256] 
| 65536 | True 2022-12-17 13:39:37.521 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.522 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.523 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.523 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.523 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.self_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.524 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.524 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.524 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_v.bias | 
[256] | 256 | True 2022-12-17 13:39:37.525 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.526 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.src_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.526 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-17 13:39:37.527 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-17 13:39:37.527 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-17 13:39:37.527 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.feed_forward.w_2.bias | [256] | 256 | True 2022-12-17 13:39:37.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.weight | [256] | 256 | True 2022-12-17 13:39:37.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm1.bias | [256] | 256 | True 2022-12-17 13:39:37.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.weight | [256] | 256 | True 2022-12-17 13:39:37.528 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm2.bias | [256] | 256 | True 2022-12-17 13:39:37.529 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.weight | [256] | 256 | True 2022-12-17 13:39:37.529 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.norm3.bias | [256] | 256 | True 2022-12-17 13:39:37.530 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.532 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear1.bias | [256] | 256 | True 2022-12-17 13:39:37.533 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.533 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.2.concat_linear2.bias | [256] | 256 | True 2022-12-17 13:39:37.534 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.534 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.535 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.535 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.536 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.536 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.536 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.537 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.self_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.537 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.537 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.538 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.538 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.538 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.539 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.539 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.539 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.src_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.540 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-17 13:39:37.540 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-17 13:39:37.540 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-17 13:39:37.541 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.feed_forward.w_2.bias | [256] | 256 | True 2022-12-17 13:39:37.541 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.weight | [256] | 256 | True 2022-12-17 13:39:37.541 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm1.bias | [256] | 256 | True 2022-12-17 13:39:37.542 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.weight | [256] | 256 | True 2022-12-17 13:39:37.542 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm2.bias | [256] | 256 | True 2022-12-17 13:39:37.543 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.weight | [256] | 256 | True 2022-12-17 13:39:37.543 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.norm3.bias | [256] | 256 | True 2022-12-17 13:39:37.543 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.544 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear1.bias | [256] | 256 | True 2022-12-17 13:39:37.544 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.544 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.3.concat_linear2.bias | [256] | 256 | True 2022-12-17 13:39:37.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.545 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.546 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.546 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.546 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - 
decoder.decoders.4.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.self_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.547 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.548 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.548 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.549 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.549 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.550 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.550 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.src_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-17 13:39:37.551 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-17 13:39:37.553 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-17 13:39:37.554 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.feed_forward.w_2.bias | [256] | 256 | True 2022-12-17 13:39:37.554 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.weight | [256] | 256 | True 2022-12-17 13:39:37.554 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm1.bias | [256] | 256 | True 2022-12-17 13:39:37.555 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.weight | [256] | 256 | True 2022-12-17 13:39:37.555 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm2.bias | [256] | 256 | True 2022-12-17 13:39:37.555 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.weight | [256] | 256 | True 2022-12-17 13:39:37.556 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.norm3.bias | [256] | 256 | True 2022-12-17 13:39:37.556 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.556 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear1.bias | [256] | 256 | True 2022-12-17 13:39:37.557 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.557 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.4.concat_linear2.bias | [256] | 256 | True 2022-12-17 13:39:37.558 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.558 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.558 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.weight | [256, 256] 
| 65536 | True 2022-12-17 13:39:37.559 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.559 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.559 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.560 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.560 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.self_attn.linear_out.bias | [256] | 256 | True 2022-12-17 13:39:37.560 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.561 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_q.bias | [256] | 256 | True 2022-12-17 13:39:37.561 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.561 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_k.bias | [256] | 256 | True 2022-12-17 13:39:37.562 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.562 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_v.bias | [256] | 256 | True 2022-12-17 13:39:37.562 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.weight | [256, 256] | 65536 | True 2022-12-17 13:39:37.563 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.src_attn.linear_out.bias | 
[256] | 256 | True 2022-12-17 13:39:37.563 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_1.weight | [256, 2048] | 524288 | True 2022-12-17 13:39:37.563 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_1.bias | [2048] | 2048 | True 2022-12-17 13:39:37.564 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.weight | [2048, 256] | 524288 | True 2022-12-17 13:39:37.564 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.feed_forward.w_2.bias | [256] | 256 | True 2022-12-17 13:39:37.564 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.weight | [256] | 256 | True 2022-12-17 13:39:37.565 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm1.bias | [256] | 256 | True 2022-12-17 13:39:37.565 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.weight | [256] | 256 | True 2022-12-17 13:39:37.566 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm2.bias | [256] | 256 | True 2022-12-17 13:39:37.566 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.weight | [256] | 256 | True 2022-12-17 13:39:37.567 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.norm3.bias | [256] | 256 | True 2022-12-17 13:39:37.568 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.568 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear1.bias | [256] | 256 | True 2022-12-17 13:39:37.568 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.weight | [512, 256] | 131072 | True 2022-12-17 13:39:37.569 | INFO | 
paddlespeech.s2t.utils.layer_tools:print_params:57 - decoder.decoders.5.concat_linear2.bias | [256] | 256 | True
2022-12-17 13:39:37.569 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.weight | [256, 4233] | 1083648 | True
2022-12-17 13:39:37.569 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:57 - ctc.ctc_lo.bias | [4233] | 4233 | True
2022-12-17 13:39:37.570 | INFO | paddlespeech.s2t.utils.layer_tools:print_params:60 - Total parameters: 411.0, 31.95M elements.
2022-12-17 13:39:37.570 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:262 - Setup model!
2022-12-17 13:39:37.571 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: WarmupLR {'learning_rate': 0.002, 'verbose': False, 'warmup_steps': 25000}.
2022-12-17 13:39:37.604 | INFO | paddlespeech.s2t.training.optimizer:from_args:109 - <WeightDecay - L2Decay, regularization_coeff=0.000001>
2022-12-17 13:39:37.605 | INFO | paddlespeech.s2t.training.optimizer:from_args:111 - <GradClip - Gradient Clip By GlobalNorm, global_norm=5.000000>
2022-12-17 13:39:37.605 | INFO | paddlespeech.s2t.utils.dynamic_import:instance_class:68 - Instance: Adam {'grad_clip': ClipGradByGlobalNormWithLog(global_clip_norm=5.0), 'weight_decay': <paddle.regularizer.L2Decay object at 0x7f578b8cf0b8>, 'learning_rate': WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)}.
2022-12-17 13:39:37.606 | INFO | paddlespeech.s2t.training.optimizer:from_args:120 - LR: WarmupLR(warmup_steps=25000, lr=0.002, last_epoch=0)
2022-12-17 13:39:37.606 | INFO | paddlespeech.s2t.exps.u2.model:setup_model:308 - Setup optimizer/lr_scheduler!
2022-12-17 13:39:37.607 | INFO | paddlespeech.s2t.training.trainer:resume_or_scratch:221 - Init from scratch!
2022-12-17 13:39:37.792 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:286 - Saved model to exp/transformer/checkpoints/init.pdparams
2022-12-17 13:39:37.793 | INFO | paddlespeech.s2t.utils.checkpoint:_save_parameters:292 - Saved optimzier state to exp/transformer/checkpoints/init.pdopt
2022-12-17 13:39:37.793 | INFO | paddlespeech.s2t.exps.u2.model:do_train:161 - Train Total Examples: 30025

/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:49: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)
/home/navy/PaddleSpeech/paddlespeech/audio/transform/spec_augment.py:51: DeprecationWarning: BICUBIC is deprecated and will be removed in Pillow 10 (2023-07-01). Use Resampling.BICUBIC instead.
  Image.BICUBIC)


C++ Traceback (most recent call last):

0 arange_ad_func(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place)
1 paddle::experimental::arange(paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::Tensor const&, paddle::experimental::DataType, phi::Place const&)
2 void phi::ArangeKernel<long, phi::GPUContext>(phi::GPUContext const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor const&, phi::DenseTensor*)


Error Message Summary:

FatalError: Erroneous arithmetic operation is detected by the operating system.
[TimeInfo: Aborted at 1671255578 (unix time) try "date -d @1671255578" if you are using GNU date ]
[SignalInfo: SIGFPE (@0x7f5828a0d16d) received by PID 16481 (TID 0x7f58a059b700) from PID 681628013 ]
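(For context: SIGFPE from an integer `ArangeKernel<long>` is typically an integer division by zero inside the kernel, i.e. the `step` tensor fed to `paddle.arange` evaluating to 0. The helper below is a hypothetical pure-Python mirror of the length computation such a kernel performs, only to show where the divide-by-zero comes from; the function name is mine, not Paddle API.)

```python
def arange_length(start, end, step):
    """Number of elements an integer arange(start, end, step) yields.
    An integer step of 0 raises ZeroDivisionError here -- the same
    division that surfaces as SIGFPE inside a C++ kernel."""
    if step == 0:
        raise ZeroDivisionError("arange step must be non-zero")
    # ceil((end - start) / step), clamped at zero, in integer arithmetic
    return max(0, -(-(end - start) // step))
```

So if this crash reproduces, it is worth checking what lengths/steps the data pipeline hands to `paddle.arange` (e.g. a zero-length feature or mask) before assuming a framework bug.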

LAUNCH INFO 2022-12-17 13:39:38,457 Pod failed
LAUNCH ERROR 2022-12-17 13:39:38,457 Container failed !!!
Container rank 0 status failed
cmd ['/home/navy/miniconda3/envs/new/bin/python3', '-u', '/home/navy/PaddleSpeech/paddlespeech/s2t/exps/u2/bin/train.py', '--ngpu', '1', '--seed', '0', '--config', 'conf/transformer.yaml', '--output', 'exp/transformer', '--profiler-options', '', '--benchmark-batch-size', '0', '--benchmark-max-step', '0']
code -8
log log/workerlog.0
env (trimmed to the diagnostically relevant entries; desktop-session variables omitted): {'LC_ALL': 'C', 'PYTHONIOENCODING': 'UTF-8', 'LD_LIBRARY_PATH': '/usr/local/cuda-10.2/lib64:/usr/local/cuda-10.2/lib64::/usr/local/lib/:/home/navy/PaddleSpeech/tools/liblbfgs-1.10/lib/.libs', 'FLAGS_allocator_strategy': 'naive_best_fit', 'CUDA_VISIBLE_DEVICES': '0', 'PYTHONPATH': '/home/navy/PaddleSpeech:', 'CONDA_DEFAULT_ENV': 'new', 'OMP_NUM_THREADS': '1', 'POD_NAME': 'pvzbdt', 'PADDLE_MASTER': '127.0.1.1:47147', 'PADDLE_GLOBAL_SIZE': '1', 'PADDLE_LOCAL_SIZE': '1', 'PADDLE_GLOBAL_RANK': '0', 'PADDLE_LOCAL_RANK': '0', 'PADDLE_NNODES': '1', 'PADDLE_TRAINER_ENDPOINTS': '127.0.1.1:47148', 'PADDLE_CURRENT_ENDPOINT': '127.0.1.1:47148', 'PADDLE_TRAINER_ID': '0', 'PADDLE_TRAINERS_NUM': '1', 'PADDLE_RANK_IN_NODE': '0', 'FLAGS_selected_gpus': '0', …}

zxcd commented 1 year ago

This looks like a framework-level problem. Please file the issue against the framework repo instead: https://github.com/PaddlePaddle/Paddle/issues
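(Before refiling: the environment described at the top has both `paddlepaddle` 2.3.1 and `paddlepaddle-gpu` 2.4.0rc0 installed, and mixing the CPU and GPU wheels in one env is a common cause of native crashes like this. A quick stdlib-only check — the helper name is mine, not a Paddle tool — might look like:)

```python
import importlib.metadata as md

def paddle_wheels_installed():
    """Report which paddle wheels are present in the active env.
    Having both the CPU wheel ('paddlepaddle') and the GPU wheel
    ('paddlepaddle-gpu') at once can mix incompatible binaries."""
    found = {}
    for dist in ("paddlepaddle", "paddlepaddle-gpu"):
        try:
            found[dist] = md.version(dist)
        except md.PackageNotFoundError:
            pass  # wheel not installed in this environment
    return found

if __name__ == "__main__":
    wheels = paddle_wheels_installed()
    if len(wheels) > 1:
        print("conflicting paddle installs:", wheels)
```

If both show up, uninstalling both and reinstalling only `paddlepaddle-gpu` (matching the local CUDA 10.2 / cuDNN 7.6.5 build) is worth trying before escalating.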

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] commented 1 year ago

This issue is closed. Please re-open if needed.