Exported 'icefall-asr-multi-zh-hans-zipformer-large' to ONNX successfully, but deployment failed

Hello there. I want to export the 'pretrained.pt' checkpoint from https://huggingface.co/yuekai/icefall-asr-multi-zh-hans-zipformer-large/tree/main to ONNX.

So I ran the following command from the directory /home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR:

./zipformer/export-onnx-streaming.py --tokens data/lang_bpe_2000/tokens.txt --use-averaged-model 0 --epoch 9999 --avg 1 --exp-dir zipformer/exp20241202 --num-encoder-layers "2,2,4,5,4,2" --downsampling-factor "1,2,4,8,4,2" --feedforward-dim "768,1024,1536,2048,1536,768" --num-heads "4,4,4,8,4,4" --encoder-dim "256,384,512,768,512,256" --query-head-dim 32 --pos-head-dim 4 --pos-dim 48 --encoder-unmasked-dim "192,192,256,320,256,192" --cnn-module-kernel "31,31,15,15,15,31" --decoder-dim 512 --joiner-dim 512 --causal True --chunk-size 16 --left-context-frames 128
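As a sanity check on these arguments, the sketch below derives the streaming values that the export log reports (decode_chunk_len, T, left_context_len, the conv cache lengths, and the number of initial states). It is only a sketch: the factor of 2 on the chunk size, the 13-frame padding, and the count of 6 cached tensors per layer are assumptions chosen to reproduce the numbers in the log, not values taken from the export script itself.

```python
# Sketch: derive the streaming metadata from the export arguments.
# ASSUMED_* constants are assumptions picked to match the pasted log,
# not values read from export-onnx-streaming.py.


def parse_ints(csv: str) -> list:
    """Parse a comma-separated argument such as "2,2,4,5,4,2"."""
    return [int(v) for v in csv.split(",")]


num_encoder_layers = parse_ints("2,2,4,5,4,2")
downsampling_factor = parse_ints("1,2,4,8,4,2")
cnn_module_kernel = parse_ints("31,31,15,15,15,31")
chunk_size = 16
left_context_frames = 128

ASSUMED_CHUNK_SCALE = 2       # assumption: decode_chunk_len = 2 * chunk_size
ASSUMED_PAD_LENGTH = 13       # assumption: T = decode_chunk_len + 13
ASSUMED_STATES_PER_LAYER = 6  # cached_key, cached_nonlin_attn, cached_val1,
                              # cached_val2, cached_conv1, cached_conv2
EXTRA_STATES = 2              # embed_states and processed_lens

decode_chunk_len = ASSUMED_CHUNK_SCALE * chunk_size
T = decode_chunk_len + ASSUMED_PAD_LENGTH

# Each encoder stack sees the left context at its own downsampled frame rate.
left_context_len = [left_context_frames // ds for ds in downsampling_factor]

# Length of the cached_conv1/cached_conv2 tensors per stack (last dimension).
conv_cache_len = [k // 2 for k in cnn_module_kernel]

num_layers = sum(num_encoder_layers)
num_init_states = num_layers * ASSUMED_STATES_PER_LAYER + EXTRA_STATES

print("decode_chunk_len:", decode_chunk_len)
print("T:", T)
print("left_context_len:", ",".join(map(str, left_context_len)))
print("conv_cache_len:", ",".join(map(str, conv_cache_len)))
print("num_layers:", num_layers, "num_init_states:", num_init_states)
```

If the values printed here differ from the `meta_data` line or from `len(init_state)` in an export log, the arguments passed to the script probably do not match the checkpoint configuration.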

The export log follows. I pasted only part of it because of the word limit; please see the attachment for the complete log.

./zipformer/export-onnx-streaming.py --tokens data/lang_bpe_2000/tokens.txt --use-averaged-model 0 --epoch 9999 --avg 1 --exp-dir zipformer/exp20241202 --num-encoder-layers "2,2,4,5,4,2" --downsampling-factor "1,2,4,8,4,2" --feedforward-dim "768,1024,1536,2048,1536,768" --num-heads "4,4,4,8,4,4" --encoder-dim "256,384,512,768,512,256" --query-head-dim 32 --pos-head-dim 4 --pos-dim 48 --encoder-unmasked-dim "192,192,256,320,256,192" --cnn-module-kernel "31,31,15,15,15,31" --decoder-dim 512 --joiner-dim 512 --causal True --chunk-size 16 --left-context-frames 128 2024-12-03 13:57:02,853 INFO [export-onnx-streaming.py:596] device: cuda:0 2024-12-03 13:57:02,861 INFO [export-onnx-streaming.py:602] { "attention_decoder_attention_dim": 512, "attention_decoder_dim": 512, "attention_decoder_feedforward_dim": 2048, "attention_decoder_num_heads": 8, "attention_decoder_num_layers": 6, "avg": 1, "batch_idx_train": 0, "best_train_epoch": -1, "best_train_loss": Infinity, "best_valid_epoch": -1, "best_valid_loss": Infinity, "blank_id": 0, "causal": true, "chunk_size": "16", "cnn_module_kernel": "31,31,15,15,15,31", "context_size": 2, "decoder_dim": 512, "downsampling_factor": "1,2,4,8,4,2", "encoder_dim": "256,384,512,768,512,256", "encoder_unmasked_dim": "192,192,256,320,256,192", "env_info": { "IP address": "172.17.0.4", "hostname": "4b769e7f3c00", "icefall-git-branch": "master", "icefall-git-date": "Fri Nov 1 22:49:19 2024", "icefall-git-sha1": "57451b03-dirty", "icefall-path": "/tmp/icefall", "k2-build-type": "Release", "k2-git-date": "Tue Oct 29 09:02:19 2024", "k2-git-sha1": "75e2ed6b2fd87c22b7f3f34bad48a69984bb8755", "k2-path": "/home/jml/anaconda/envs/zipformer/lib/python3.9/site-packages/k2/__init__.py", "k2-version": "1.24.4", "k2-with-cuda": true, "lhotse-path": "/home/jml/anaconda/envs/zipformer/lib/python3.9/site-packages/lhotse/__init__.py", "lhotse-version": 
"1.27.0", "python-version": "3.9", "torch-cuda-available": true, "torch-cuda-version": "11.8", "torch-version": "2.1.0+cu118" }, "epoch": 9999, "exp_dir": "zipformer/exp20241202", "feature_dim": 80, "feedforward_dim": "768,1024,1536,2048,1536,768", "fp16": false, "ignore_id": -1, "iter": 0, "joiner_dim": 512, "label_smoothing": 0.1, "left_context_frames": "128", "log_interval": 50, "num_encoder_layers": "2,2,4,5,4,2", "num_heads": "4,4,4,8,4,4", "pos_dim": 48, "pos_head_dim": "4", "query_head_dim": "32", "reset_interval": 200, "subsampling_factor": 4, "tokens": "data/lang_bpe_2000/tokens.txt", "use_attention_decoder": false, "use_averaged_model": false, "use_cr_ctc": false, "use_ctc": false, "use_transducer": true, "valid_interval": 3000, "value_head_dim": "12", "vocab_size": 2000, "warm_step": 2000 } 2024-12-03 13:57:02,862 INFO [export-onnx-streaming.py:604] About to create model 2024-12-03 13:57:04,298 INFO [checkpoint.py:112] Loading checkpoint from zipformer/exp20241202/epoch-9999.pt 2024-12-03 13:57:05,153 INFO [export-onnx-streaming.py:714] encoder parameters: 153819802 2024-12-03 13:57:05,154 INFO [export-onnx-streaming.py:715] decoder parameters: 1290752 2024-12-03 13:57:05,154 INFO [export-onnx-streaming.py:716] joiner parameters: 1026000 2024-12-03 13:57:05,154 INFO [export-onnx-streaming.py:717] total parameters: 156136554 2024-12-03 13:57:05,154 INFO [export-onnx-streaming.py:730] Exporting encoder 2024-12-03 13:57:05,155 INFO [export-onnx-streaming.py:358] num_encoders: 6 2024-12-03 13:57:05,155 INFO [export-onnx-streaming.py:359] len(init_state): 116 2024-12-03 13:57:05,155 INFO [export-onnx-streaming.py:444] meta_data: {'model_type': 'zipformer2', 'version': '1', 'model_author': 'k2-fsa', 'comment': 'streaming zipformer2', 'decode_chunk_len': '32', 'T': '45', 'num_encoder_layers': '2,2,4,5,4,2', 'encoder_dims': '256,384,512,768,512,256', 'cnn_module_kernels': '31,31,15,15,15,31', 'left_context_len': '128,64,32,16,32,64', 'query_head_dims': 
'32,32,32,32,32,32', 'value_head_dims': '12,12,12,12,12,12', 'num_heads': '4,4,4,8,4,4'} 2024-12-03 13:57:05,155 INFO [export-onnx-streaming.py:372] cached_key_0.shape: torch.Size([128, 1, 128]) 2024-12-03 13:57:05,155 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_0.shape: torch.Size([1, 1, 128, 192]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:388] cached_val1_0.shape: torch.Size([128, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:396] cached_val2_0.shape: torch.Size([128, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:404] cached_conv1_0.shape: torch.Size([1, 256, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:412] cached_conv2_0.shape: torch.Size([1, 256, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:372] cached_key_1.shape: torch.Size([128, 1, 128]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_1.shape: torch.Size([1, 1, 128, 192]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:388] cached_val1_1.shape: torch.Size([128, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:396] cached_val2_1.shape: torch.Size([128, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:404] cached_conv1_1.shape: torch.Size([1, 256, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:412] cached_conv2_1.shape: torch.Size([1, 256, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:372] cached_key_2.shape: torch.Size([64, 1, 128]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_2.shape: torch.Size([1, 1, 64, 288]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:388] cached_val1_2.shape: torch.Size([64, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:396] cached_val2_2.shape: torch.Size([64, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:404] cached_conv1_2.shape: torch.Size([1, 384, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:412] 
cached_conv2_2.shape: torch.Size([1, 384, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:372] cached_key_3.shape: torch.Size([64, 1, 128]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_3.shape: torch.Size([1, 1, 64, 288]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:388] cached_val1_3.shape: torch.Size([64, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:396] cached_val2_3.shape: torch.Size([64, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:404] cached_conv1_3.shape: torch.Size([1, 384, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:412] cached_conv2_3.shape: torch.Size([1, 384, 15]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:372] cached_key_4.shape: torch.Size([32, 1, 128]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_4.shape: torch.Size([1, 1, 32, 384]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:388] cached_val1_4.shape: torch.Size([32, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:396] cached_val2_4.shape: torch.Size([32, 1, 48]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:404] cached_conv1_4.shape: torch.Size([1, 512, 7]) 2024-12-03 13:57:05,156 INFO [export-onnx-streaming.py:412] cached_conv2_4.shape: torch.Size([1, 512, 7]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:372] cached_key_5.shape: torch.Size([32, 1, 128]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_5.shape: torch.Size([1, 1, 32, 384]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:388] cached_val1_5.shape: torch.Size([32, 1, 48]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:396] cached_val2_5.shape: torch.Size([32, 1, 48]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:404] cached_conv1_5.shape: torch.Size([1, 512, 7]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:412] cached_conv2_5.shape: torch.Size([1, 512, 7]) 2024-12-03 
13:57:05,157 INFO [export-onnx-streaming.py:372] cached_key_6.shape: torch.Size([32, 1, 128]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:380] cached_nonlin_attn_6.shape: torch.Size([1, 1, 32, 384]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:388] cached_val1_6.shape: torch.Size([32, 1, 48]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:396] cached_val2_6.shape: torch.Size([32, 1, 48]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:404] cached_conv1_6.shape: torch.Size([1, 512, 7]) 2024-12-03 13:57:05,157 INFO [export-onnx-streaming.py:412] cached_conv2_6.shape: torch.Size([1, 512, 7]) 2024-12-03 13:57:05,159 INFO [export-onnx-streaming.py:412] cached_conv2_18.shape: torch.Size([1, 256, 15]) 2024-12-03 13:57:05,159 INFO [export-onnx-streaming.py:452] embed_states.shape: torch.Size([1, 128, 3, 19]) 2024-12-03 13:57:05,159 INFO [export-onnx-streaming.py:461] processed_lens.shape: torch.Size([1]) 2024-12-03 13:57:05,160 INFO [export-onnx-streaming.py:467] {'cached_key_0': {1: 'N'}, 'cached_nonlin_attn_0': {1: 'N'}, 'cached_val1_0': {1: 'N'}, 'cached_val2_0': {1: 'N'}, 'cached_conv1_0': {0: 'N'}, 'cached_conv2_0': {0: 'N'}, 'cached_key_1': {1: 'N'}, 'cached_nonlin_attn_1': {1: 'N'}, 'cached_val1_1': {1: 'N'}, 'cached_val2_1': {1: 'N'}, 'cached_conv1_1': {0: 'N'}, 'cached_conv2_1': {0: 'N'}, 'cached_key_2': {1: 'N'}, 'cached_nonlin_attn_2': {1: 'N'}, 'cached_val1_2': {1: 'N'}, 'cached_val2_2': {1: 'N'}, 'cached_conv1_2': {0: 'N'}, 'cached_conv2_2': {0: 'N'}, 'cached_key_3': {1: 'N'}, 'cached_nonlin_attn_3': {1: 'N'}, 'cached_val1_3': {1: 'N'}, 'cached_val2_3': {1: 'N'}, 'cached_conv1_3': {0: 'N'}, 'cached_conv2_3': {0: 'N'}, 'cached_key_4': {1: 'N'}, 'cached_nonlin_attn_4': {1: 'N'}, 'cached_val1_4': {1: 'N'}, 'cached_val2_4': {1: 'N'}, 'cached_conv1_4': {0: 'N'}, 'cached_conv2_4': {0: 'N'}, 'cached_key_5': {1: 'N'}, 'cached_nonlin_attn_5': {1: 'N'}, 'cached_val1_5': {1: 'N'}, 'cached_val2_5': {1: 'N'}, 'cached_conv1_5': {0: 'N'}, 
'cached_conv2_5': {0: 'N'}, 'cached_key_6': {1: 'N'}, 'cached_nonlin_attn_6': {1: 'N'}, 'cached_val1_6': {1: 'N'}, 'cached_val2_6': {1: 'N'}, 'cached_conv1_6': {0: 'N'}, 'cached_conv2_6': {0: 'N'}, 'cached_key_7': {1: 'N'}, 'cached_nonlin_attn_7': {1: 'N'}, 'cached_val1_7': {1: 'N'}, 'cached_val2_7': {1: 'N'}, 'cached_conv1_7': {0: 'N'}, 'cached_conv2_7': {0: 'N'}, 'cached_key_8': {1: 'N'}, 'cached_nonlin_attn_8': {1: 'N'}, 'cached_val1_8': {1: 'N'}, 'cached_val2_8': {1: 'N'}, 'cached_conv1_8': {0: 'N'}, 'cached_conv2_8': {0: 'N'}, 'cached_key_9': {1: 'N'}, 'cached_nonlin_attn_9': {1: 'N'}, 'cached_val1_9': {1: 'N'}, 'cached_val2_9': {1: 'N'}, 'cached_conv1_9': {0: 'N'}, 'cached_conv2_9': {0: 'N'}, 'cached_key_10': {1: 'N'}, 'cached_nonlin_attn_10': {1: 'N'}, 'cached_val1_10': {1: 'N'}, 'cached_val2_10': {1: 'N'}, 'cached_conv1_10': {0: 'N'}, 'cached_conv2_10': {0: 'N'}, 'cached_key_11': {1: 'N'}, 'cached_nonlin_attn_11': {1: 'N'}, 'cached_val1_11': {1: 'N'}, 'cached_val2_11': {1: 'N'}, 'cached_conv1_11': {0: 'N'}, 'cached_conv2_11': {0: 'N'}, 'cached_key_12': {1: 'N'}, 'cached_nonlin_attn_12': {1: 'N'}, 'cached_val1_12': {1: 'N'}, 'cached_val2_12': {1: 'N'}, 'cached_conv1_12': {0: 'N'}, 'cached_conv2_12': {0: 'N'}, 'cached_key_13': {1: 'N'}, 'cached_nonlin_attn_13': {1: 'N'}, 'cached_val1_13': {1: 'N'}, 'cached_val2_13': {1: 'N'}, 'cached_conv1_13': {0: 'N'}, 'cached_conv2_13': {0: 'N'}, 'cached_key_14': {1: 'N'}, 'cached_nonlin_attn_14': {1: 'N'}, 'cached_val1_14': {1: 'N'}, 'cached_val2_14': {1: 'N'}, 'cached_conv1_14': {0: 'N'}, 'cached_conv2_14': {0: 'N'}, 'cached_key_15': {1: 'N'}, 'cached_nonlin_attn_15': {1: 'N'}, 'cached_val1_15': {1: 'N'}, 'cached_val2_15': {1: 'N'}, 'cached_conv1_15': {0: 'N'}, 'cached_conv2_15': {0: 'N'}, 'cached_key_16': {1: 'N'}, 'cached_nonlin_attn_16': {1: 'N'}, 'cached_val1_16': {1: 'N'}, 'cached_val2_16': {1: 'N'}, 'cached_conv1_16': {0: 'N'}, 'cached_conv2_16': {0: 'N'}, 'cached_key_17': {1: 'N'}, 'cached_nonlin_attn_17': {1: 
'N'}, 'cached_val1_17': {1: 'N'}, 'cached_val2_17': {1: 'N'}, 'cached_conv1_17': {0: 'N'}, 'cached_conv2_17': {0: 'N'}, 'cached_key_18': {1: 'N'}, 'cached_nonlin_attn_18': {1: 'N'}, 'cached_val1_18': {1: 'N'}, 'cached_val2_18': {1: 'N'}, 'cached_conv1_18': {0: 'N'}, 'cached_conv2_18': {0: 'N'}, 'embed_states': {0: 'N'}, 'processed_lens': {0: 'N'}} 2024-12-03 13:57:05,160 INFO [export-onnx-streaming.py:468] {'new_cached_key_0': {1: 'N'}, 'new_cached_nonlin_attn_0': {1: 'N'}, 'new_cached_val1_0': {1: 'N'}, 'new_cached_val2_0': {1: 'N'}, 'new_cached_conv1_0': {0: 'N'}, 'new_cached_conv2_0': {0: 'N'}, 'new_cached_key_1': {1: 'N'}, 'new_cached_nonlin_attn_1': {1: 'N'}, 'new_cached_val1_1': {1: 'N'}, 'new_cached_val2_1': {1: 'N'}, 'new_cached_conv1_1': {0: 'N'}, 'new_cached_conv2_1': {0: 'N'}, 'new_cached_key_2': {1: 'N'}, 'new_cached_nonlin_attn_2': {1: 'N'}, 'new_cached_val1_2': {1: 'N'}, 'new_cached_val2_2': {1: 'N'}, 'new_cached_conv1_2': {0: 'N'}, 'new_cached_conv2_2': {0: 'N'}, 'new_cached_key_3': {1: 'N'}, 'new_cached_nonlin_attn_3': {1: 'N'}, 'new_cached_val1_3': {1: 'N'}, 'new_cached_val2_3': {1: 'N'}, 'new_cached_conv1_3': {0: 'N'}, 'new_cached_conv2_3': {0: 'N'}, 'new_cached_key_4': {1: 'N'}, 'new_cached_nonlin_attn_4': {1: 'N'}, 'new_cached_val1_4': {1: 'N'}, 'new_cached_val2_4': {1: 'N'}, 'new_cached_conv1_4': {0: 'N'}, 'new_cached_conv2_4': {0: 'N'}, 'new_cached_key_5': {1: 'N'}, 'new_cached_nonlin_attn_5': {1: 'N'}, 'new_cached_val1_5': {1: 'N'}, 'new_cached_val2_5': {1: 'N'}, 'new_cached_conv1_5': {0: 'N'}, 'new_cached_conv2_5': {0: 'N'}, 'new_cached_key_6': {1: 'N'}, 'new_cached_nonlin_attn_6': {1: 'N'}, 'new_cached_val1_6': {1: 'N'}, 'new_cached_val2_6': {1: 'N'}, 'new_cached_conv1_6': {0: 'N'}, 'new_cached_conv2_6': {0: 'N'}, 'new_cached_key_7': {1: 'N'}, 'new_cached_nonlin_attn_7': {1: 'N'}, 'new_cached_val1_7': {1: 'N'}, 'new_cached_val2_7': {1: 'N'}, 'new_cached_conv1_7': {0: 'N'}, 'new_cached_conv2_7': {0: 'N'}, 'new_cached_key_8': {1: 'N'}, 
'new_cached_nonlin_attn_8': {1: 'N'}, 2024-12-03 13:57:05,160 INFO [export-onnx-streaming.py:469] ['x', 'cached_key_0', 'cached_nonlin_attn_0', 'cached_val1_0', 'cached_val2_0', 'cached_conv1_0', 'cached_conv2_0', 'cached_key_1', 'cached_nonlin_attn_1', 'cached_val1_1', 'cached_val2_1', 'cached_conv1_1', 'cached_conv2_1', 'cached_key_2', 'cached_nonlin_attn_2', 'cached_val1_2', 'cached_val2_2', 'cached_conv1_2', 'cached_conv2_2', 'cached_key_3', 'cached_nonlin_attn_3', 'cached_val1_3', 'cached_val2_3', 'cached_conv1_3', 'cached_conv2_3', 'cached_key_4', 'cached_nonlin_attn_4', 'cached_val1_4', 'cached_val2_4', 'cached_conv1_4', 'cached_conv2_4', 'cached_key_5', 'cached_nonlin_attn_5', 'cached_val1_5', 'cached_val2_5', 'cached_conv1_5', 'cached_conv2_5', 'cached_key_6', 'cached_nonlin_attn_6', 'cached_val1_6', 'cached_val2_6', 'cached_conv1_6', 'cached_conv2_6', 'cached_key_7', 'cached_nonlin_attn_7', 'cached_val1_7', 'cached_val2_7', 'cached_conv1_7', 'cached_conv2_7', 'cached_key_8', 'cached_nonlin_attn_8', 'cached_val1_8', 'cached_val2_8', 'cached_conv1_8', 'cached_conv2_8', 'cached_key_9', 'cached_nonlin_attn_9', 'cached_val1_9', 'cached_val2_9', 'cached_conv1_9', 'cached_conv2_9', 'cached_key_10', 'cached_nonlin_attn_10', 'cached_val1_10', 'cached_val2_10', 'cached_conv1_10', 'cached_conv2_10', 'cached_key_11', 'cached_nonlin_attn_11', 'cached_val1_11', 'cached_val2_11', 'cached_conv1_11', 'cached_conv2_11', 'cached_key_12', 'cached_nonlin_attn_12', 'cached_val1_12', 'cached_val2_12', 'cached_conv1_12', 'cached_conv2_12', 'cached_key_13', 'cached_nonlin_attn_13', 'cached_val1_13', 'cached_val2_13', 'cached_conv1_13', 'cached_conv2_13', 'cached_key_14', 'cached_nonlin_attn_14', 'cached_val1_14', 'cached_val2_14', 'cached_conv1_14', 'cached_conv2_14', 'cached_key_15', 'cached_nonlin_attn_15', 'cached_val1_15', 'cached_val2_15', 'cached_conv1_15', 'cached_conv2_15', 'cached_key_16', 'cached_nonlin_attn_16', 'cached_val1_16', 'cached_val2_16', 'cached_conv1_16', 
'cached_conv2_16', 'cached_key_17', 'cached_nonlin_attn_17', 'cached_val1_17', 'cached_val2_17', 'cached_conv1_17', 'cached_conv2_17', 'cached_key_18', 'cached_nonlin_attn_18', 'cached_val1_18', 'cached_val2_18', 'cached_conv1_18', 'cached_conv2_18', 'embed_states', 'processed_lens'] 2024-12-03 13:57:05,160 INFO [export-onnx-streaming.py:470] ['encoder_out', 'new_cached_key_0', 'new_cached_nonlin_attn_0', 'new_cached_val1_0', 'new_cached_val2_0', 'new_cached_conv1_0', 'new_cached_conv2_0', 'new_cached_key_1', 'new_cached_nonlin_attn_1', 'new_cached_val1_1', 'new_cached_val2_1', 'new_cached_conv1_1', 'new_cached_conv2_1', 'new_cached_key_2', 'new_cached_nonlin_attn_2', 'new_cached_val1_2', 'new_cached_val2_2', 'new_cached_conv1_2', 'new_cached_conv2_2', 'new_cached_key_3', 'new_cached_nonlin_attn_3', 'new_cached_val1_3', 'new_cached_val2_3', 'new_cached_conv1_3', 'new_cached_conv2_3', 'new_cached_key_4', 'new_cached_nonlin_attn_4', 'new_cached_val1_4', 'new_cached_val2_4', 'new_cached_conv1_4', 'new_cached_conv2_4', 'new_cached_key_5', 'new_cached_nonlin_attn_5', 'new_cached_val1_5', 'new_cached_val2_5', 'new_cached_conv1_5', 'new_cached_conv2_5', 'new_cached_key_6', 'new_cached_nonlin_attn_6', 'new_cached_val1_6', 'new_cached_val2_6', 'new_cached_conv1_6', 'new_cached_conv2_6', 'new_cached_key_7', 'new_cached_nonlin_attn_7', 'new_cached_val1_7', 'new_cached_val2_7', 'new_cached_conv1_7', 'new_cached_conv2_7', 'new_cached_key_8', 'new_cached_nonlin_attn_8', 'new_cached_val1_8', 'new_cached_val2_8', 'new_cached_conv1_8', 'new_cached_conv2_8', 'new_cached_key_9', 'new_cached_nonlin_attn_9', 'new_cached_val1_9', 'new_cached_val2_9', 'new_cached_conv1_9', 'new_cached_conv2_9', 'new_cached_key_10', 'new_cached_nonlin_attn_10', 'new_cached_val1_10', 'new_cached_val2_10', 'new_cached_conv1_10', 'new_cached_conv2_10', 'new_cached_key_11', 'new_cached_nonlin_attn_11', 'new_cached_val1_11', 'new_cached_val2_11', 'new_cached_conv1_11', 'new_cached_conv2_11' 
/home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/./zipformer/export-onnx-streaming.py:216: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect. x_lens = torch.tensor([T] * N, device=x.device) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:1510: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect. zero = torch.tensor(0.0, dtype=x.dtype, device=x.device) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/subsampling.py:169: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert cached_left_pad.size(2) == padding[0], ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:1436: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect. zero = torch.tensor(0.0, dtype=x.dtype, device=x.device) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:484: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. 
This means that the trace might not generalize to other inputs! assert x.shape[self.channel_dim] == self.num_channels /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/subsampling.py:385: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert x.size(1) == x_lens.max().item(), (x.shape, x_lens.max()) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/subsampling.py:385: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert x.size(1) == x_lens.max().item(), (x.shape, x_lens.max()) /home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/./zipformer/export-onnx-streaming.py:225: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert x.size(1) == self.chunk_size, (x.size(1), self.chunk_size) 2024-12-03 13:57:05,315 INFO [export-onnx-streaming.py:243] len_encoder_states=114 /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:1698: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! 
if num_channels <= x.shape[-1]: /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1446: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if self.pe.size(0) >= T * 2 - 1: /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1791: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert p.shape[-1] == num_heads * pos_head_dim /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1794: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert cached_key.shape[0] == left_context_len, ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1853: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert attn_scores.shape == ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1861: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! 
assert key_padding_mask.shape == (batch_size, k_len), key_padding_mask.shape /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:2194: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert attn_weights.shape == ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:2205: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert cached_x.shape[2] == left_context_len, ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1982: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert attn_weights.shape == (num_heads, batch_size, seq_len, seq_len2) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1987: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert cached_val.shape[0] == left_context_len, ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:735: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. 
This means that the trace might not generalize to other inputs! assert cache.shape[-1] == left_pad, (cache.shape[-1], left_pad) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:741: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert x_causal.shape == (batch_size, num_channels, seq_len) /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/scaling.py:702: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! if chunk_size < self.kernel_size: /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1350: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! assert ( /home/jml/audio_icefall/icefall/egs/librispeech/ASR/zipformer/zipformer.py:1359: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs! 
assert src.shape[0] == d_seq_len * ds, (src.shape, d_seq_len, ds) 2024-12-03 13:58:17,599 INFO [export-onnx-streaming.py:738] Exported encoder to zipformer/exp20241202/encoder-epoch-9999-avg-1-chunk-16-left-128.onnx 2024-12-03 13:58:17,600 INFO [export-onnx-streaming.py:740] Exporting decoder 2024-12-03 13:58:17,725 INFO [export-onnx-streaming.py:747] Exported decoder to zipformer/exp20241202/decoder-epoch-9999-avg-1-chunk-16-left-128.onnx 2024-12-03 13:58:17,725 INFO [export-onnx-streaming.py:749] Exporting joiner 2024-12-03 13:58:17,725 INFO [export-onnx-streaming.py:556] joiner dim: 512 2024-12-03 13:58:17,768 INFO [export-onnx-streaming.py:756] Exported joiner to zipformer/exp20241202/joiner-epoch-9999-avg-1-chunk-16-left-128.onnx 2024-12-03 13:58:17,768 INFO [export-onnx-streaming.py:781] Generate int8 quantization models 2024-12-03 13:58:22,011 WARNING [quantize.py:664] Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md 2024-12-03 13:58:23,247 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_output_0" not specified 2024-12-03 13:58:23,264 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Slice_3_output_0" not specified 2024-12-03 13:58:23,268 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul] 2024-12-03 13:58:23,275 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/encoder_pos/Unsqueeze_2_output_0" not specified 2024-12-03 13:58:23,278 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_1]

2024-12-03 13:58:24,020 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_72_output_0" not specified 2024-12-03 13:58:24,026 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_17] 2024-12-03 13:58:24,033 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_59_output_0" not specified 2024-12-03 13:58:24,044 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_74_output_0" not specified 2024-12-03 13:58:24,048 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_18] 2024-12-03 13:58:24,055 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_51_output_0" not specified 2024-12-03 13:58:24,065 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_76_output_0" not specified 2024-12-03 13:58:24,167 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_17/Sub_2_output_0" not specified 2024-12-03 13:58:24,180 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_81_output_0" not specified 2024-12-03 13:58:24,193 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_3/Sub_2_output_0" not specified 2024-12-03 13:58:24,205 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_3/Add_output_0" not specified 2024-12-03 13:58:24,209 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_19] 2024-12-03 13:58:24,216 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_53_output_0" not specified 2024-12-03 13:58:24,227 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_84_output_0" not specified 2024-12-03 13:58:24,239 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_19/Sub_2_output_0" not specified 2024-12-03 13:58:24,250 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_89_output_0" not specified 2024-12-03 13:58:24,263 INFO [onnx_quantizer.py:513] Quantization parameters 
for tensor:"/feed_forward3/out_proj_3/Sub_2_output_0" not specified 2024-12-03 13:58:24,276 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/downsample_1/ReduceSum_output_0" not specified 2024-12-03 13:58:24,281 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_20] 2024-12-03 13:58:24,288 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/encoder_pos_2/Unsqueeze_2_output_0" not specified 2024-12-03 13:58:24,292 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_21] 2024-12-03 13:58:24,307 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward1/out_proj_4/Sub_2_output_0" not specified 2024-12-03 13:58:24,320 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_94_output_0" not specified 2024-12-03 13:58:24,327 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_22] 2024-12-03 13:58:24,335 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_76_output_0" not specified 2024-12-03 13:58:24,347 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_96_output_0" not specified 2024-12-03 13:58:24,351 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_23] 2024-12-03 13:58:24,359 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_64_output_0" not specified 2024-12-03 13:58:24,369 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_98_output_0" not specified 2024-12-03 13:58:24,382 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_22/Sub_2_output_0" not specified 2024-12-03 13:58:24,394 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_103_output_0" not specified 2024-12-03 13:58:24,410 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_4/Sub_2_output_0" not specified 2024-12-03 13:58:24,425 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_4/Add_output_0" not 
specified 2024-12-03 13:58:24,429 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_24] 2024-12-03 13:58:24,437 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_66_output_0" not specified 2024-12-03 13:58:24,447 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_106_output_0" not specified 2024-12-03 13:58:24,461 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_24/Sub_2_output_0" not specified 2024-12-03 13:58:24,473 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_111_output_0" not specified 2024-12-03 13:58:24,492 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_4/Sub_2_output_0" not specified 2024-12-03 13:58:24,514 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_4/Add_output_0" not specified 2024-12-03 13:58:24,519 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_25] 2024-12-03 13:58:24,524 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_26] 2024-12-03 13:58:24,630 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward1/out_proj_5/Sub_2_output_0" not specified 2024-12-03 13:58:24,646 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_116_output_0" not specified 2024-12-03 13:58:24,653 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_27] 2024-12-03 13:58:24,661 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_93_output_0" not specified 2024-12-03 13:58:24,673 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_118_output_0" not specified 2024-12-03 13:58:24,678 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_28] 2024-12-03 13:58:24,686 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_77_output_0" not specified 2024-12-03 13:58:24,697 INFO [onnx_quantizer.py:513] Quantization parameters for 
tensor:"/Add_120_output_0" not specified 2024-12-03 13:58:24,710 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_27/Sub_2_output_0" not specified 2024-12-03 13:58:24,722 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_125_output_0" not specified 2024-12-03 13:58:24,737 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_5/Sub_2_output_0" not specified 2024-12-03 13:58:24,752 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_5/Add_output_0" not specified 2024-12-03 13:58:24,757 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_29] 2024-12-03 13:58:24,764 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_79_output_0" not specified 2024-12-03 13:58:24,776 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_128_output_0" not specified 2024-12-03 13:58:24,789 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_29/Sub_2_output_0" not specified 2024-12-03 13:58:24,802 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_133_output_0" not specified 2024-12-03 13:58:24,821 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_5/Sub_2_output_0" not specified 2024-12-03 13:58:24,844 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_5/Add_output_0" not specified 2024-12-03 13:58:24,850 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_30] 2024-12-03 13:58:24,856 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_31] 2024-12-03 13:58:24,875 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward1/out_proj_6/Sub_2_output_0" not specified 2024-12-03 13:58:24,888 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_138_output_0" not specified 2024-12-03 13:58:24,896 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_32] 
2024-12-03 13:58:24,904 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_110_output_0" not specified 2024-12-03 13:58:24,916 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_140_output_0" not specified 2024-12-03 13:58:24,921 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_33] 2024-12-03 13:58:24,928 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_90_output_0" not specified 2024-12-03 13:58:24,939 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_142_output_0" not specified 2024-12-03 13:58:24,953 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_32/Sub_2_output_0" not specified 2024-12-03 13:58:24,966 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_147_output_0" not specified 2024-12-03 13:58:24,983 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_6/Sub_2_output_0" not specified 2024-12-03 13:58:24,998 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_6/Add_output_0" not specified 2024-12-03 13:58:25,003 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_34] 2024-12-03 13:58:25,011 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_92_output_0" not specified 2024-12-03 13:58:25,022 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_150_output_0" not specified 2024-12-03 13:58:25,037 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_34/Sub_2_output_0" not specified 2024-12-03 13:58:25,049 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_155_output_0" not specified 2024-12-03 13:58:25,070 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_6/Sub_2_output_0" not specified 2024-12-03 13:58:25,093 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_6/Add_output_0" not specified 2024-12-03 
13:58:25,099 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_35] 2024-12-03 13:58:25,105 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_36] 2024-12-03 13:58:25,124 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward1/out_proj_7/Sub_2_output_0" not specified 2024-12-03 13:58:25,139 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_160_output_0" not specified 2024-12-03 13:58:25,146 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_37] 2024-12-03 13:58:25,154 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_127_output_0" not specified 2024-12-03 13:58:25,167 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_162_output_0" not specified 2024-12-03 13:58:25,172 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_38] 2024-12-03 13:58:25,180 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_103_output_0" not specified 2024-12-03 13:58:25,192 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_164_output_0" not specified 2024-12-03 13:58:25,206 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_37/Sub_2_output_0" not specified 2024-12-03 13:58:25,219 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_169_output_0" not specified 2024-12-03 13:58:25,237 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_7/Sub_2_output_0" not specified 2024-12-03 13:58:25,252 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_7/Add_output_0" not specified 2024-12-03 13:58:25,258 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_39] 2024-12-03 13:58:25,357 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_105_output_0" not specified 2024-12-03 13:58:25,370 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_172_output_0" not 
specified 2024-12-03 13:58:25,385 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_39/Sub_2_output_0" not specified 2024-12-03 13:58:25,398 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_177_output_0" not specified 2024-12-03 13:58:25,419 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_7/Sub_2_output_0" not specified 2024-12-03 13:58:25,443 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/downsample_2/ReduceSum_output_0" not specified 2024-12-03 13:58:25,835 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_204_output_0" not specified 2024-12-03 13:58:25,848 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_47] 2024-12-03 13:58:25,857 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_161_output_0" not specified 2024-12-03 13:58:25,872 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_206_output_0" not specified 2024-12-03 13:58:25,878 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_48] 2024-12-03 13:58:25,886 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_129_output_0" not specified 2024-12-03 13:58:25,898 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_208_output_0" not specified 2024-12-03 13:58:25,921 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_47/Sub_2_output_0" not specified 2024-12-03 13:58:25,937 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_213_output_0" not specified 2024-12-03 13:58:25,962 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_9/Sub_2_output_0" not specified 2024-12-03 13:58:25,989 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_9/Add_output_0" not specified 2024-12-03 13:58:25,995 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_49] 2024-12-03 13:58:26,003 
INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_131_output_0" not specified 2024-12-03 13:58:26,016 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_216_output_0" not specified 2024-12-03 13:58:26,039 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_49/Sub_2_output_0" not specified 2024-12-03 13:58:26,054 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_221_output_0" not specified 2024-12-03 13:58:26,084 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_9/Sub_2_output_0" not specified 2024-12-03 13:58:26,116 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_9/Add_output_0" not specified 2024-12-03 13:58:26,125 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_50] 2024-12-03 13:58:26,132 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_51] 2024-12-03 13:58:26,155 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward1/out_proj_10/Sub_2_output_0" not specified 2024-12-03 13:58:26,173 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_226_output_0" not specified 2024-12-03 13:58:26,187 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_52] 2024-12-03 13:58:26,196 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Mul_178_output_0" not specified 2024-12-03 13:58:26,211 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_228_output_0" not specified 2024-12-03 13:58:26,218 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_53] 2024-12-03 13:58:26,226 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_142_output_0" not specified 2024-12-03 13:58:26,239 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_230_output_0" not specified 2024-12-03 13:58:26,262 INFO [onnx_quantizer.py:513] Quantization parameters for 
tensor:"/out_proj_52/Sub_2_output_0" not specified 2024-12-03 13:58:26,278 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_235_output_0" not specified 2024-12-03 13:58:26,303 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_10/Sub_2_output_0" not specified 2024-12-03 13:58:26,330 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_10/Add_output_0" not specified 2024-12-03 13:58:26,337 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_54] 2024-12-03 13:58:26,345 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_144_output_0" not specified 2024-12-03 13:58:26,358 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_238_output_0" not specified 2024-12-03 13:58:26,382 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_54/Sub_2_output_0" not specified 2024-12-03 13:58:26,398 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_243_output_0" not specified 2024-12-03 13:58:26,428 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_10/Sub_2_output_0" not specified export log.txt

2024-12-03 13:58:27,071 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_279_output_0" not specified 2024-12-03 13:58:27,097 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward2/out_proj_12/Sub_2_output_0" not specified 2024-12-03 13:58:27,125 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/bypass_mid_12/Add_output_0" not specified 2024-12-03 13:58:27,133 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_64] 2024-12-03 13:58:27,141 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_170_output_0" not specified 2024-12-03 13:58:27,155 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_282_output_0" not specified 2024-12-03 13:58:27,180 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_64/Sub_2_output_0" not specified 2024-12-03 13:58:27,199 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_287_output_0" not specified 2024-12-03 13:58:27,231 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_12/Sub_2_output_0" not specified 2024-12-03 13:58:27,262 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/downsample_3/ReduceSum_output_0" not specified 2024-12-03 13:58:27,270 INFO [matmul.py:30] Ignore MatMul due to non constant B: /[/MatMul_65] 2024-12-03 13:58:27,278 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/encoder_pos_4/Unsqueeze_2_output_0" not specified 2024-12-03 13:58:28,916 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Reshape_248_output_0" not specified 2024-12-03 13:58:28,932 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Add_414_output_0" not specified 2024-12-03 13:58:28,948 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/out_proj_94/Sub_2_output_0" not specified 2024-12-03 13:58:28,963 INFO [onnx_quantizer.py:513] Quantization parameters for 
tensor:"/Add_419_output_0" not specified 2024-12-03 13:58:28,980 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/feed_forward3/out_proj_18/Sub_2_output_0" not specified 2024-12-03 13:58:28,997 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Transpose_268_output_0" not specified 2024-12-03 13:58:29,775 WARNING [quantize.py:664] Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md 2024-12-03 13:58:29,790 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Squeeze_output_0" not specified 2024-12-03 13:58:29,815 WARNING [quantize.py:664] Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md 2024-12-03 13:58:29,829 INFO [onnx_quantizer.py:513] Quantization parameters for tensor:"/Tanh_output_0" not specified

The export succeeded. Six ONNX models were created in /home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/zipformer/exp20241202, such as encoder-epoch-9999-avg-1-chunk-16-left-128.onnx.

I then changed the default encoder, decoder, and joiner paths in /home/jml/sherpa-onnx/python-api-examples/streaming_server.py to point to the new ONNX files, for example: parser.add_argument( "--encoder", default='/home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/zipformer/exp20241202/encoder-epoch-9999-avg-1-chunk-16-left-128.int8.onnx', type=str, help="Path to the transducer encoder model", ) and likewise for the decoder and joiner.

The server launched successfully, but the client side receives no output at all, even though the microphone works. The server log is:

CUDA_VISIBLE_DEVICES=1 python3 ./python-api-examples/streaming_server.py
***puncmodel loaded****
2024-12-03 14:00:56,809 INFO [streaming_server.py:921] {'encoder': '/home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/zipformer/exp20241202/encoder-epoch-9999-avg-1-chunk-16-left-128.int8.onnx', 'decoder': '/home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/zipformer/exp20241202/decoder-epoch-9999-avg-1-chunk-16-left-128.int8.onnx', 'joiner': '/home/jml/audio_icefall/icefall/egs/multi_zh-hans/ASR/zipformer/exp20241202/joiner-epoch-9999-avg-1-chunk-16-left-128.int8.onnx', 'zipformer2_ctc': None, 'wenet_ctc': None, 'paraformer_encoder': None, 'paraformer_decoder': None, 'tokens': '/home/jml/sherpa-onnx/sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/tokens.txt', 'sample_rate': 16000, 'feat_dim': 80, 'provider': 'cuda', 'decoding_method': 'greedy_search', 'num_active_paths': 4, 'use_endpoint': 0, 'rule1_min_trailing_silence': 2.4, 'rule2_min_trailing_silence': 1.2, 'rule3_min_utterance_length': 20, 'hotwords_file': '', 'hotwords_score': 1.5, 'blank_penalty': 2.0, 'punc_model': '/home/jml/sherpa-onnx/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx', 'port': 8822, 'nn_pool_size': 1, 'max_batch_size': 3, 'max_wait_ms': 10, 'max_message_size': 1048576, 'max_queue_size': 32, 'max_active_connections': 200, 'num_threads': 2, 'certificate': '/home/jml/sherpa-onnx/python-api-examples/web/test.pem', 'doc_root': './python-api-examples/web'}
2024-12-03 14:01:07,628 INFO [streaming_server.py:690] Using certificate: /home/jml/sherpa-onnx/python-api-examples/web/test.pem
2024-12-03 14:01:07,633 INFO [server.py:715] server listening on 0.0.0.0:8822
2024-12-03 14:01:07,634 INFO [server.py:715] server listening on [::]:8822
2024-12-03 14:01:07,637 INFO [streaming_server.py:723] Please visit one of the following addresses:

https://localhost:8822
https://0.0.0.0:8822
https://127.0.0.1:8822
https://172.17.0.4:8822

2024-12-03 14:01:10,851 INFO [server.py:652] connection open
2024-12-03 14:01:10,852 INFO [streaming_server.py:779] Connected: ('192.168.52.128', 63453). Number of connections: 1/200
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
message: {'text': '', 'segment': 0}
2024-12-03 14:01:25,353 INFO [streaming_server.py:748] Disconnected: None. Number of connections: 0/200
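One detail in the server log may be worth ruling out: the server loads tokens from sherpa-onnx-streaming-zipformer-multi-zh-hans-2023-12-12/tokens.txt, while the export command used data/lang_bpe_2000/tokens.txt. Whether this has anything to do with the empty results is not established here. The sketch below only compares the two files; the helper names `load_tokens` and `diff_tokens` are hypothetical, and it assumes the usual one `symbol id` pair per line layout.

```python
from pathlib import Path


def load_tokens(path):
    """Parse a tokens.txt file, assuming one '<symbol> <id>' pair per line."""
    table = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.rstrip()
        if " " not in line:
            continue  # skip blank or malformed lines
        symbol, idx = line.rsplit(" ", 1)
        table[int(idx)] = symbol
    return table


def diff_tokens(export_tokens, server_tokens):
    """Return (size_a, size_b, ids whose symbols differ or exist in only one file)."""
    a = load_tokens(export_tokens)
    b = load_tokens(server_tokens)
    mismatched = sorted(i for i in a.keys() | b.keys() if a.get(i) != b.get(i))
    return len(a), len(b), mismatched


if __name__ == "__main__":
    import sys

    # Usage: python diff_tokens.py <tokens used at export> <tokens used by server>
    if len(sys.argv) == 3:
        n_a, n_b, bad = diff_tokens(sys.argv[1], sys.argv[2])
        print(f"export tokens: {n_a}, server tokens: {n_b}, mismatched ids: {len(bad)}")
        print("first mismatched ids:", bad[:10])
```

If the two files have the same size and no mismatched ids, the token table can be set aside as a suspect.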

What could be the reason? Thanks!
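To narrow this down, it may help to point the server at the non-quantized models instead of the int8 ones and see whether the result changes. A minimal sketch, assuming every exported file follows the naming pattern of the files mentioned in this post (`<name>-epoch-E-avg-A-chunk-C-left-L.onnx`, with `.int8.onnx` for the quantized copies); the helper `exported_model_paths` is hypothetical and not part of icefall or sherpa-onnx.

```python
from pathlib import Path


def exported_model_paths(exp_dir, epoch=9999, avg=1, chunk=16, left=128, int8=False):
    """Build encoder/decoder/joiner paths from the assumed file-name pattern."""
    suffix = f"epoch-{epoch}-avg-{avg}-chunk-{chunk}-left-{left}"
    ext = ".int8.onnx" if int8 else ".onnx"
    return {
        name: Path(exp_dir) / f"{name}-{suffix}{ext}"
        for name in ("encoder", "decoder", "joiner")
    }


if __name__ == "__main__":
    # exp dir taken from the export command in this post
    exp_dir = "zipformer/exp20241202"
    for name, path in exported_model_paths(exp_dir, int8=False).items():
        status = "found" if path.exists() else "MISSING"
        print(f"--{name} {path}  [{status}]")
```

The printed paths can be passed to streaming_server.py through its `--encoder`, `--decoder`, and `--joiner` options rather than editing the defaults in the script.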
