Where to get the ONNX model for Paraformer Triton

Notice: To help us resolve issues more efficiently, please follow the template when raising an issue and fill in the details.

🐛 Bug

Hi, I followed the official doc, but the ONNX model cannot be run: Triton fails to load the 'encoder' model with "Protobuf parsing failed".

How can I fix this error? Thanks.

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

Run cmd '....'
See error

I1122 04:11:58.954133 2235 cuda_memory_manager.cc:105] CUDA memory pool is created on device 5 with size 67108864
I1122 04:11:58.954139 2235 cuda_memory_manager.cc:105] CUDA memory pool is created on device 6 with size 67108864
I1122 04:11:58.954145 2235 cuda_memory_manager.cc:105] CUDA memory pool is created on device 7 with size 67108864
W1122 04:11:59.800834 2235 server.cc:218] failed to enable peer access for some device pairs
I1122 04:11:59.818230 2235 model_lifecycle.cc:459] loading: encoder:1
I1122 04:11:59.818433 2235 model_lifecycle.cc:459] loading: feature_extractor:1
I1122 04:11:59.818682 2235 model_lifecycle.cc:459] loading: scoring:1
I1122 04:11:59.819953 2235 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime
I1122 04:11:59.819976 2235 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.11
I1122 04:11:59.819982 2235 onnxruntime.cc:2475] 'onnxruntime' TRITONBACKEND API version: 1.11
I1122 04:11:59.819986 2235 onnxruntime.cc:2505] backend configuration:
{"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}
I1122 04:11:59.854786 2235 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: encoder (version 1)
I1122 04:11:59.855573 2235 onnxruntime.cc:666] skipping model configuration auto-complete for 'encoder': inputs and outputs already specified
I1122 04:11:59.856428 2235 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: encoder_0 (GPU device 0)
I1122 04:11:59.938595 2235 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I1122 04:11:59.938677 2235 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state
E1122 04:11:59.938736 2235 model_lifecycle.cc:597] failed to load 'encoder' version 1: Internal: onnx runtime error 7: Load model from /workspace/model_repo_paraformer_large_offline/encoder/1/model.onnx failed:Protobuf parsing failed.
free(): invalid pointer
I1122 04:12:03.869442 2235 python_be.cc:1539] Input tensors can be both in CPU and GPU. FORCE_CPU_ONLY_INPUT_TENSORS is off.
I1122 04:12:07.029930 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 0)
I1122 04:12:09.516798 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: scoring_0_0 (CPU device 0)
I1122 04:12:11.243169 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 1)
I1122 04:12:13.784515 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: scoring_0_1 (CPU device 0)
I1122 04:12:16.345823 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 2)
I1122 04:12:16.346221 2235 model_lifecycle.cc:694] successfully loaded 'scoring' version 1
I1122 04:12:18.872534 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 3)
I1122 04:12:21.563355 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 4)
I1122 04:12:24.108015 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 5)
I1122 04:12:26.793518 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 6)
I1122 04:12:29.537467 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 7)
I1122 04:12:32.366262 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 0)
I1122 04:12:34.954982 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 1)
I1122 04:12:37.568989 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 2)
I1122 04:12:40.277440 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 3)
I1122 04:12:42.838045 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 4)
I1122 04:12:45.385807 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 5)
I1122 04:12:48.085306 2235 python_be.cc:1858] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_1 (GPU device 6)
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
free(): invalid pointer
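
A "Protobuf parsing failed" error from ONNX Runtime generally means the bytes in model.onnx are not a valid ONNX protobuf. One possible cause, which is only an assumption here and not confirmed by the log, is that the file is a Git LFS pointer or an empty or truncated download rather than the real exported model. A minimal sketch for checking that, using only the Python standard library (the function name `diagnose_onnx_file` and the 1 KiB size threshold are hypothetical choices for illustration):

```python
from pathlib import Path

# First bytes of every Git LFS pointer file (a small text stub that stands in
# for the real binary when the LFS content has not been fetched).
LFS_MAGIC = b"version https://git-lfs.github.com/spec/v1"


def diagnose_onnx_file(path):
    """Return a coarse label describing what the file at `path` looks like.

    This does not validate the ONNX graph; it only rules out a few common
    reasons why protobuf parsing would fail immediately.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    size = p.stat().st_size
    if size == 0:
        return "empty"
    with p.open("rb") as f:
        head = f.read(len(LFS_MAGIC))
    if head.startswith(LFS_MAGIC):
        return "git-lfs-pointer"
    # Assumption: a real encoder export is far larger than 1 KiB.
    if size < 1024:
        return "suspiciously-small"
    return "plausible"


if __name__ == "__main__":
    # Path taken from the error line in the log above.
    model = "/workspace/model_repo_paraformer_large_offline/encoder/1/model.onnx"
    print(model, "->", diagnose_onnx_file(model))
```

If the result is "plausible" and the `onnx` Python package is installed, calling `onnx.load(path)` on the same file is a stricter check, since it attempts the same protobuf parse outside of Triton.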

Code sample

Expected behavior

Environment

OS (e.g., Linux):
FunASR Version (e.g., 1.0.0):
ModelScope Version (e.g., 1.11.0):
PyTorch Version (e.g., 2.0.0):
How you installed funasr (pip, source):
Python version:
GPU (e.g., V100M32):
CUDA/cuDNN version (e.g., cuda11.7):
Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
Any other relevant information:

modelscope / FunASR

where to get the onnx model for paraformer triton #2224
