Open Vergissmeinicht opened 3 weeks ago
@yuekaizhang Could you have a look?
https://github.com/k2-fsa/sherpa/tree/master/triton/scripts I have checked the scripts here, but only the conformer TensorRT script (triton/scripts/build_librispeech_pruned_transducer_stateless3_offline_trt.sh) has been released. Is it OK for zipformer to do export-onnx -> trtexec to get a TensorRT engine too?
@Vergissmeinicht Not yet; let me do it and I will give an update here.
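For reference, the flow being asked about is the same two-step recipe the conformer script uses; a minimal sketch, with paths, checkpoint numbers, and dynamic-shape ranges chosen for illustration rather than taken from any released script (the encoder input names `x`/`x_lens` match the sherpa-onnx zipformer exports):

```shell
# Hypothetical paths and shapes for illustration only.
# 1. Export the zipformer encoder to ONNX with icefall's export-onnx.py.
python3 ./zipformer/export-onnx.py \
  --exp-dir ./zipformer/exp \
  --epoch 30 --avg 9 \
  --tokens ./data/lang_bpe_500/tokens.txt

# 2. Build a TensorRT engine from the exported encoder with trtexec,
#    giving min/opt/max profiles for the dynamic batch and time axes.
trtexec \
  --onnx=./zipformer/exp/encoder-epoch-30-avg-9.onnx \
  --saveEngine=./zipformer/exp/encoder.trt \
  --minShapes=x:1x100x80,x_lens:1 \
  --optShapes=x:16x1000x80,x_lens:16 \
  --maxShapes=x:32x2000x80,x_lens:32
```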
Thanks! FYI, I've tried the ONNX model from https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-gigaspeech-2023-12-12-english with ONNX export and trtexec, but trtexec fails while parsing a Softmax op with a 1-D input. I then tried onnx-graphsurgeon to fix the 1-D input problem, but trtexec still fails on the If-conditional outputs that come from CompactRelPositionalEncoding.
@Vergissmeinicht Just commenting out these lines should be okay: https://github.com/k2-fsa/icefall/blob/master/egs/librispeech/ASR/zipformer/zipformer.py#L1422-L1427.
It works for me. But when I try using trtexec to convert my teammate's zipformer ONNX model, it fails while parsing a Slice node, saying "This version of TensorRT does not support dynamic axes". Maybe my icefall version does not match his. Any solution for parsing this Slice op?
@Vergissmeinicht Please use the latest TensorRT, e.g. TRT 10.2 in tritonserver:24.07-py3.
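One way to pick up the newer TensorRT without touching the host install is to run trtexec from inside that container; a sketch, assuming a standard Docker + NVIDIA runtime setup (the mount path and ONNX filename are illustrative):

```shell
# Run trtexec from the 24.07 Triton container, which ships TensorRT 10.2.
# /usr/src/tensorrt/bin/trtexec is where NGC images place the binary.
docker run --gpus all --rm -it \
  -v "$PWD":/workspace \
  nvcr.io/nvidia/tritonserver:24.07-py3 \
  /usr/src/tensorrt/bin/trtexec \
    --onnx=/workspace/encoder.onnx \
    --saveEngine=/workspace/encoder.trt
```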
I followed the latest tutorial to run build_wenetspeech_zipformer_offline_trt.sh. It fails with OOM: a tactic device request needs 34024 MB, while my 4090 Ti has 24217 MB available. Do you use a GPU with more memory?
Are you using a larger model than the one in build_wenetspeech_zipformer_offline_trt.sh?
Would you mind changing the option? https://github.com/NVIDIA/trt-samples-for-hackathon-cn/blob/master/cookbook/07-Tool/trtexec/Help.txt#L37
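Assuming the option referred to is the builder workspace pool size (line numbers in Help.txt drift, so this is a guess), capping it makes the builder skip tactics whose device-memory request exceeds the limit instead of aborting the build:

```shell
# Hypothetical value sized for a 24 GB GPU: cap the builder workspace so
# tactics requesting more device memory (e.g. 34 GB) are skipped.
trtexec \
  --onnx=./encoder.onnx \
  --saveEngine=./encoder.trt \
  --memPoolSize=workspace:20480MiB
```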
I use the model downloaded from https://github.com/k2-fsa/sherpa/blob/master/triton/scripts/build_wenetspeech_zipformer_offline_trt.sh#L47C5-L47C110. The Docker image I use is soar97/triton-k2:24.07.
Here's the build log; maybe there's something different. log.txt
@yuekaizhang Hi, is there any progress on this problem? Appreciate your reply.