willnufe opened this issue 3 months ago
@willnufe The code needs some modifications before this can be supported. I don't have time at the moment, but if you are willing to work on it, I can give you some suggestions offline.
Thank you very much. I'd like to give it a try. Please send me your suggestions.
@willnufe I think that to get the maximum throughput, we first need to make the ONNX fp16 Paraformer work.
https://github.com/modelscope/FunASR/commit/9a9b474e7de7cc90d2ee124dc8d6c2cfa887c059. This commit used several registered hooks
to rescale the TorchScript fp32 model into a TorchScript fp16 model. The first step is to follow the same approach to calibrate the ONNX fp32 model.
With ONNX fp16, you can expect about a 50% throughput improvement compared with the ONNX fp32 pipeline. Then let's work on the TensorRT export.
Would you mind adding my WeChat, ykzhang2020?
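
For readers following along, here is a minimal sketch of the calibration idea behind that commit: run fp32 inference with forward hooks that record each layer's peak activation, then rescale the layers whose peaks would overflow fp16. The function name `collect_activation_peaks` and the `calib_batches` argument are illustrative, not FunASR's actual code.

```python
# Minimal sketch of hook-based fp16 calibration (illustrative, not
# FunASR's code): record each leaf module's peak |activation| so that
# layers at risk of fp16 overflow can be rescaled before casting.
import torch

def collect_activation_peaks(model: torch.nn.Module, calib_batches):
    peaks = {}  # module name -> max |activation| seen during calibration

    def make_hook(name):
        def hook(module, inputs, output):
            if torch.is_tensor(output):
                peaks[name] = max(peaks.get(name, 0.0),
                                  output.abs().max().item())
        return hook

    # Hook only leaf modules so each activation is counted once.
    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules()
               if len(list(m.children())) == 0]
    with torch.no_grad():
        for batch in calib_batches:
            model(batch)
    for h in handles:
        h.remove()
    # Layers whose peak approaches the fp16 maximum (~65504) are the
    # ones that need rescaling before the fp16 cast.
    return peaks
```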
1. environment
1.1 pt to onnx (the CIF part of the predictor uses cif_v1):
How you installed funasr (pip, source): pip
1.2 onnx to tensorrt:
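
For context, the onnx-to-tensorrt step is typically driven by NVIDIA's trtexec tool with explicit dynamic-shape bounds. The sketch below is a hypothetical illustration (the input name `speech`, the 560-dim feature axis, the file names, and all shape bounds are assumptions), not the command from this report:

```python
# Hypothetical illustration -- not the reporter's actual command.
# Input name "speech", the 560-dim feature axis, and the shape bounds
# are assumptions about the exported Paraformer model.
import subprocess

subprocess.run(
    [
        "trtexec",
        "--onnx=model_fp32.onnx",
        "--saveEngine=paraformer_fp16.plan",
        "--fp16",
        "--minShapes=speech:1x1x560",      # smallest batch/sequence to support
        "--optShapes=speech:16x512x560",   # shape TensorRT optimizes for
        "--maxShapes=speech:32x2048x560",  # largest batch/sequence to support
    ],
    check=True,
)
```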
2. problem
Converting the paraformer onnx-gpu model with the command below raises an error.
The main error is: