Open songkq opened 2 years ago
@songkq These are really two questions. First: if you want to confirm the type of a given layer in an exported TensorRT engine, use the tool https://github.com/NVIDIA/TensorRT/tree/main/tools/experimental/trt-engine-explorer, which can generate a graph of the engine. Second: yes, it will. TensorRT PTQ selects operator types on a "speed first" basis: if an operator supports int8 but fp16 is faster, fp16 will be used instead. So the model produced by PTQ is not always made up entirely of quantized operators. For details, see https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#intro-quantization
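As a rough illustration of the inspection step: `trtexec` can dump per-layer information (`--profilingVerbosity=detailed --exportLayerInfo=layers.json`), and trt-engine-explorer consumes that JSON. A minimal sketch of grouping layers by the precision the builder chose, assuming a simplified layer-info shape (the real schema has more fields):

```python
import json

# A simplified stand-in for trtexec's --exportLayerInfo output
# (assumed shape for illustration; the real schema is richer).
layer_info = json.loads("""
{
  "Layers": [
    {"Name": "conv1", "LayerType": "Convolution", "Precision": "INT8"},
    {"Name": "softmax", "LayerType": "SoftMax", "Precision": "FP16"}
  ]
}
""")

# Group layer names by the precision the builder actually picked.
by_precision = {}
for layer in layer_info["Layers"]:
    by_precision.setdefault(layer["Precision"], []).append(layer["Name"])

for prec, names in sorted(by_precision.items()):
    print(f"{prec}: {', '.join(names)}")
```

A report like this makes it easy to spot layers that fell back to FP16 under the "speed first" rule described above.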
@lippman1125 Got it, thanks! One more question: can TensorRT manually pin the operator type of a specific layer to int8/fp16?
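For reference, the TensorRT 8 builder API does let you pin a layer's compute precision and ask the builder to obey it. A hedged sketch: the pattern table and `match_precision` helper are illustrative inventions, and the `tensorrt` calls (`layer.precision`, `BuilderFlag.OBEY_PRECISION_CONSTRAINTS`) are guarded so the file still runs where TensorRT is not installed:

```python
import fnmatch

# Illustrative table: glob pattern -> TensorRT DataType name.
# Note TensorRT calls fp16 "HALF".
PRECISION_OVERRIDES = {
    "*softmax*": "HALF",  # keep numerically sensitive layers in fp16
    "conv*":     "INT8",
}

def match_precision(layer_name, overrides=PRECISION_OVERRIDES):
    """Return the desired precision name for a layer, or None."""
    for pattern, prec in overrides.items():
        if fnmatch.fnmatch(layer_name.lower(), pattern):
            return prec
    return None

# Applying the table with the TensorRT 8.x Python API; guarded so the
# pure helper above remains usable without tensorrt installed.
try:
    import tensorrt as trt

    def apply_overrides(network, config):
        # The builder only honors per-layer precision when asked to.
        config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
        for i in range(network.num_layers):
            layer = network.get_layer(i)
            prec = match_precision(layer.name)
            if prec is not None:
                layer.precision = getattr(trt.DataType, prec)
except ImportError:
    pass

print(match_precision("Conv_0"))      # INT8
print(match_precision("Softmax_12"))  # HALF
```

Without `OBEY_PRECISION_CONSTRAINTS` the per-layer settings are only hints; with it, the builder fails if it cannot honor them.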
When converting ONNX to an engine, is TensorRT 8 recommended? With TensorRT 7.2.1.6 it keeps failing:
[12/12/2022-08:43:04] [E] [TRT] QuantizeLinear_7_quantize_scale_node: shift weights has count 32 but 3 was expected
[12/12/2022-08:43:04] [E] [TRT] QuantizeLinear_7_quantize_scale_node: shift weights has count 32 but 3 was expected
[12/12/2022-08:43:04] [E] [TRT] QuantizeLinear_7_quantize_scale_node: shift weights has count 32 but 3 was expected
While parsing node number 8 [DequantizeLinear]:
ERROR: /data/TensorRT/parsers/onnx/builtin_op_importers.cpp:852 In function importDequantizeLinear:
[6] Assertion failed: K == scale.count()
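The count mismatch (32 expected vs 3) is consistent with per-channel Q/DQ scales, which TensorRT 7's ONNX parser handled only in limited form; full per-channel QuantizeLinear/DequantizeLinear support came with TensorRT 8. A stdlib-only sketch of why the scale tensor has one entry per output channel (shapes here are illustrative, not taken from the model above):

```python
import random

random.seed(0)

# Conv weights: 32 output channels, each with 3*3*3 = 27 values.
weights = [[random.uniform(-1.0, 1.0) for _ in range(27)] for _ in range(32)]

# Per-channel symmetric scales: one per output channel, so the
# QuantizeLinear scale tensor has count 32, not 3.
scales = [max(abs(w) for w in channel) / 127.0 for channel in weights]

def quantize(channel, scale):
    # QuantizeLinear: q = clip(round(w / scale), -128, 127)
    return [max(-128, min(127, round(w / scale))) for w in channel]

q = [quantize(ch, s) for ch, s in zip(weights, scales)]

# DequantizeLinear recovers an approximation of the weights.
w_hat = [[v * s for v in ch] for ch, s in zip(q, scales)]

print(len(scales))  # → 32
```

A parser that assumes per-tensor (or per-input-channel) scales would expect a count of 3 here and reject the count of 32, matching the assertion `K == scale.count()` above.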
Before Asking
[X] I have read the README carefully.
[X] I want to train my custom dataset, and I have read the tutorials for training on custom data carefully and organized my dataset correctly. (FYI: We recommend using the xx_finetune.py config files for custom datasets.)
[X] I have pulled the latest code on the main branch and run it again, and the problem still exists.
Search before asking
Question
How can I confirm whether the weight dtype of a given layer in a TensorRT-exported engine is INT8 or HALF? And if the model contains some layers that do not support INT8, will TensorRT automatically fall back to mixed precision?
Additional
No response