meituan / YOLOv6

YOLOv6: a single-stage object detection framework dedicated to industrial applications.
GNU General Public License v3.0

How can I confirm whether the weight dtype of a given layer in a TensorRT-exported engine is INT8 or HALF? If the model contains layers that do not support INT8, will TensorRT automatically set up mixed precision? #581

Open songkq opened 2 years ago

songkq commented 2 years ago

Before Asking

Search before asking

Question

How can I confirm whether the weight dtype of a given layer in a TensorRT-exported engine is INT8 or HALF? If the model contains layers that do not support INT8, will TensorRT automatically set up mixed precision?

Additional

No response

lippman1125 commented 2 years ago

@songkq These are actually two questions. First, to check the type of a given layer in an exported TensorRT engine, use the tool https://github.com/NVIDIA/TensorRT/tree/main/tools/experimental/trt-engine-explorer, which can generate a graph of the engine. Second, yes. TensorRT PTQ selects operator types on a "speed first" basis: even if an operator supports INT8, FP16 will be used when it is faster, so a PTQ-generated model does not always consist entirely of quantized operators. For details, see https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#intro-quantization
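
Not part of the original reply, but as a minimal sketch of the same idea: besides trt-engine-explorer, TensorRT 8.2+ also exposes an IEngineInspector API that dumps per-layer information, including the precision the builder chose. The engine path below is a placeholder, and per-layer details only appear if the engine was built with `config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED`.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# "yolov6.engine" is a placeholder path; the engine should have been built
# with detailed profiling verbosity for full per-layer precision info.
with open("yolov6.engine", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

inspector = engine.create_engine_inspector()
# Print per-layer information (layer type, tactic, chosen precision such as
# Int8 or Half) as JSON; available in TensorRT 8.2 and later.
print(inspector.get_engine_information(trt.LayerInformationFormat.JSON))
```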

songkq commented 2 years ago


@lippman1125 Got it, thanks! One more question: can TensorRT manually set the operator type of a particular layer to INT8/FP16?

lippman1125 commented 2 years ago

@songkq Yes, please refer to https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#layer-level-control
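
A rough sketch of what that layer-level control looks like with the TensorRT Python builder API, not taken from this repo: the layer name "stem.conv" is hypothetical, and `OBEY_PRECISION_CONSTRAINTS` assumes TensorRT >= 8.4 (older releases use `STRICT_TYPES` instead).

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
config = builder.create_builder_config()

# ... populate `network`, e.g. by parsing an ONNX model with trt.OnnxParser ...

# Allow both INT8 and FP16 kernels so the builder can mix precisions.
config.set_flag(trt.BuilderFlag.INT8)
config.set_flag(trt.BuilderFlag.FP16)
# Make the builder honor the per-layer constraints set below.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

for i in range(network.num_layers):
    layer = network.get_layer(i)
    # "stem.conv" is a hypothetical layer name: pin this layer to FP16
    # instead of letting the builder pick INT8 for it.
    if layer.name == "stem.conv":
        layer.precision = trt.float16
        layer.set_output_type(0, trt.float16)
```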

zahidzqj commented 1 year ago

When converting ONNX to an engine, is TensorRT 8 recommended? The conversion keeps failing with TensorRT 7.2.1.6:
[12/12/2022-08:43:04] [E] [TRT] QuantizeLinear_7_quantize_scale_node: shift weights has count 32 but 3 was expected
[12/12/2022-08:43:04] [E] [TRT] QuantizeLinear_7_quantize_scale_node: shift weights has count 32 but 3 was expected
[12/12/2022-08:43:04] [E] [TRT] QuantizeLinear_7_quantize_scale_node: shift weights has count 32 but 3 was expected
While parsing node number 8 [DequantizeLinear]:
ERROR: /data/TensorRT/parsers/onnx/builtin_op_importers.cpp:852 In function importDequantizeLinear:
[6] Assertion failed: K == scale.count()