Closed: xxueniao closed this issue 2 years ago
Hello @xxueniao, could you try inserting a Q/DQ pair both before and after the plugin? After converting to ONNX, use ONNX GraphSurgeon to modify the graph accordingly.
Thanks!
Closing since there has been no activity for more than 3 weeks; please reopen if you still have questions, thanks!
Description
I wrote a custom plugin to support int8 input. At the same time, I turned on explicit quantization, which means I can no longer perform PTQ, and the output of the layer before this plugin cannot be int8 either, because that layer cannot be fused with Q/DQ (e.g. a Resize layer). So how can I make the input of this plugin int8 in explicit quantization mode? Do I need to add a Q layer in front of the plugin?
Environment
TensorRT Version: 8.0.0.4
NVIDIA GPU: 3080
NVIDIA Driver Version: 460.32.03
CUDA Version: 11.2
CUDNN Version: 8.1
Operating System: ubuntu16.04
Python Version (if applicable): 3.8
Tensorflow Version (if applicable): 2.4.3
PyTorch Version (if applicable):
Baremetal or Container (if so, version):