Closed pangr closed 2 years ago
@ttyio Do you have any recommendations on the QDQ placement here? I think the user can fine-tune it to get better performance.
@pangr, what's the op after the add? Also, have you tried inserting Q/DQ after the add? Thanks.
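For readers unsure what "inserting Q/DQ after the add" means at the graph level, here is a minimal sketch on a toy graph. The `Node` class, the tensor names, the trailing `Conv` consumer and the `insert_qdq_after` helper are all hypothetical and are not part of ONNX or TensorRT. On a real model you would likely make the same edit with the `onnx` package or ONNX GraphSurgeon, or let the QAT toolkit place the quantizers. The sketch assumes the node list is topologically sorted and does not rewrite graph outputs.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy stand-in for a graph node (hypothetical, not the ONNX NodeProto)."""
    op: str
    inputs: list
    outputs: list


def insert_qdq_after(nodes, op_type):
    """Return a new node list with a Q/DQ pair after every `op_type` node.

    Assumes `nodes` is topologically sorted. Downstream consumers are
    rewired to read the dequantized tensor instead of the raw output.
    """
    result = []
    rename = {}  # original tensor name -> dequantized tensor name
    for node in nodes:
        # Redirect inputs that were produced by an already-wrapped node.
        inputs = [rename.get(name, name) for name in node.inputs]
        result.append(Node(node.op, inputs, list(node.outputs)))
        if node.op == op_type:
            out = node.outputs[0]
            scale, zero_point = out + "_scale", out + "_zp"  # placeholder initializer names
            q_out, dq_out = out + "_q", out + "_dq"
            result.append(Node("QuantizeLinear", [out, scale, zero_point], [q_out]))
            result.append(Node("DequantizeLinear", [q_out, scale, zero_point], [dq_out]))
            rename[out] = dq_out
    return result


# Toy version of the pattern from this issue: Sigmoid -> Mul -> Add,
# followed by an assumed Conv consumer (the real next op is not stated).
graph = [
    Node("Sigmoid", ["x"], ["s"]),
    Node("Mul", ["x", "s"], ["m"]),
    Node("Add", ["m", "skip"], ["a"]),
    Node("Conv", ["a", "w"], ["y"]),
]

new_graph = insert_qdq_after(graph, "Add")
for n in new_graph:
    print(n.op, n.inputs, "->", n.outputs)
```

With an explicit Q/DQ pair on the add output, the quantization scale for that tensor is stated in the graph rather than left for the builder to infer. Whether this changes the output precision TensorRT picks for the fused pointwise layer depends on the rest of the network.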
Closing since there has been no activity for more than 3 weeks. Please reopen if you still have questions, thanks!
Description
Environment
TensorRT Version: 8.4.1.5
NVIDIA GPU: 1080ti
NVIDIA Driver Version: 450
CUDA Version: 11.0
CUDNN Version: 8.1.0
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Steps To Reproduce
When I use PTQ, the layer 'PWN(PWN(Sigmoid, Mul), Add)' is reported as 'Int8' -> 'Int8'.
But when I use QAT, the same layer 'PWN(PWN(Sigmoid, Mul), Add)' is reported as 'Int8' -> 'Float32'.
The Int8 ONNX model is: