Open myunuro opened 1 week ago
No, I don't think so. @myunuro You need to use TensorRT's pytorch_quantization toolkit for QAT.
Thanks for the quick reply! Is there an equivalent for TensorFlow models? Basically, we are thinking of ONNX as a convergence point for both PyTorch and TensorFlow.
Description
I used onnxruntime.quantization.quantize_dynamic to quantize my model, which inserted a number of DynamicQuantizeLinear nodes into the graph. When I later use the TensorRT Python API to compile it, it fails with:
[06/24/2024-22:46:00] [TRT] [E] 3: getPluginCreator could not find plugin: DynamicQuantizeLinear version: 1
Is there an existing plugin for DynamicQuantizeLinear?
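For anyone hitting the same error, here is a minimal sketch of how one might confirm which nodes in the exported graph are the problem before handing the model to TensorRT. The helper names (`count_op_types`, `find_dynamic_quant_ops`, `check_model`) and the `DYNAMIC_QUANT_OPS` set are made up for illustration and are not part of onnx, onnxruntime, or TensorRT. It assumes the `onnx` package is installed and only looks at top-level graph nodes.

```python
from collections import Counter

# Op types to flag. Only the op named in the error above is listed here;
# this set is an illustrative assumption, extend it as needed.
DYNAMIC_QUANT_OPS = {"DynamicQuantizeLinear"}


def count_op_types(nodes):
    """Count op types over any iterable of objects exposing `.op_type`."""
    return Counter(node.op_type for node in nodes)


def find_dynamic_quant_ops(nodes, flagged=DYNAMIC_QUANT_OPS):
    """Return {op_type: count} for the flagged op types that are present."""
    counts = count_op_types(nodes)
    return {op: n for op, n in counts.items() if op in flagged}


def check_model(path):
    """Load an ONNX file and report flagged ops in its top-level graph.

    Nodes nested inside subgraphs (If/Loop bodies) are not traversed.
    """
    import onnx  # imported lazily so the helpers above work without it

    model = onnx.load(path)
    return find_dynamic_quant_ops(model.graph.node)
```

Calling `check_model("model.onnx")` on the dynamically quantized file should return a non-empty dict if DynamicQuantizeLinear nodes were inserted. An empty dict would suggest the parser error comes from somewhere else.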
Environment
TensorRT Version: 8.6.1.6
NVIDIA GPU: A4000
NVIDIA Driver Version: 535.183.01
CUDA Version: 12.2
CUDNN Version: N/A
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt):