Closed · woonwoon closed this 1 year ago
You can call the setFlag API to use FP16 or INT8 (INT8 requires calibration). In C++ this is `config->setFlag(nvinfer1::BuilderFlag::kFP16)` on the builder config; you can find the corresponding Python API in the TensorRT docs.
I just want to quantize ResNet for a classification problem in each data type (FP32, FP16, INT8), in the Python code https://github.com/wang-xinyu/tensorrtx/blob/master/resnet/resnet50.py
You need to call `config.set_flag(trt.BuilderFlag.FP16)` before calling build_engine(); TensorRT's build_engine() will do the quantization internally.
The .wts file contains the FP32 weights. You don't need to convert the weights yourself; just set the flag and build_engine() handles the conversion (see the sketch below).
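For reference, here is a minimal sketch of what this looks like in Python. It assumes the TensorRT 7-era builder-config API that resnet50.py targets, and it elides the network-construction step since resnet50.py already builds the layers from the .wts weights:

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network()
# ... populate the network from resnet50.wts here, as resnet50.py already does ...

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB of builder workspace

# FP32 is the default: build without any flag and nothing is quantized.
# For FP16, set the flag before building; the FP32 weights from the .wts
# are converted internally while the engine is built.
config.set_flag(trt.BuilderFlag.FP16)

engine = builder.build_engine(network, config)  # quantization happens in here
```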
Thanks to that, I converted it to FP16. However, the conversion to INT8 requires calibration, and I don't know how to add it. I'd appreciate it if you could tell me.
I added this: `config.set_flag(trt.BuilderFlag.INT8)`
error message:

```
[TensorRT] WARNING: Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
[TensorRT] ERROR: Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
```
TensorRT INT8 PTQ requires calibration; it's a bit tricky to implement. You can refer to the yolov5 example in this repo.
In addition to set_flag(INT8), you basically need to implement a calibrator class; a minimal sketch is below. You can also check the NVIDIA TensorRT docs for this.
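A rough sketch of such a calibrator in Python, subclassing TensorRT's IInt8EntropyCalibrator2 and using pycuda for the device buffer. `ResnetEntropyCalibrator`, `calib_images`, and `calib.cache` are illustrative names, not part of resnet50.py, and it assumes your calibration images are already preprocessed exactly like the inference inputs (float32, NCHW, e.g. (N, 3, 224, 224)):

```python
import os

import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt


class ResnetEntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed calibration batches to TensorRT during the INT8 build."""

    def __init__(self, calib_images, cache_file, batch_size=8):
        trt.IInt8EntropyCalibrator2.__init__(self)
        # calib_images: float32 array of shape (N, 3, 224, 224), already
        # preprocessed exactly like the inference inputs.
        self.data = np.ascontiguousarray(calib_images, dtype=np.float32)
        self.cache_file = cache_file
        self.batch_size = batch_size
        self.index = 0
        self.device_input = cuda.mem_alloc(self.data[0].nbytes * batch_size)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.index + self.batch_size > len(self.data):
            return None  # no more data -> calibration stops
        batch = self.data[self.index:self.index + self.batch_size]
        cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
        self.index += self.batch_size
        return [int(self.device_input)]  # one device pointer per input tensor

    def read_calibration_cache(self):
        # Reuse the scales from a previous calibration run if they exist.
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)


# Hook it into the build, before build_engine():
# config.set_flag(trt.BuilderFlag.INT8)
# config.int8_calibrator = ResnetEntropyCalibrator(calib_images, "calib.cache")
```

On the first build the builder runs your batches through get_batch() to collect activation ranges and then writes the cache file; on later builds it reads the cache and skips the data pass.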
I can now quantize to INT8 and FP16, thank you. One more question: if I run it without adding set_flag, will it stay FP32 -> FP32? What is done in that case?
Yes, it will use FP32 by default.
It's not quantized, is it?
It's not quantized
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I want to quantize resnet50.wts to FP32, FP16, and INT8 in resnet50.py. How do I modify resnet50.py to do each?