NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Calibration INT8 with `trtexec` #4044

Closed Egorundel closed 1 month ago

Egorundel commented 1 month ago

Description

Hello!

Is there any way to use trtexec to create a calibration_data.cache calibration file and create an engine? For example, somehow submit a folder with images to the trtexec command.

Environment

TensorRT Version: 8.6.1

NVIDIA GPU: RTX3060

NVIDIA Driver Version: 555

CUDA Version: 11.1

CUDNN Version: 8.0.6

Operating System:

Python Version (if applicable): 3.8

lix19937 commented 1 month ago

trtexec can load a calibration_data.cache via `--calib=<file>`. If you want trtexec to generate it, you have to modify the source to support this feature.
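For reference, loading an existing cache with `--calib` while building an INT8 engine might look like the following (file names are placeholders; note that trtexec only consumes the cache here, it does not produce one):

```shell
# Hypothetical file names; assumes trtexec from TensorRT is on PATH.
# --calib only LOADS an existing calibration cache -- trtexec itself
# does not generate one from a folder of images.
trtexec --onnx=model.onnx \
        --int8 \
        --calib=calibration_data.cache \
        --saveEngine=model_int8.engine
```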

Egorundel commented 1 month ago

@lix19937 So trtexec can load calibration_data.cache, but it cannot generate one yet, right?

lix19937 commented 1 month ago

It does not support reading data for calibration; the user needs to develop this feature.

lix19937 commented 1 month ago

trtexec can load a calibration_data.cache, but if you want to generate calibration_data.cache, you have to develop or modify the open source code of trtexec.

Egorundel commented 1 month ago

@lix19937 I started writing calibration code in C++.

I have a question: what exactly needs to be calibrated?

1. The PyTorch model after training? — model.pt
2. The ONNX model? — model.onnx
3. Or the TensorRT engine itself? — model.(trt/engine)

lix19937 commented 1 month ago

For 1 and 2: if you do PTQ, use the ONNX file.

For 3: the ONNX model is converted to a plan (engine) through TRT calibration.

Egorundel commented 1 month ago

> For 3: the ONNX model is converted to a plan (engine) through TRT calibration.

This is only possible if I already have the calibration_data.cache file, right? And if I do not have a calibration file, do I need to create it by calibrating the ONNX model?

Egorundel commented 1 month ago

@lix19937 Can you help me solve the problem in my efforts? I will be very grateful to you.

https://github.com/NVIDIA/TensorRT/issues/4053

lix19937 commented 1 month ago

Most likely the calibration program was written incorrectly. You can refer to https://github.com/lix19937/trt-samples-for-hackathon-cn/tree/master/cookbook/03-BuildEngineByTensorRTAPI/MNISTExample-pyTorch/C%2B%2B

ttyio commented 1 month ago

@Egorundel, FYI: Polygraphy supports dumping the calibration cache via `run --calibration-cache <file_path>`

https://github.com/NVIDIA/TensorRT/tree/release/10.2/tools/Polygraphy#command-line-toolkit
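A sketch of that Polygraphy workflow, assuming Polygraphy is installed and a `data_loader.py` (a hypothetical file name) defines a `load_data()` generator yielding feed dicts of real preprocessed samples:

```shell
# Hypothetical file names; builds and runs an INT8 TRT engine while
# writing the calibration cache to calibration_data.cache.
polygraphy run model.onnx --trt --int8 \
    --data-loader-script data_loader.py \
    --calibration-cache calibration_data.cache
```

The resulting cache file can then be reused, e.g. passed to trtexec with `--calib`.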

Egorundel commented 1 month ago

@lix19937 Thanks for your help!!!

Can you explain what the npy and npz arrays are for? Can I replace them with simply reading images from paths listed in a txt file?

lix19937 commented 1 month ago

npz just refers to https://github.com/lix19937/trt-samples-for-hackathon-cn/blob/master/cookbook/03-BuildEngineByTensorRTAPI/MNISTExample-pyTorch/C%2B%2B/createCalibrationAndInferenceData.py#L39

It saves the calibration data to a NumPy-format file.

So modify https://github.com/lix19937/trt-samples-for-hackathon-cn/blob/master/cookbook/03-BuildEngineByTensorRTAPI/MNISTExample-pyTorch/C%2B%2B/createCalibrationAndInferenceData.py#L25-L27 to use your own data path and create the npz data.

cnpy just reads the npz data: https://github.com/lix19937/trt-samples-for-hackathon-cn/blob/master/cookbook/03-BuildEngineByTensorRTAPI/MNISTExample-pyTorch/C%2B%2B/cnpy.cpp
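As a rough sketch of that npz step in Python (the input shape, the key name `calibData`, and the use of random data in place of real preprocessed images are all assumptions; in practice you would read your image paths from the txt file and preprocess each one):

```python
import numpy as np

def build_calibration_npz(out_path, n_samples=8, shape=(3, 224, 224)):
    # Placeholder for real preprocessed images: float32, CHW layout,
    # already normalized the same way as at inference time.
    data = np.random.rand(n_samples, *shape).astype(np.float32)
    # Key name "calibData" is an assumption; the C++ reader (e.g. cnpy)
    # must look up the same key.
    np.savez(out_path, calibData=data)
    return data.shape

build_calibration_npz("calibration.npz")
```

The C++ calibrator can then load `calibration.npz` with cnpy and feed the batches to TensorRT during engine build.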

Egorundel commented 1 month ago

@lix19937 I reworked my code (C++) and now it works correctly. I used nvinfer1::IInt8EntropyCalibrator2.

https://github.com/Egorundel/int8_calibrator_cpp

You can take it and use it, and also integrate my solution into your own C++ code.

lix19937 commented 4 weeks ago

Good.