NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Polygraphy: How to write data_loader.py to supply calibration data? #4196

Open Kongsea opened 2 days ago

Kongsea commented 2 days ago

The example data_loader.py uses synthetic data. I want to know how to write the file so that it feeds real image files to Polygraphy to calibrate the model and improve INT8 accuracy.

For example, what should the axis order and the data range be? Should the axes be (image_num, image_channel, height, width) or something else? Should the data range be [0, 1] or [0, 255]? Should the inputs be preprocessed the same way as for the .pth model, or restricted to a fixed range?

Thank you for any suggestions or help.
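In general, INT8 calibration data should be real, representative inputs preprocessed exactly the way inference inputs are, i.e. the same layout, range, and normalization the original .pth model was trained with, rather than a fixed range. As a rough sketch only (assuming a single NCHW float32 input named "input", 224x224 RGB images, and [0, 1] scaling; all of these must be adapted to the actual model), a data_loader.py could look like the following and would then be passed to Polygraphy via --data-loader-script data_loader.py:

import glob

import cv2
import numpy as np

# Hypothetical calibration set; use a few hundred images representative of the real data.
IMAGE_PATHS = sorted(glob.glob("calib_images/*.jpg"))

def load_data():
    for path in IMAGE_PATHS:
        img = cv2.imread(path)                      # HxWx3, uint8, BGR
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # match the training-time channel order
        img = cv2.resize(img, (224, 224))           # match the model's expected H and W
        img = img.astype(np.float32) / 255.0        # same range/normalization as training
        img = np.transpose(img, (2, 0, 1))[None]    # HWC -> CHW -> NCHW: (1, 3, 224, 224)
        yield {"input": np.ascontiguousarray(img)}  # key must match the ONNX input name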

Kongsea commented 2 days ago

Using trtexec --onnx=model.onnx --saveEngine=model.trt --int8 without calibration data to quantize the model, I can get a TRT engine that runs inference but produces low-precision images.

However, using polygraphy convert model.onnx --int8 -o model.trt without calibration data, I get a TRT engine whose output is abnormal, with very small numbers.

Then I wrote a data_loader.py so that Polygraphy could quantize the ONNX model with calibration data, but the output is very similar to the run without calibration data. I am very confused.

import glob

import cv2
import numpy as np

images = sorted(glob.glob("calib_images/*.png"))  # placeholder: point this at the real calibration images

def load_data():
    for image in images:
        img = cv2.imread(image, 0)  # read as grayscale: HxW, uint8
        if len(img.shape) == 2:
            img = np.expand_dims(img, axis=2)  # HxW -> HxWx1
        # HxWx1 -> 1xHxWx1 -> NCHW (1, 1, H, W); dtype should match the model's input dtype
        # (usually float32 unless the ONNX model was exported in FP16).
        img = np.transpose(np.ascontiguousarray(np.expand_dims(img, axis=0)), (0, 3, 1, 2)).astype(np.float16)
        yield {"input": img}  # the key must match the ONNX input tensor name
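
For what it's worth, the same generator can also be wired up through Polygraphy's Python API instead of the CLI. The sketch below uses Polygraphy's TensorRT backend loaders and assumes the data_loader.py above is importable; the file names and paths are illustrative.

from polygraphy.backend.trt import (
    Calibrator,
    CreateConfig,
    EngineFromNetwork,
    NetworkFromOnnxPath,
    SaveEngine,
)

from data_loader import load_data  # the generator defined above

# Wrap the generator in a calibrator; the cache avoids re-calibrating on later builds.
calibrator = Calibrator(data_loader=load_data(), cache="model_calib.cache")

# Lazily build an INT8 engine from the ONNX model using that calibrator.
build_engine = EngineFromNetwork(
    NetworkFromOnnxPath("model.onnx"),
    config=CreateConfig(int8=True, calibrator=calibrator),
)

# Calling the SaveEngine loader triggers the build and writes the engine to disk.
SaveEngine(build_engine, path="model.trt")()

Persisting the cache also makes it easy to confirm that calibration actually ran: after the build, the cache file should be non-empty and contain per-tensor scale entries.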