NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.
https://developer.nvidia.com/tensorrt
Apache License 2.0

Could not find any implementation for node SandGlass1.0.weight in TensorRT 10.0.1.6 #3870

Closed: shisheng111 closed this 2 months ago

shisheng111 commented 4 months ago

I seem to have hit a bug with TensorRT 10.0.1.6. In the last step, when building the TensorRT engine, it fails with "Could not find any implementation for node". Could this be because I used a custom nonlinear activation function, Mish? Here are my code and model.

import os
import tensorrt as trt
from calibration import Calibrator  # user-defined INT8 calibrator (unused here: no calibrator is passed below)
import pycuda.driver as cuda
import pycuda.autoinit

TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)
onnx_model_path = r"D:\Python_DL\venv\Experimental_result\mobilenetv2\mobilenetv2_SQ.onnx"

def get_engine(onnx_file_path="", engine_file_path="", calibrator=None, save_engine=False):
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_builder_config() as config, \
            builder.create_network(1) as network, \
            trt.Runtime(TRT_LOGGER) as runtime, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:

        if not os.path.exists(onnx_file_path):
            quit('ONNX file {} not found'.format(onnx_file_path))
        print('Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            if not parser.parse(model.read()):
                # Surface the parser diagnostics instead of silently continuing
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                quit('Failed to parse ONNX model. Please check if the ONNX model is compatible.')
        print('Completed parsing of ONNX file')
        print('Building an engine from file {}; this may take a while...'.format(onnx_file_path))

        config.set_flag(trt.BuilderFlag.INT8)
        config.int8_calibrator = calibrator  # may be None: a model with explicit Q/DQ nodes needs no calibrator
        print('Int8 mode enabled')
        plan = builder.build_serialized_network(network, config)
        if plan is None:
            print('Failed to create the engine')
            return None
        print("Completed creating the engine")
        engine = runtime.deserialize_cuda_engine(plan)
        if save_engine:
            with open(engine_file_path, "wb") as f:
                f.write(plan)  # 'plan' is already the serialized engine; no need to re-serialize
        return engine

def run_int8_quantization():
    print('*** onnx to tensorrt int8 engine ***')
    engine_model_path = "mobilenet_SQ_int8.engine"
    runtime_engine = get_engine(onnx_model_path, engine_model_path, save_engine=True)  # no calibrator passed: the Q/DQ-quantized model should not need one
    assert runtime_engine, 'failed engine generation...'
    print('*** success to generate INT8 engine file ***\n')

if __name__ == '__main__':
    run_int8_quantization()

Here is the ONNX model: mobilenetv2_SQ.zip

shisheng111 commented 4 months ago

The code for the Mish activation function is:

import torch
import torch.nn.functional as F

class Mish(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        # Mish activation: x * tanh(softplus(x))
        return x * torch.tanh(F.softplus(x))
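For context, a traced export of this module does not produce a single Mish op in ONNX: the expression decomposes into a Softplus → Tanh → Mul chain, and that is the pattern TensorRT has to fuse behind each BatchNorm. A minimal sketch to inspect the decomposition (the file name and shapes are illustrative, not from the original model):

import torch
import onnx

# A BatchNorm feeding Mish, the pattern at issue in this model
model = torch.nn.Sequential(torch.nn.BatchNorm2d(3), Mish()).eval()
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy, "bn_mish.onnx", opset_version=13)

print([n.op_type for n in onnx.load("bn_mish.onnx").graph.node])
# expected: ['BatchNormalization', 'Softplus', 'Tanh', 'Mul']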
zerollzeng commented 4 months ago

Filed internal bug 4657387 to track this, thanks for reporting this :-)

nvpohanh commented 4 months ago

[screenshot: 2024-05-20 104120]

@shisheng111 Could you try a workaround by inserting Q/DQ ops between the BatchNorm and the activation functions?

We will add a proper fix in TRT in future versions.
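A sketch of that workaround with onnx-graphsurgeon, assuming the exported Mish begins with a Softplus node; the scale and zero-point below are placeholders, and in practice you would copy them from the neighboring Q/DQ pair the quantized model already contains:

import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("mobilenetv2_SQ.onnx"))

# Placeholder quantization parameters (assumption: per-tensor INT8)
scale = gs.Constant("bn_qdq_scale", np.array(0.1, dtype=np.float32))
zero_point = gs.Constant("bn_qdq_zp", np.array(0, dtype=np.int8))

for node in [n for n in graph.nodes if n.op == "BatchNormalization"]:
    bn_out = node.outputs[0]
    # Only patch BatchNorms that feed the Mish decomposition (Softplus branch)
    if not any(c.op == "Softplus" for c in bn_out.outputs):
        continue
    q_out = gs.Variable(bn_out.name + "_quantized")
    dq_out = gs.Variable(bn_out.name + "_dequantized")
    q = gs.Node(op="QuantizeLinear", inputs=[bn_out, scale, zero_point], outputs=[q_out])
    dq = gs.Node(op="DequantizeLinear", inputs=[q_out, scale, zero_point], outputs=[dq_out])
    # Rewire every original consumer of the BatchNorm output to the dequantized tensor
    for consumer in [c for c in bn_out.outputs if c is not q]:
        consumer.inputs = [dq_out if t is bn_out else t for t in consumer.inputs]
    graph.nodes.extend([q, dq])

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "mobilenetv2_SQ_qdq.onnx")

Rebuilding the engine from the patched model with the script above should then let TensorRT fuse the quantized BatchNorm → activation pattern.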

zerollzeng commented 2 months ago

Fixed in TensorRT 10.2. Closing.