Tencent / TPAT

TensorRT Plugin Autogen Tool
Apache License 2.0

Conversion Error for IsInf OP #18

Closed debrekXuHan closed 2 years ago

debrekXuHan commented 2 years ago

We converted the ONNX model with IsInf OPs and it succeeded. We noticed that the IsInf OP is implemented by tpat_isinf and a Cast OP. When we convert the ONNX model to a TensorRT engine, the following error happens:

onnx2trt.py:29: DeprecationWarning: Use set_memory_pool_limit instead.
  config.max_workspace_size = ( 1 << 20 ) * 3 * 1024
Loading ONNX file from path /home/tensorrt/model_testing-sim.onnx...
Beginning ONNX file parsing
[08/16/2022-10:10:18] [TRT] [W] onnx2trt_utils.cpp:363: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
raw shape of 0 is: (6, 3, 928, 1600)
Completed parsing of ONNX file
Building an engine from file /home/tensorrt/model_testing-sim.onnx; this may take a while...
onnx2trt.py:54: DeprecationWarning: Use build_serialized_network instead.
  engine = builder.build_engine(network, config)
[08/16/2022-10:11:00] [TRT] [E] 1: [castBuilder.cpp::addSupportedFormats::117] Error Code 1: Internal Error (Cast output type does not support bool.)
Completed creating Engine
Traceback (most recent call last):
  File "onnx2trt.py", line 57, in <module>
    f.write(engine.serialize())
AttributeError: 'NoneType' object has no attribute 'serialize'

Do you still have this issue for the IsInf OP? How can I solve it?

buptqq commented 2 years ago

It seems you have auto-generated the plugin for IsInf successfully (it can be found at TPAT/python/trt_plugin/lib), but TensorRT does not allow the 'INT64' type. So can you replace the 'INT64' in the IsInf operator with 'INT32'?

debrekXuHan commented 2 years ago

Thanks for the reply. It seems that in the ONNX model, the IsInf OP input is float64. How can I replace the 'INT64' in the IsInf operator with 'INT32'?

debrekXuHan commented 2 years ago

I modified the IsInf OP in the ONNX model to take a float32 input. Currently, the IsInf OP is converted into tpat_IsInf_* (input: float32) and a Cast (output: boolean). I believe the error happens at the Cast OP.

buptqq commented 2 years ago

I have tried to auto-generate a plugin for the IsInf operator. The TensorFlow code looks like this:

import numpy as np
import tensorflow as tf  # TF1-style graph API
op_name = "tpat_IsInf_0"  # assumed name; op_name was not defined in the original snippet
input_ph_1 = tf.placeholder(dtype=tf.float32, shape=[1, 2], name='input_1')
input_data_1 = np.array([[1.0, np.inf]])
x = tf.math.is_inf(input_ph_1, name=op_name)
output = tf.identity(x, name="output")

and then used tf2onnx to convert the TensorFlow pb into an ONNX model (a command-line sketch is shown after the log below). The verification result of the plugin is:

0 input is [ 1. inf]
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 130 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 132 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 136 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 144 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 160 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 320 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 640 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 1280 bytes at 0x7f2b3fe48a00.
[08/22/2022-13:10:53] [TRT] [V] myelinAllocCb allocated GPU 2560 bytes at 0x7f2b3fe48a00.
[False  True] #tf result
[False  True] # trt result
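
For reference, the tf2onnx conversion mentioned above can be run roughly as follows; the pb file name is an assumption, while the tensor names match the TensorFlow snippet above and the ONNX file name matches the command used further down:

python -m tf2onnx.convert --graphdef model_inf.pb --inputs input_1:0 --outputs output:0 \
    --output model_inf_trt.onnx --opset 13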

So can you provide your ONNX model? I'll try to solve your problem.

debrekXuHan commented 2 years ago

Thanks a lot for your help! But I found that the ONNX model file cannot be uploaded here... Basically, we have only one IsInf OP in the ONNX model. We generated the ONNX model with PyTorch:

import torch
import pytorch_lightning as pl

class SimpleModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.l1 = torch.nn.Linear(in_features=64, out_features=4)

    def forward(self, x):
        # y = self.l1(x.view(x.size(0), -1))
        y = torch.isinf(x)
        return y

filepath = "model.onnx"
model = SimpleModel()
input_sample = torch.randn((1, 64))
model.to_onnx(filepath, input_sample, opset_version=13, export_params=True)
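
As a quick sanity check, the exported model can be run through onnxruntime before generating the plugin; a minimal sketch, assuming onnxruntime is installed and using the model.onnx path from above:

import numpy as np
import onnxruntime as ort

# run the exported graph on an input that contains an inf
sess = ort.InferenceSession("model.onnx")
x = np.random.randn(1, 64).astype(np.float32)
x[0, 0] = np.inf
input_name = sess.get_inputs()[0].name
(y,) = sess.run(None, {input_name: x})
print(y.dtype, y[0, :4])  # expected: a bool array with True at index 0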

Then we generated the plugin by running:

python onnx_to_plugin.py -i /home/dms/Codes/xuhan/DETR3D/model.onnx -o /home/dms/Codes/xuhan/DETR3D/model-tpat.onnx -t IsInf

We could see the newly generated ONNX file, the plugin .so, and the .cu and .h files. Then the error happens when we try to convert the ONNX model into a TensorRT engine.

#!/usr/bin/env python
#-*- coding: utf-8 -*-
import sys

import os
import ctypes
import argparse
import tensorrt as trt

new_lib_path = "/home/dms/TRTplugin/TPAT/python/trt_plugin/lib/tpat_IsInf_0.so"
onnx_path = "/home/dms/Codes/xuhan/DETR3D/model-tpat.onnx"
engine_path = "/home/dms/Codes/xuhan/DETR3D/model-tpat.engine"

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
BATCH_SIZE = 1
EXPLICIT_BATCH = 1 << (int)(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
if __name__ == "__main__":
    print('get start')

    ctypes.cdll.LoadLibrary(new_lib_path)

    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(EXPLICIT_BATCH) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_batch_size = BATCH_SIZE
        builder_config = builder.create_builder_config()
        builder_config.max_workspace_size = 1 << 30

        with open(onnx_path, "rb") as model:
            # parse onnx model
            parser.parse(model.read())
            for i in range(parser.num_errors):
                print(parser.get_error(i))
        engine = builder.build_engine(network, builder_config)
        print("Completed creating Engine")
        with open(engine_path, "wb") as f:
            f.write(engine.serialize())

I am not sure if there is any problem in this process.
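
A side note on the traceback: build_engine returns None when the build fails, so the AttributeError on serialize() is only a symptom. A minimal guard, reusing the variable names from the script above (and the non-deprecated TensorRT 8.4 calls named in the DeprecationWarnings), makes the real builder error easier to spot; this is a sketch, not the original file:

        # replaces max_workspace_size (deprecated in TensorRT 8.4)
        builder_config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)
        # replaces build_engine; returns the serialized engine, or None on failure
        serialized_engine = builder.build_serialized_network(network, builder_config)
        if serialized_engine is None:
            # the actual failure (e.g. the bool Cast error) is reported through TRT_LOGGER
            raise RuntimeError("Engine build failed; see TensorRT errors above")
        with open(engine_path, "wb") as f:
            f.write(serialized_engine)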

buptqq commented 2 years ago

Hi, this process is right. And I have generated the TensorRT engine. Run:

python torch_trt_inf.py --onnx_path=./model_inf_trt.onnx --trt_path=inf.gie

And I get:

get start
Loading ONNX file from path ./model_inf_trt.onnx...
Beginning ONNX file parsing
[08/22/2022-16:43:21] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
[08/22/2022-16:43:21] [TRT] [W] Tensor DataType is determined at build time for tensors not marked as input or output.
raw shape of 0 is:  (1, 64)
Completed parsing of ONNX file
Building an engine from file ./model_inf_trt.onnx; this may take a while...
[08/22/2022-16:43:22] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.2.0 but loaded cuBLAS/cuBLAS LT 11.1.0
[08/22/2022-16:43:22] [TRT] [W] TensorRT was linked against cuDNN 8.2.0 but loaded cuDNN 8.1.1
[08/22/2022-16:43:22] [TRT] [W] Detected invalid timing cache, setup a local cache instead
[08/22/2022-16:43:23] [TRT] [W] TensorRT was linked against cuBLAS/cuBLAS LT 11.2.0 but loaded cuBLAS/cuBLAS LT 11.1.0
[08/22/2022-16:43:23] [TRT] [W] TensorRT was linked against cuDNN 8.2.0 but loaded cuDNN 8.1.1
Completed creating Engine

debrekXuHan commented 2 years ago

I still got the error as follows. So, what TensorRT version are you using?

get start
onnx2trt.py:31: DeprecationWarning: Use set_memory_pool_limit instead.
Loading ONNX file from path /home/dms/Codes/xuhan/DETR3D/model-tpat.onnx...
Beginning ONNX file parsing
raw shape of 0 is:  (1, 64)
Completed parsing of ONNX file
Building an engine from file /home/dms/Codes/xuhan/DETR3D/model-tpat.onnx; this may take a while...
onnx2trt.py:54: DeprecationWarning: Use build_serialized_network instead.
[08/22/2022-17:11:55] [TRT] [E] 1: [castBuilder.cpp::addSupportedFormats::117] Error Code 1: Internal Error (Cast output type does not support bool.)
Completed creating Engine
Traceback (most recent call last):
  File "onnx2trt.py", line 57, in <module>
AttributeError: 'NoneType' object has no attribute 'serialize'

buptqq commented 2 years ago

TensorRT 8.0.1.6, in the image built from the Dockerfile (with the base image nvcr.io/nvidia/tensorflow:21.08-tf1-py3):

https://github.com/Tencent/TPAT/blob/main/Dockerfile

debrekXuHan commented 2 years ago

Thanks a lot for answering my issue!

We built TPAT from source on an NVIDIA AGX device with TensorRT 8.4.0. I am not sure whether the TensorRT version is what causes the problem.

debrekXuHan commented 2 years ago

We had some problems uploading the files here. I converted an ONNX model with a single IsInf OP, and the node info of the generated ONNX model is shown below. It seems the problem happens at the Cast OP, whose input shape & dtype are None. It may not be recognized by TensorRT.

How can I solve this issue? Could you share your new ONNX model for the IsInf OP here?

Graph torch-jit-export (Opset: 10)
Inputs: [Variable (0): (shape=[1, 64], dtype=float32)]
Nodes:
IsInf_0 (tpat_IsInf_0)
        Inputs: [
                Variable (0): (shape=[1, 64], dtype=float32)
        ]
        Outputs: [
                Variable (cast_back_for_3:0): (shape=None, dtype=None)
        ]
cast_back_for_3 (Cast)
        Inputs: [
                Variable (cast_back_for_3:0): (shape=None, dtype=None)
        ]
        Outputs: [
                Variable (3): (shape=[1, 64], dtype=bool)
        ]
Attributes: {'to': 9}
Outputs: [Variable (3): (shape=[1, 64], dtype=bool)]
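
The node info above looks like onnx-graphsurgeon output; it can be reproduced with something like the following, assuming the model-tpat.onnx path from this thread:

import onnx
import onnx_graphsurgeon as gs

# print the graph structure (inputs, nodes, attributes, outputs)
print(gs.import_onnx(onnx.load("/home/dms/Codes/xuhan/DETR3D/model-tpat.onnx")))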

debrekXuHan commented 2 years ago

Solved it by modifying the onnx_modified.py file to add another Cast layer: first cast the output of tpat_IsInf into int32, and then cast that into bool. Anyway, I appreciate your help a lot! @buptqq

Graph after inserting Cast nodes:

Graph torch-jit-export (Opset: 10)
Inputs: [Variable (0): (shape=[1, 64], dtype=float32)]
Nodes:
IsInf_0 (tpat_IsInf_0)
        Inputs: [
                Variable (0): (shape=[1, 64], dtype=float32)
        ]
        Outputs: [
                Variable (cast_back_0_for_3:0): (shape=None, dtype=None)
        ]
cast_back_0_for_3 (Cast)
        Inputs: [
                Variable (cast_back_0_for_3:0): (shape=None, dtype=None)
        ]
        Outputs: [
                Variable (cast_back_0_for_3:1): (shape=[1, 64], dtype=<class 'numpy.int32'>)
        ]
Attributes: {'to': 6}
cast_back_1_for_3 (Cast)
        Inputs: [
                Variable (cast_back_0_for_3:1): (shape=[1, 64], dtype=<class 'numpy.int32'>)
        ]
        Outputs: [
                Variable (3): (shape=[1, 64], dtype=bool)
        ]
Attributes: {'to': 9}
Outputs: [Variable (3): (shape=[1, 64], dtype=bool)]
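
For readers hitting the same error, the fix described above (inserting an intermediate int32 Cast in front of the bool Cast) can also be applied directly to the TPAT output model with onnx-graphsurgeon. A minimal sketch, assuming onnx-graphsurgeon is installed and using the file names from this thread (the output file name is made up):

import numpy as np
import onnx
import onnx_graphsurgeon as gs

graph = gs.import_onnx(onnx.load("model-tpat.onnx"))

# find every Cast that converts the plugin output straight to bool ('to': 9)
for node in [n for n in graph.nodes if n.op == "Cast" and n.attrs.get("to") == onnx.TensorProto.BOOL]:
    plugin_out = node.inputs[0]
    # new intermediate tensor: plugin output cast to int32 first ('to': 6)
    mid = gs.Variable(name=plugin_out.name + "_int32", dtype=np.int32)
    graph.nodes.append(gs.Node(op="Cast", name=node.name + "_to_int32",
                               attrs={"to": int(onnx.TensorProto.INT32)},
                               inputs=[plugin_out], outputs=[mid]))
    node.inputs[0] = mid  # the original Cast now does int32 -> bool

graph.cleanup().toposort()
onnx.save(gs.export_onnx(graph), "model-tpat-fixed.onnx")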