TRI-ML / packnet-sfm

TRI-ML Monocular Depth Estimation Repository
https://tri-ml.github.io/packnet-sfm/
MIT License

Failed to convert depth_net model to TensorRT model #52

Closed: XinnWang closed this issue 4 years ago

XinnWang commented 4 years ago

Hi, I noticed that in your CVPR paper you said:

an inference time of 60ms on a Titan V100 GPU, which can be further improved to < 30ms using TensorRT

So I tried to use TensorRT to improve the performance:

  1. Export the ONNX model with the following code (model_wrapper here is packnet-sfm's ModelWrapper loaded from a checkpoint):
    dummy_input = torch.randn(1, 3, 192, 640, device='cuda')
    torch.onnx.export(model_wrapper.model.depth_net, dummy_input,
                      "/my_folder/dump.onnx", opset_version=11)
  2. Use trtexec, the conversion tool shipped with TensorRT, to convert the ONNX model to a TRT engine:
    ./trtexec --onnx=/my_folder/dump.onnx --saveEngine=dump.trt

    However, step 2 failed; the log output is below:

    [07/23/2020-17:14:21] [I] === Model Options ===
    [07/23/2020-17:14:21] [I] Format: ONNX
    [07/23/2020-17:14:21] [I] Model: /data/exp_results/complex_model/dump.onnx
    [07/23/2020-17:14:21] [I] Output:
    [07/23/2020-17:14:21] [I] === Build Options ===
    [07/23/2020-17:14:21] [I] Max batch: 1
    [07/23/2020-17:14:21] [I] Workspace: 16 MB
    [07/23/2020-17:14:21] [I] minTiming: 1
    [07/23/2020-17:14:21] [I] avgTiming: 8
    [07/23/2020-17:14:21] [I] Precision: FP32
    [07/23/2020-17:14:21] [I] Calibration: 
    [07/23/2020-17:14:21] [I] Safe mode: Disabled
    [07/23/2020-17:14:21] [I] Save engine: dump.trt
    [07/23/2020-17:14:21] [I] Load engine: 
    [07/23/2020-17:14:21] [I] Builder Cache: Enabled
    [07/23/2020-17:14:21] [I] NVTX verbosity: 0
    [07/23/2020-17:14:21] [I] Inputs format: fp32:CHW
    [07/23/2020-17:14:21] [I] Outputs format: fp32:CHW
    [07/23/2020-17:14:21] [I] Input build shapes: model
    [07/23/2020-17:14:21] [I] Input calibration shapes: model
    [07/23/2020-17:14:21] [I] === System Options ===
    [07/23/2020-17:14:21] [I] Device: 0
    [07/23/2020-17:14:21] [I] DLACore: 
    [07/23/2020-17:14:21] [I] Plugins:
    [07/23/2020-17:14:21] [I] === Inference Options ===
    [07/23/2020-17:14:21] [I] Batch: 1
    [07/23/2020-17:14:21] [I] Input inference shapes: model
    [07/23/2020-17:14:21] [I] Iterations: 10
    [07/23/2020-17:14:21] [I] Duration: 3s (+ 200ms warm up)
    [07/23/2020-17:14:21] [I] Sleep time: 0ms
    [07/23/2020-17:14:21] [I] Streams: 1
    [07/23/2020-17:14:21] [I] ExposeDMA: Disabled
    [07/23/2020-17:14:21] [I] Spin-wait: Disabled
    [07/23/2020-17:14:21] [I] Multithreading: Disabled
    [07/23/2020-17:14:21] [I] CUDA Graph: Disabled
    [07/23/2020-17:14:21] [I] Skip inference: Disabled
    [07/23/2020-17:14:21] [I] Inputs:
    [07/23/2020-17:14:21] [I] === Reporting Options ===
    [07/23/2020-17:14:21] [I] Verbose: Disabled
    [07/23/2020-17:14:21] [I] Averages: 10 inferences
    [07/23/2020-17:14:21] [I] Percentile: 99
    [07/23/2020-17:14:21] [I] Dump output: Disabled
    [07/23/2020-17:14:21] [I] Profile: Disabled
    [07/23/2020-17:14:21] [I] Export timing to JSON file: 
    [07/23/2020-17:14:21] [I] Export output to JSON file: 
    [07/23/2020-17:14:21] [I] Export profile to JSON file: 
    [07/23/2020-17:14:21] [I] 
    ----------------------------------------------------------------
    Input filename:   /data/exp_results/complex_model/dump.onnx
    ONNX IR version:  0.0.4
    Opset version:    11
    Producer name:    pytorch
    Producer version: 1.3
    Domain:           
    Model version:    0
    Doc string:       
    ----------------------------------------------------------------
    [07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    [07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
    ERROR: builtin_op_importers.cpp:2179 In function importPad:
    [8] Assertion failed: inputs.at(1).is_weights()
    [07/23/2020-17:14:23] [E] Failed to parse onnx file
    [07/23/2020-17:14:23] [E] Parsing model failed
    [07/23/2020-17:14:23] [E] Engine creation failed
    [07/23/2020-17:14:23] [E] Engine set up failed

    I tried both the DepthResNet and PackNet01 models; both failed with this error:

    ERROR: builtin_op_importers.cpp:2179 In function importPad:
    [8] Assertion failed: inputs.at(1).is_weights()

    Have you encountered this error before? Any suggestions? Thanks.
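The importPad assertion here typically indicates that the Pad node's padding amounts reach the TensorRT ONNX parser as a computed tensor rather than as constant weights: with opset 11, PyTorch exports pad sizes as a second input to the Pad op, and the TensorRT 7 parser requires that input to be an initializer. A minimal sketch of one possible workaround, assuming the pad amounts are in fact constant for a fixed input shape, is to constant-fold the exported graph with onnx-graphsurgeon (fold_constants evaluates constant subgraphs with onnxruntime, so both packages must be installed; dump_folded.onnx is a hypothetical output name):

    import onnx
    import onnx_graphsurgeon as gs

    # Fold constant subgraphs so Pad amounts that are computed but
    # constant in practice become initializers, which the TensorRT
    # ONNX parser accepts as weights.
    graph = gs.import_onnx(onnx.load("/my_folder/dump.onnx"))
    graph.fold_constants().cleanup().toposort()
    onnx.save(gs.export_onnx(graph), "/my_folder/dump_folded.onnx")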

VitorGuizilini-TRI commented 4 years ago

The NVIDIA website has an example of how to convert PackNet to TensorRT; you can have a look here:

https://docs.nvidia.com/deeplearning/tensorrt/sample-support-guide/index.html#onnx_packnet
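Once the ONNX file parses cleanly, the engine-build step that trtexec performs can also be done from Python. Below is a minimal sketch using the TensorRT 7 Python API (build_engine is a hypothetical helper written for illustration, not code from the NVIDIA sample):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    def build_engine(onnx_path, workspace_gb=1):
        """Parse an ONNX file and build a TensorRT engine (TensorRT 7 API)."""
        # ONNX models require an explicit-batch network definition.
        flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
        builder = trt.Builder(TRT_LOGGER)
        network = builder.create_network(flags)
        parser = trt.OnnxParser(network, TRT_LOGGER)
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                raise RuntimeError("ONNX parsing failed")
        config = builder.create_builder_config()
        config.max_workspace_size = workspace_gb << 30  # bytes
        return builder.build_engine(network, config)

    engine = build_engine("/my_folder/dump.onnx")
    with open("dump.trt", "wb") as f:
        f.write(engine.serialize())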

XinnWang commented 4 years ago

Thanks!

XinnWang commented 4 years ago

Hi, I tried the example and it works well. I am now trying to increase inference speed with INT8 mode, but the INT8 results are wrong. Have you ever tried INT8 inference? Any suggestions? Thanks.
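Wrong INT8 outputs are commonly a calibration problem: building with --int8 but no calibration data leaves the engine with placeholder dynamic ranges, so it runs but produces meaningless results. A minimal calibrator sketch with the TensorRT Python API is below; load_batches and calib.cache are hypothetical placeholders, and the calibration frames should use the same preprocessing as evaluation:

    import numpy as np
    import pycuda.autoinit  # creates a CUDA context on import
    import pycuda.driver as cuda
    import tensorrt as trt

    class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
        """Feeds real preprocessed frames so TensorRT can measure activation ranges."""

        def __init__(self, batches, cache_file="calib.cache"):
            super().__init__()
            self.batches = batches  # iterator of (1, 3, 192, 640) float32 arrays
            self.cache_file = cache_file
            self.device_input = cuda.mem_alloc(1 * 3 * 192 * 640 * 4)

        def get_batch_size(self):
            return 1

        def get_batch(self, names):
            batch = next(self.batches, None)
            if batch is None:
                return None  # no more data: calibration is finished
            cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
            return [int(self.device_input)]

        def read_calibration_cache(self):
            try:
                with open(self.cache_file, "rb") as f:
                    return f.read()
            except FileNotFoundError:
                return None

        def write_calibration_cache(self, cache):
            with open(self.cache_file, "wb") as f:
                f.write(cache)

    # When building: enable INT8 and attach the calibrator to the config.
    # config.set_flag(trt.BuilderFlag.INT8)
    # config.int8_calibrator = EntropyCalibrator(load_batches())  # load_batches is hypothetical

Once a calibration cache has been written this way, trtexec can reuse it via --int8 --calib=calib.cache.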

VitorGuizilini-TRI commented 4 years ago

We haven't tried INT8 inference yet, so unfortunately I cannot help you with that. I will leave this issue open for now; maybe someone else can provide more information.