TRI-ML / packnet-sfm

TRI-ML Monocular Depth Estimation Repository
https://tri-ml.github.io/packnet-sfm/
MIT License

Failed to convert depth_net model to TensorRT model #52

Closed: XinnWang closed this issue 4 years ago

XinnWang commented 4 years ago

Hi, I noticed that in your CVPR paper you said:

an inference time of 60ms on a Titan V100 GPU, which can be further improved to < 30ms using TensorRT

So I tried to use TensorRT to improve the performance:

  1. Export the ONNX model with the following code (model_wrapper here is packnet-sfm's ModelWrapper loaded from a checkpoint):
    dummy_input = torch.randn(1, 3, 192, 640, device='cuda')
    torch.onnx.export(model_wrapper.model.depth_net, dummy_input,
                      "/my_folder/dump.onnx", opset_version=11)
  2. Use trtexec, the conversion tool shipped with TensorRT, to convert the ONNX model to a TRT engine:
    ./trtexec --onnx=/my_folder/dump.onnx --saveEngine=dump.trt

    However, step 2 failed; the log output is below:

    [07/23/2020-17:14:21] [I] === Model Options ===
    [07/23/2020-17:14:21] [I] Format: ONNX
    [07/23/2020-17:14:21] [I] Model: /data/exp_results/complex_model/dump.onnx
    [07/23/2020-17:14:21] [I] Output:
    [07/23/2020-17:14:21] [I] === Build Options ===
    [07/23/2020-17:14:21] [I] Max batch: 1
    [07/23/2020-17:14:21] [I] Workspace: 16 MB
    [07/23/2020-17:14:21] [I] minTiming: 1
    [07/23/2020-17:14:21] [I] avgTiming: 8
    [07/23/2020-17:14:21] [I] Precision: FP32
    [07/23/2020-17:14:21] [I] Calibration: 
    [07/23/2020-17:14:21] [I] Safe mode: Disabled
    [07/23/2020-17:14:21] [I] Save engine: dump.trt
    [07/23/2020-17:14:21] [I] Load engine: 
    [07/23/2020-17:14:21] [I] Builder Cache: Enabled
    [07/23/2020-17:14:21] [I] NVTX verbosity: 0
    [07/23/2020-17:14:21] [I] Inputs format: fp32:CHW
    [07/23/2020-17:14:21] [I] Outputs format: fp32:CHW
    [07/23/2020-17:14:21] [I] Input build shapes: model
    [07/23/2020-17:14:21] [I] Input calibration shapes: model
    [07/23/2020-17:14:21] [I] === System Options ===
    [07/23/2020-17:14:21] [I] Device: 0
    [07/23/2020-17:14:21] [I] DLACore: 
    [07/23/2020-17:14:21] [I] Plugins:
    [07/23/2020-17:14:21] [I] === Inference Options ===
    [07/23/2020-17:14:21] [I] Batch: 1
    [07/23/2020-17:14:21] [I] Input inference shapes: model
    [07/23/2020-17:14:21] [I] Iterations: 10
    [07/23/2020-17:14:21] [I] Duration: 3s (+ 200ms warm up)
    [07/23/2020-17:14:21] [I] Sleep time: 0ms
    [07/23/2020-17:14:21] [I] Streams: 1
    [07/23/2020-17:14:21] [I] ExposeDMA: Disabled
    [07/23/2020-17:14:21] [I] Spin-wait: Disabled
    [07/23/2020-17:14:21] [I] Multithreading: Disabled
    [07/23/2020-17:14:21] [I] CUDA Graph: Disabled
    [07/23/2020-17:14:21] [I] Skip inference: Disabled
    [07/23/2020-17:14:21] [I] Inputs:
    [07/23/2020-17:14:21] [I] === Reporting Options ===
    [07/23/2020-17:14:21] [I] Verbose: Disabled
    [07/23/2020-17:14:21] [I] Averages: 10 inferences
    [07/23/2020-17:14:21] [I] Percentile: 99
    [07/23/2020-17:14:21] [I] Dump output: Disabled
    [07/23/2020-17:14:21] [I] Profile: Disabled
    [07/23/2020-17:14:21] [I] Export timing to JSON file: 
    [07/23/2020-17:14:21] [I] Export output to JSON file: 
    [07/23/2020-17:14:21] [I] Export profile to JSON file: 
    [07/23/2020-17:14:21] [I] 
    ----------------------------------------------------------------
    Input filename:   /data/exp_results/complex_model/dump.onnx
    ONNX IR version:  0.0.4
    Opset version:    11
    Producer name:    pytorch
    Producer version: 1.3
    Domain:           
    Model version:    0
    Doc string:       
    ----------------------------------------------------------------
    [07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
    [07/23/2020-17:14:23] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
    ERROR: builtin_op_importers.cpp:2179 In function importPad:
    [8] Assertion failed: inputs.at(1).is_weights()
    [07/23/2020-17:14:23] [E] Failed to parse onnx file
    [07/23/2020-17:14:23] [E] Parsing model failed
    [07/23/2020-17:14:23] [E] Engine creation failed
    [07/23/2020-17:14:23] [E] Engine set up failed

    I tried both the DepthResNet and PackNet01 models; both failed with this error:

    ERROR: builtin_op_importers.cpp:2179 In function importPad:
    [8] Assertion failed: inputs.at(1).is_weights()

    Have you encountered this error before? Any suggestions? Thanks.
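The importPad assertion here typically indicates that the Pad node's padding amounts reach the TensorRT ONNX parser as a computed tensor rather than as constant weights: with opset 11, PyTorch exports pad sizes as a second input to the Pad op, and the TensorRT 7 parser requires that input to be an initializer. A minimal sketch of one possible workaround, assuming the pad amounts are in fact constant for a fixed input shape, is to constant-fold the exported graph with onnx-graphsurgeon (fold_constants evaluates constant subgraphs with onnxruntime, so both packages must be installed; dump_folded.onnx is a hypothetical output name):

    import onnx
    import onnx_graphsurgeon as gs

    # Fold constant subgraphs so Pad amounts that are computed but
    # constant in practice become initializers, which the TensorRT
    # ONNX parser accepts as weights.
    graph = gs.import_onnx(onnx.load("/my_folder/dump.onnx"))
    graph.fold_constants().cleanup().toposort()
    onnx.save(gs.export_onnx(graph), "/my_folder/dump_folded.onnx")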

VitorGuizilini-TRI commented 4 years ago

The NVIDIA website has an example of how to convert PackNet to TensorRT; you can have a look here:

https://docs.nvidia.com/deeplearning/tensorrt/sample-support-guide/index.html#onnx_packnet
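Once the ONNX file parses cleanly, the engine-build step that trtexec performs can also be done from Python. Below is a minimal sketch using the TensorRT 7 Python API (build_engine is a hypothetical helper written for illustration, not code from the NVIDIA sample):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    def build_engine(onnx_path, workspace_gb=1):
        """Parse an ONNX file and build a TensorRT engine (TensorRT 7 API)."""
        # ONNX models require an explicit-batch network definition.
        flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
        builder = trt.Builder(TRT_LOGGER)
        network = builder.create_network(flags)
        parser = trt.OnnxParser(network, TRT_LOGGER)
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                raise RuntimeError("ONNX parsing failed")
        config = builder.create_builder_config()
        config.max_workspace_size = workspace_gb << 30  # bytes
        return builder.build_engine(network, config)

    engine = build_engine("/my_folder/dump.onnx")
    with open("dump.trt", "wb") as f:
        f.write(engine.serialize())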

XinnWang commented 4 years ago

Thanks!

XinnWang commented 4 years ago

Hi, I tried the example and it works well. I am now trying to increase inference speed with INT8 mode, but the INT8 results are wrong. Have you ever tried INT8 inference? Any suggestions? Thanks.
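Wrong INT8 outputs are commonly a calibration problem: building with --int8 but no calibration data leaves the engine with placeholder dynamic ranges, so it runs but produces meaningless results. A minimal calibrator sketch with the TensorRT Python API is below; load_batches and calib.cache are hypothetical placeholders, and the calibration frames should use the same preprocessing as evaluation:

    import numpy as np
    import pycuda.autoinit  # creates a CUDA context on import
    import pycuda.driver as cuda
    import tensorrt as trt

    class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
        """Feeds real preprocessed frames so TensorRT can measure activation ranges."""

        def __init__(self, batches, cache_file="calib.cache"):
            super().__init__()
            self.batches = batches  # iterator of (1, 3, 192, 640) float32 arrays
            self.cache_file = cache_file
            self.device_input = cuda.mem_alloc(1 * 3 * 192 * 640 * 4)

        def get_batch_size(self):
            return 1

        def get_batch(self, names):
            batch = next(self.batches, None)
            if batch is None:
                return None  # no more data: calibration is finished
            cuda.memcpy_htod(self.device_input, np.ascontiguousarray(batch))
            return [int(self.device_input)]

        def read_calibration_cache(self):
            try:
                with open(self.cache_file, "rb") as f:
                    return f.read()
            except FileNotFoundError:
                return None

        def write_calibration_cache(self, cache):
            with open(self.cache_file, "wb") as f:
                f.write(cache)

    # When building: enable INT8 and attach the calibrator to the config.
    # config.set_flag(trt.BuilderFlag.INT8)
    # config.int8_calibrator = EntropyCalibrator(load_batches())  # load_batches is hypothetical

Once a calibration cache has been written this way, trtexec can reuse it via --int8 --calib=calib.cache.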

VitorGuizilini-TRI commented 4 years ago

We haven't tried INT8 inference yet, so unfortunately I cannot help you with that. I will leave this issue open for now; maybe someone else can provide more information.