NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter
MIT License

converting .pth to .trt engine, do inference in C++, input and output names not matched #405

Open · cam401 opened this issue 4 years ago

cam401 commented 4 years ago

First of all, thanks for this high-quality project.

I ran inference with the Python API, going from the .pth model to a .trt engine. The speed-up is very impressive.

However, I also need to do the same inference with the C++ API, because the pre-processing and post-processing in Python are not ideal.

I converted the .pth file to a .trt engine file, which was loaded (parsed) by the C++ API successfully (I suppose).

However, when running inference, the code gives an error of "could not find binding of given name" for the names I had defined as input and output.

I suppose the input and output names have to be specified based on the computational graph as well (in Python, this is not necessary).

Now, I wonder how I can find out the names of the input and output nodes. (For other model formats, Netron can be used to visualise and check input and output names.)

Thanks, and I look forward to your support.

jaybdub commented 4 years ago

Hi cam401,

Thanks for reaching out!

By default, the input/output names are given as input_0, input_1 and output_0, output_1, output_2, etc. So for a single input, single output model these would be input_0 and output_0.

Just to check, you can use the TensorRT Python API:

# model_trt is the TRTModule returned by torch2trt
engine = model_trt.engine

# Print each binding's index, whether it is an input, and its name.
for idx in range(engine.num_bindings):
    is_input = engine.binding_is_input(idx)
    name = engine.get_binding_name(idx)
    print(idx, is_input, name)
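
On the C++ side, the same lookup can be done by name with the binding-based API of that TensorRT generation. A minimal sketch (not from this thread), assuming the default torch2trt names input_0 and output_0; getBindingIndex returns -1 if a name does not exist:

#include <NvInfer.h>
#include <iostream>

// List all bindings of a deserialized engine and look up the default
// torch2trt binding names by string.
void printBindings(nvinfer1::ICudaEngine* engine)
{
    for (int i = 0; i < engine->getNbBindings(); ++i)
    {
        std::cout << i << " "
                  << (engine->bindingIsInput(i) ? "input" : "output") << " "
                  << engine->getBindingName(i) << std::endl;
    }

    // Default names for a single-input, single-output torch2trt model.
    int inputIndex  = engine->getBindingIndex("input_0");
    int outputIndex = engine->getBindingIndex("output_0");
    std::cout << "input_0 -> " << inputIndex
              << ", output_0 -> " << outputIndex << std::endl;
}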

Please let me know if this helps or you run into issues.

Best, John

cam401 commented 4 years ago

Hi John,

Thanks for your prompt response.

That helped me solve the problem, and I can now run inference in C++ from the .trt engine file.

> However, I got the following warnings:
>
> [TRT] Parameter check failed at: engine.cpp::executeV2::811, condition: !mEngine.hasImplicitBatchDimension()
>
> and also all inference results are zeros, which are obviously not correct. I will look into this further. At least the code can run through.

This has been addressed.
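
For reference, that executeV2 check typically fires because torch2trt (at the time of this issue) built the engine with an implicit batch dimension, so the C++ side has to use the batched execute/enqueue call rather than executeV2/enqueueV2. A minimal sketch under that assumption (not the actual fix from this thread):

#include <NvInfer.h>
#include <cuda_runtime_api.h>

// For an implicit-batch engine, pass the batch size explicitly via enqueue();
// enqueueV2()/executeV2() are only valid for explicit-batch engines and
// trigger the hasImplicitBatchDimension check.
bool runInference(nvinfer1::IExecutionContext* context,
                  void** deviceBindings,   // device pointers ordered by binding index
                  cudaStream_t stream)
{
    const int batchSize = 1;
    return context->enqueue(batchSize, deviceBindings, stream, nullptr);
}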

However, inference via the C++ API seems to be much slower than via the Python API (~5 times slower, batch size = 1).
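
A common cause of such a gap is a timing loop that counts engine warm-up or host/device copies, or that reads the clock without synchronizing the CUDA stream. A minimal measurement sketch (hypothetical, not the code from this thread):

#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <chrono>
#include <iostream>

// Warm up first, then time repeated enqueue() calls and synchronize the
// stream before stopping the clock so the GPU work is actually counted.
void benchmark(nvinfer1::IExecutionContext* context,
               void** deviceBindings, cudaStream_t stream)
{
    const int batchSize = 1;

    // Warm-up runs: the first executions include lazy initialization.
    for (int i = 0; i < 10; ++i)
        context->enqueue(batchSize, deviceBindings, stream, nullptr);
    cudaStreamSynchronize(stream);

    const int iters = 100;
    auto start = std::chrono::high_resolution_clock::now();
    for (int i = 0; i < iters; ++i)
        context->enqueue(batchSize, deviceBindings, stream, nullptr);
    cudaStreamSynchronize(stream);  // wait for all queued GPU work
    auto stop = std::chrono::high_resolution_clock::now();

    double ms = std::chrono::duration<double, std::milli>(stop - start).count() / iters;
    std::cout << "average latency: " << ms << " ms" << std::endl;
}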

Best,

c

maiminh1996 commented 3 years ago

@cam401 Can you share your C++ inference code?

Zhangppppp commented 2 years ago

@maiminh1996 Can you share your C++ inference code? Thanks.

sunshinesjw1 commented 2 years ago

@cam401 Hello, regarding "the inference speed via C++ API seems to be much slower than via Python API (~5 times slower, batch size = 1)": have you solved this? Can you give me some advice for this situation? Thanks, and I look forward to your reply.