NobuoTsukamoto / tensorrt-examples

TensorRT Examples (TensorRT, Jetson Nano, Python, C++)

Cmake building for wrong TRT version #3

Closed jgocm closed 2 years ago

jgocm commented 2 years ago

Hi,

I'm trying to reproduce the exact same steps from your Object Detection tutorial on Jetson Nano.

I've started from a fresh Jetpack 4.5.1 installation and executed the commands:

sudo apt remove cmake
sudo snap install cmake --classic
sudo reboot

cd ~
git clone https://github.com/NobuoTsukamoto/tensorrt-examples
cd ./tensorrt-examples
git submodule update --init --recursive

export TRT_LIBPATH=`pwd`/TensorRT
export PATH=${PATH}:/usr/local/cuda/bin
cd $TRT_LIBPATH
mkdir -p build && cd build

Until now, everything works fine. But when I execute the cmake command:

cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2 -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/usr/bin/gcc

It tells me TensorRT is being built for version 8.2.0:

Building for TensorRT version: 8.2.0, library version: 8

Just to double-check, I ran dpkg -l and saw that my installed TensorRT version is actually 7.1.3. Even so, cmake completes with no errors and the build files get written successfully. So I tried running make:

make -j3

And got the following error:

[  2%] Built target third_party.protobuf
[  3%] Built target third_party.protobuf_aarch64
[  4%] Built target gen_onnx_proto
[  4%] Built target caffe_proto
[  5%] Running gen_proto.py on onnx/onnx-data.in.proto
Consolidate compiler generated dependencies of target nvinfer_plugin
Consolidate compiler generated dependencies of target nvinfer_plugin_static
[  5%] Building CXX object plugin/CMakeFiles/nvinfer_plugin.dir/tfliteNMSPlugin/tfliteNMSPlugin.cpp.o
[  6%] Building CXX object plugin/CMakeFiles/nvinfer_plugin_static.dir/tfliteNMSPlugin/tfliteNMSPlugin.cpp.o
Processing /home/joao/tensorrt-examples/TensorRT/parsers/onnx/third_party/onnx/onnx/onnx-data.in.proto
Writing /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-data_onnx2trt_onnx.proto
Writing /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-data_onnx2trt_onnx.proto3
Writing /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-data.pb.h
generating /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx_data_pb.py
[  6%] Running gen_proto.py on onnx/onnx-operators.in.proto
Processing /home/joao/tensorrt-examples/TensorRT/parsers/onnx/third_party/onnx/onnx/onnx-operators.in.proto
Writing /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-operators_onnx2trt_onnx-ml.proto
Writing /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-operators_onnx2trt_onnx-ml.proto3
Writing /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-operators-ml.pb.h
generating /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx_operators_pb.py
[  6%] Running C++ protocol buffer compiler on /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-data_onnx2trt_onnx.proto
[  6%] Running C++ protocol buffer compiler on /home/joao/tensorrt-examples/TensorRT/build/parsers/onnx/third_party/onnx/onnx/onnx-operators_onnx2trt_onnx-ml.proto
[  6%] Building CXX object parsers/onnx/third_party/onnx/CMakeFiles/onnx_proto.dir/onnx/onnx_onnx2trt_onnx-ml.pb.cc.o
plugin/CMakeFiles/nvinfer_plugin.dir/build.make:439: recipe for target 'plugin/CMakeFiles/nvinfer_plugin.dir/tfliteNMSPlugin/tfliteNMSPlugin.cpp.o' failed
CMakeFiles/Makefile2:1361: recipe for target 'plugin/CMakeFiles/nvinfer_plugin.dir/all' failed
[  7%] Building CXX object plugin/CMakeFiles/nvinfer_plugin_static.dir/bertQKVToContextPlugin/fused_multihead_attention/src/fused_multihead_attention_fp16_64_64_kernel.sm75.cpp.o
plugin/CMakeFiles/nvinfer_plugin_static.dir/build.make:439: recipe for target 'plugin/CMakeFiles/nvinfer_plugin_static.dir/tfliteNMSPlugin/tfliteNMSPlugin.cpp.o' failed
[  7%] Building CXX object parsers/onnx/third_party/onnx/CMakeFiles/onnx_proto.dir/onnx/onnx-operators_onnx2trt_onnx-ml.pb.cc.o
CMakeFiles/Makefile2:1387: recipe for target 'plugin/CMakeFiles/nvinfer_plugin_static.dir/all' failed
[  7%] Building CXX object parsers/onnx/third_party/onnx/CMakeFiles/onnx_proto.dir/onnx/onnx-data_onnx2trt_onnx.pb.cc.o
[  7%] Linking CXX static library libonnx_proto.a
[  8%] Built target onnx_proto
Makefile:155: recipe for target 'all' failed

Would you have any other specific instructions for reproducing your object detection examples? I need to run TensorRT-optimized object detection inference on TensorFlow models, and this has been the best guide I have found so far.

Thank you in advance!

NobuoTsukamoto commented 2 years ago

@jgocm

Thank you for reporting the problem.

I confirmed that the build error occurs. I will modify the code; please wait.

NobuoTsukamoto commented 2 years ago

Fixed in commit 16b4342895038b6bf2a6c1aa6adcf37722614136. Thank you for reporting the problem.

jgocm commented 2 years ago

Thank you for the support and the fast responses!

I was able to build TensorRT after the changes, but I think it is still building the wrong TRT version (8.2.0).

After running make, these are the files generated in my TensorRT/build/out directory:

libnvcaffeparser.so
libnvcaffeparser.so.8
libnvcaffeparser.so.8.2.0
libnvcaffeparser_static.a
libnvinfer_plugin.so
libnvinfer_plugin.so.8
libnvinfer_plugin.so.8.2.0
libnvinfer_plugin_static.a
libnvonnxparser.so
libnvonnxparser.so.8
libnvonnxparser.so.8.2.0
output.txt
sample_algorithm_selector
sample_char_rnn
sample_dynamic_reshape
sample_fasterRCNN
sample_googlenet
sample_int8
sample_int8_api
sample_mnist
sample_mnist_api
sample_nmt
sample_onnx_mnist
sample_onnx_mnist_coord_conv_ac
sample_reformat_free_io
sample_ssd
sample_uff_fasterRCNN
sample_uff_maskRCNN
sample_uff_mnist
sample_uff_plugin_v2_ext
sample_uff_ssd
trtexec

Still, I tried changing just the file names to match the ones suggested in the repo:

sudo cp out/libnvinfer_plugin.so.7.2.3 /usr/lib/aarch64-linux-gnu/
sudo rm /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7
sudo ln -s /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.2.3 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7

So now I have:

libnvcaffeparser.so
libnvcaffeparser.so.7
libnvcaffeparser.so.7.2.3
libnvcaffeparser_static.a
libnvinfer_plugin.so
libnvinfer_plugin.so.7
libnvinfer_plugin.so.7.2.3
libnvinfer_plugin_static.a
libnvonnxparser.so
libnvonnxparser.so.7
libnvonnxparser.so.7.2.3
output.txt
sample_algorithm_selector
sample_char_rnn
sample_dynamic_reshape
sample_fasterRCNN
sample_googlenet
sample_int8
sample_int8_api
sample_mnist
sample_mnist_api
sample_nmt
sample_onnx_mnist
sample_onnx_mnist_coord_conv_ac
sample_reformat_free_io
sample_ssd
sample_uff_fasterRCNN
sample_uff_maskRCNN
sample_uff_mnist
sample_uff_plugin_v2_ext
sample_uff_ssd
trtexec

Then, I copied my ssdlite_mobilenet_v2_300x300_gs.onnx model to the tensorrt-examples/models directory and tried to check the model:

/usr/src/tensorrt/bin/trtexec --onnx=/home/joao/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_gs.onnx

The output was:

&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=/home/joao/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_gs.onnx
[03/08/2022-12:12:52] [I] === Model Options ===
[03/08/2022-12:12:52] [I] Format: ONNX
[03/08/2022-12:12:52] [I] Model: /home/joao/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_gs.onnx
[03/08/2022-12:12:52] [I] Output:
[03/08/2022-12:12:52] [I] === Build Options ===
[03/08/2022-12:12:52] [I] Max batch: 1
[03/08/2022-12:12:52] [I] Workspace: 16 MB
[03/08/2022-12:12:52] [I] minTiming: 1
[03/08/2022-12:12:52] [I] avgTiming: 8
[03/08/2022-12:12:52] [I] Precision: FP32
[03/08/2022-12:12:52] [I] Calibration: 
[03/08/2022-12:12:52] [I] Safe mode: Disabled
[03/08/2022-12:12:52] [I] Save engine: 
[03/08/2022-12:12:52] [I] Load engine: 
[03/08/2022-12:12:52] [I] Builder Cache: Enabled
[03/08/2022-12:12:52] [I] NVTX verbosity: 0
[03/08/2022-12:12:52] [I] Inputs format: fp32:CHW
[03/08/2022-12:12:52] [I] Outputs format: fp32:CHW
[03/08/2022-12:12:52] [I] Input build shapes: model
[03/08/2022-12:12:52] [I] Input calibration shapes: model
[03/08/2022-12:12:52] [I] === System Options ===
[03/08/2022-12:12:52] [I] Device: 0
[03/08/2022-12:12:52] [I] DLACore: 
[03/08/2022-12:12:52] [I] Plugins:
[03/08/2022-12:12:52] [I] === Inference Options ===
[03/08/2022-12:12:52] [I] Batch: 1
[03/08/2022-12:12:52] [I] Input inference shapes: model
[03/08/2022-12:12:52] [I] Iterations: 10
[03/08/2022-12:12:52] [I] Duration: 3s (+ 200ms warm up)
[03/08/2022-12:12:52] [I] Sleep time: 0ms
[03/08/2022-12:12:52] [I] Streams: 1
[03/08/2022-12:12:52] [I] ExposeDMA: Disabled
[03/08/2022-12:12:52] [I] Spin-wait: Disabled
[03/08/2022-12:12:52] [I] Multithreading: Disabled
[03/08/2022-12:12:52] [I] CUDA Graph: Disabled
[03/08/2022-12:12:52] [I] Skip inference: Disabled
[03/08/2022-12:12:52] [I] Inputs:
[03/08/2022-12:12:52] [I] === Reporting Options ===
[03/08/2022-12:12:52] [I] Verbose: Disabled
[03/08/2022-12:12:52] [I] Averages: 10 inferences
[03/08/2022-12:12:52] [I] Percentile: 99
[03/08/2022-12:12:52] [I] Dump output: Disabled
[03/08/2022-12:12:52] [I] Profile: Disabled
[03/08/2022-12:12:52] [I] Export timing to JSON file: 
[03/08/2022-12:12:52] [I] Export output to JSON file: 
[03/08/2022-12:12:52] [I] Export profile to JSON file: 
[03/08/2022-12:12:52] [I] 
----------------------------------------------------------------
Input filename:   /home/joao/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_gs.onnx
ONNX IR version:  0.0.8
Opset version:    11
Producer name:    
Producer version: 
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[03/08/2022-12:12:54] [I] [TRT] ModelImporter.cpp:135: No importer registered for op: TFLiteNMS_TRT. Attempting to import as plugin.
[03/08/2022-12:12:54] [I] [TRT] builtin_op_importers.cpp:3659: Searching for plugin: TFLiteNMS_TRT, plugin_version: 1, plugin_namespace: 
[03/08/2022-12:12:54] [I] [TRT] builtin_op_importers.cpp:3676: Successfully created plugin: TFLiteNMS_TRT
[03/08/2022-12:12:54] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[F] [TRT] Assertion failed: inputs[1].nbDims == 2 || (inputs[1].nbDims == 3 && inputs[1].d[2] == 1)
/home/joao/tensorrt-examples/TensorRT/plugin/tfliteNMSPlugin/tfliteNMSPlugin.cpp:75
Aborting...

Aborted (core dumped)
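To see which tensors feed the custom NMS node (and compare their ranks with what the plugin asserts), the exported graph can be inspected with the onnx package. This is only a diagnostic sketch, not part of the repo; the model path is the one from the trtexec command above.

# Diagnostic sketch (not from the repo): list the inputs of the custom
# TFLiteNMS_TRT node and any shapes declared in the exported ONNX graph.
import onnx

model = onnx.load("/home/joao/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_gs.onnx")

# Collect declared shapes for graph inputs, outputs, and value_info entries.
shapes = {}
for vi in list(model.graph.input) + list(model.graph.output) + list(model.graph.value_info):
    dims = [d.dim_value if d.HasField("dim_value") else "?" for d in vi.type.tensor_type.shape.dim]
    shapes[vi.name] = dims

for node in model.graph.node:
    if node.op_type == "TFLiteNMS_TRT":
        print("NMS node:", node.name)
        for name in node.input:
            print("  input:", name, "declared shape:", shapes.get(name, "unknown"))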

The model was generated from the Add_TFLiteNMS_Plugin notebook on a host PC.

With that, I'm still not able to reproduce your project. Do you have any hints on how to solve this?

Thank you again!

NobuoTsukamoto commented 2 years ago

I was able to build TensorRT after the changes, but I think it is still building the wrong TRT version (8.2.0).

I am sorry. The source currently uploaded is compatible with JetPack 4.6 or later (TensorRT 8). (I updated it to TensorRT 8, and it works fine with JetPack 4.6.)

I think that for the original JetPack 4.5.1 (TensorRT 7), it needs to be built from commit b6618342c9881460626e140e603bf3ca12803082. Since my environment is JetPack 4.6, I will downgrade to JetPack 4.5.1 and check it.

cd tensorrt-examples/TensorRT
git checkout b6618342c9881460626e140e603bf3ca12803082
...
NobuoTsukamoto commented 2 years ago

For JetPack 4.5.1, please build according to the following procedure. Check out the specified revision in the tensorrt-examples/TensorRT submodule.

git clone https://github.com/NobuoTsukamoto/tensorrt-examples
cd tensorrt-examples/
git submodule update --init --recursive
export TRT_LIBPATH=`pwd`/TensorRT
export PATH=${PATH}:/usr/local/cuda/bin
cd $TRT_LIBPATH

# Note: For JetPack 4.5.1, the ONNX revision also needs to be changed with the submodule update.
git checkout b6618342c9881460626e140e603bf3ca12803082
git submodule update

mkdir -p build && cd build
cmake .. -DTRT_LIB_DIR=$TRT_LIBPATH -DTRT_OUT_DIR=`pwd`/out -DTRT_PLATFORM_ID=aarch64 -DCUDA_VERSION=10.2 -DCMAKE_BUILD_TYPE=Release -DCMAKE_C_COMPILER=/usr/bin/gcc
make -j3
sudo cp out/libnvinfer_plugin.so.7.2.3 /usr/lib/aarch64-linux-gnu/
sudo rm /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7
sudo ln -s /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7.2.3 /usr/lib/aarch64-linux-gnu/libnvinfer_plugin.so.7
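
As a sanity check after replacing the library, you can confirm from Python that the runtime still reports TensorRT 7 and that the TFLiteNMS_TRT creator is registered. This is only a minimal sketch using the standard TensorRT Python API; it is not part of the repo.

# Sanity-check sketch: confirm the runtime TensorRT version and that the
# rebuilt libnvinfer_plugin exposes the TFLiteNMS_TRT creator.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")          # loads creators from libnvinfer_plugin.so.7

print("TensorRT runtime version:", trt.__version__)
creators = trt.get_plugin_registry().plugin_creator_list
print("TFLiteNMS_TRT registered:",
      any(c.name == "TFLiteNMS_TRT" for c in creators))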

The result of trtexec.

/usr/src/tensorrt/bin/trtexec --onnx=/home/jetson/ssdlite_mobilenet_v2_300x300_gs.onnx 
&&&& RUNNING TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=/home/jetson/ssdlite_mobilenet_v2_300x300_gs.onnx
[03/09/2022-18:20:08] [I] === Model Options ===
[03/09/2022-18:20:08] [I] Format: ONNX
[03/09/2022-18:20:08] [I] Model: /home/jetson/ssdlite_mobilenet_v2_300x300_gs.onnx
[03/09/2022-18:20:08] [I] Output:
[03/09/2022-18:20:08] [I] === Build Options ===
[03/09/2022-18:20:08] [I] Max batch: 1
[03/09/2022-18:20:08] [I] Workspace: 16 MB
[03/09/2022-18:20:08] [I] minTiming: 1
[03/09/2022-18:20:08] [I] avgTiming: 8
[03/09/2022-18:20:08] [I] Precision: FP32
[03/09/2022-18:20:08] [I] Calibration: 
[03/09/2022-18:20:08] [I] Safe mode: Disabled
[03/09/2022-18:20:08] [I] Save engine: 
[03/09/2022-18:20:08] [I] Load engine: 
[03/09/2022-18:20:08] [I] Builder Cache: Enabled
[03/09/2022-18:20:08] [I] NVTX verbosity: 0
[03/09/2022-18:20:08] [I] Inputs format: fp32:CHW
[03/09/2022-18:20:08] [I] Outputs format: fp32:CHW
[03/09/2022-18:20:08] [I] Input build shapes: model
[03/09/2022-18:20:08] [I] Input calibration shapes: model
[03/09/2022-18:20:08] [I] === System Options ===
[03/09/2022-18:20:08] [I] Device: 0
[03/09/2022-18:20:08] [I] DLACore: 
[03/09/2022-18:20:08] [I] Plugins:
[03/09/2022-18:20:08] [I] === Inference Options ===
[03/09/2022-18:20:08] [I] Batch: 1
[03/09/2022-18:20:08] [I] Input inference shapes: model
[03/09/2022-18:20:08] [I] Iterations: 10
[03/09/2022-18:20:08] [I] Duration: 3s (+ 200ms warm up)
[03/09/2022-18:20:08] [I] Sleep time: 0ms
[03/09/2022-18:20:08] [I] Streams: 1
[03/09/2022-18:20:08] [I] ExposeDMA: Disabled
[03/09/2022-18:20:08] [I] Spin-wait: Disabled
[03/09/2022-18:20:08] [I] Multithreading: Disabled
[03/09/2022-18:20:08] [I] CUDA Graph: Disabled
[03/09/2022-18:20:08] [I] Skip inference: Disabled
[03/09/2022-18:20:08] [I] Inputs:
[03/09/2022-18:20:08] [I] === Reporting Options ===
[03/09/2022-18:20:08] [I] Verbose: Disabled
[03/09/2022-18:20:08] [I] Averages: 10 inferences
[03/09/2022-18:20:08] [I] Percentile: 99
[03/09/2022-18:20:08] [I] Dump output: Disabled
[03/09/2022-18:20:08] [I] Profile: Disabled
[03/09/2022-18:20:08] [I] Export timing to JSON file: 
[03/09/2022-18:20:08] [I] Export output to JSON file: 
[03/09/2022-18:20:08] [I] Export profile to JSON file: 
[03/09/2022-18:20:08] [I] 
----------------------------------------------------------------
Input filename:   /home/jetson/ssdlite_mobilenet_v2_300x300_gs.onnx
ONNX IR version:  0.0.8
Opset version:    11
Producer name:    
Producer version: 
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[03/09/2022-18:20:10] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[03/09/2022-18:20:10] [I] [TRT] ModelImporter.cpp:135: No importer registered for op: TFLiteNMS_TRT. Attempting to import as plugin.
[03/09/2022-18:20:10] [I] [TRT] builtin_op_importers.cpp:3659: Searching for plugin: TFLiteNMS_TRT, plugin_version: 1, plugin_namespace: 
[03/09/2022-18:20:10] [I] [TRT] builtin_op_importers.cpp:3676: Successfully created plugin: TFLiteNMS_TRT
[03/09/2022-18:21:59] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[03/09/2022-18:23:20] [I] [TRT] Detected 1 inputs and 4 output network tensors.
[03/09/2022-18:23:20] [I] Starting inference threads
[03/09/2022-18:23:23] [I] Warmup completed 6 queries over 200 ms
[03/09/2022-18:23:23] [I] Timing trace has 87 queries over 3.08086 s
[03/09/2022-18:23:23] [I] Trace averages of 10 runs:
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.3973 ms - Host latency: 35.5135 ms (end to end 35.5265 ms, enqueue 5.0635 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.1875 ms - Host latency: 35.3037 ms (end to end 35.3164 ms, enqueue 4.99312 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.3314 ms - Host latency: 35.4483 ms (end to end 35.4614 ms, enqueue 5.04764 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.2505 ms - Host latency: 35.3669 ms (end to end 35.3799 ms, enqueue 5.15684 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.2581 ms - Host latency: 35.3744 ms (end to end 35.3873 ms, enqueue 5.13035 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.283 ms - Host latency: 35.3995 ms (end to end 35.4123 ms, enqueue 5.1054 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.2608 ms - Host latency: 35.3769 ms (end to end 35.3898 ms, enqueue 5.0103 ms)
[03/09/2022-18:23:23] [I] Average on 10 runs - GPU latency: 35.3403 ms - Host latency: 35.4581 ms (end to end 35.4708 ms, enqueue 5.02485 ms)
[03/09/2022-18:23:23] [I] Host Latency
[03/09/2022-18:23:23] [I] min: 35.1873 ms (end to end 35.1951 ms)
[03/09/2022-18:23:23] [I] max: 37.6387 ms (end to end 37.6518 ms)
[03/09/2022-18:23:23] [I] mean: 35.3987 ms (end to end 35.4116 ms)
[03/09/2022-18:23:23] [I] median: 35.3347 ms (end to end 35.3474 ms)
[03/09/2022-18:23:23] [I] percentile: 37.6387 ms at 99% (end to end 37.6518 ms at 99%)
[03/09/2022-18:23:23] [I] throughput: 28.2389 qps
[03/09/2022-18:23:23] [I] walltime: 3.08086 s
[03/09/2022-18:23:23] [I] Enqueue Time
[03/09/2022-18:23:23] [I] min: 4.72998 ms
[03/09/2022-18:23:23] [I] max: 6.03662 ms
[03/09/2022-18:23:23] [I] median: 4.99207 ms
[03/09/2022-18:23:23] [I] GPU Compute
[03/09/2022-18:23:23] [I] min: 35.0728 ms
[03/09/2022-18:23:23] [I] max: 37.5224 ms
[03/09/2022-18:23:23] [I] mean: 35.2822 ms
[03/09/2022-18:23:23] [I] median: 35.2192 ms
[03/09/2022-18:23:23] [I] percentile: 37.5224 ms at 99%
[03/09/2022-18:23:23] [I] total compute time: 3.06956 s
&&&& PASSED TensorRT.trtexec # /usr/src/tensorrt/bin/trtexec --onnx=/home/jetson/ssdlite_mobilenet_v2_300x300_gs.onnx
jgocm commented 2 years ago

Nice! I followed your latest instructions for JetPack 4.5.1 and trtexec worked, giving the same console output as yours!

Then, the pycuda installation and model conversion also worked with no issues, using the commands:

sudo apt install python3-dev
pip3 install --global-option=build_ext --global-option="-I/usr/local/cuda/include" --global-option="-L/usr/local/cuda/lib64" pycuda
cd ~/tensorrt-examples/python/detection/
python3 convert_onnxgs2trt.py \
     --model /home/jetson/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_gs.onnx \
     --output /home/jetson/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_fp16.trt \
     --fp16
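
(For reference, the conversion step essentially parses the ONNX graph and builds an FP16 engine with the TensorRT 7 Python API. The snippet below is only a rough sketch of that pattern, not the actual contents of convert_onnxgs2trt.py; the file names and workspace size are illustrative.)

# Rough sketch of an ONNX -> TensorRT FP16 conversion (TensorRT 7 Python API).
# This is not the repo's convert_onnxgs2trt.py, just the general pattern it follows.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")      # needed so TFLiteNMS_TRT can be resolved

explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(explicit_batch)
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("ssdlite_mobilenet_v2_300x300_gs.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parsing failed")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 28              # 256 MiB, illustrative
config.set_flag(trt.BuilderFlag.FP16)            # corresponds to the --fp16 option

engine = builder.build_engine(network, config)
with open("ssdlite_mobilenet_v2_300x300_fp16.trt", "wb") as f:
    f.write(engine.serialize())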

Finally, I tried running inference with ssdlite_mobilenet_v2_300x300_fp16.trt model:

python3 /home/joao/tensorrt-examples/python/detection/trt_detection.py \
    --model /home/joao/tensorrt-examples/models/ssdlite_mobilenet_v2_300x300_fp16.trt \
    --label /home/joao/tensorrt-examples/models/coco_labels.txt \
    --width 300 \
    --height 300

But it returned the following error:

Traceback (most recent call last):
  File "/home/joao/tensorrt-examples/python/detection/trt_detection.py", line 207, in <module>
    main()
  File "/home/joao/tensorrt-examples/python/detection/trt_detection.py", line 159, in main
    boxs = trt_outputs[1].reshape([int(trt_outputs[0]), 4])
ValueError: cannot reshape array of size 40 into shape (7,4)

The error is caused by the line:

boxs = trt_outputs[1].reshape([int(trt_outputs[0]), 4])

For some reason, int(trt_outputs[0]) is not giving the right value. Based on the error message, I tried hard-coding it to 10 so that the shape comes out right:

boxs = trt_outputs[1].reshape([10, 4])

With that replacement the code works fine! But I'm wondering about the cause of this error; would you have any clue on how to solve it?

Again, thanks so much for the assistance!

jgocm commented 2 years ago

I also kept the code running and printed the trt_outputs[0] values for a while; this was the result:

10
10
10
10
9
10
10
10
10
10
10
10
10
10
10
9
9
8
10
9
8
9
6
9
6
8
8
7
6
8
7
9
7
9
7
8
9
8
10
10
9
10
9
10
10
10
9
10
10
9
9
9
8
9
8
9
5
5
5
4
5
5
3
6
5
6
8
6
5
6
7
6
7
8
8
8
9
8
6
6
6
7
8
10
5
7
7
6
5
5
6
7
6
5
6
5
5
5
7

I still don't really understand what this variable means, but it seems like it starts with the right value and then decays after some time.

NobuoTsukamoto commented 2 years ago

Sorry. This is a problem with trt_detection.py.
Would you please change the code and check it?

Before: https://github.com/NobuoTsukamoto/tensorrt-examples/blob/6c1eb42f829b6640ba85fb32838be1a09514fc42/python/detection/trt_detection.py#L160-L161

After:

        boxs = trt_outputs[1].reshape([-1, 4])
        for index in range(int(trt_outputs[0])):
            box = boxs[index]

trt_outputs[0] (num_detections) indicates the number of detections, from 0 to 10 (the maximum is 10 by default). Therefore, the number of detections varies depending on the input image.
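
A toy numpy example of why the original reshape fails when fewer than 10 objects are detected (the buffers here are random placeholders, not real model outputs):

import numpy as np

# The boxes output always has room for max_detections boxes (10 by default),
# even when num_detections is smaller for the current frame.
max_detections = 10
trt_outputs = [
    np.array([7.0]),                                         # num_detections for this frame
    np.random.rand(max_detections * 4).astype(np.float32),   # flattened boxes buffer (40 values)
]

num_detections = int(trt_outputs[0])
# reshape([num_detections, 4]) would fail here: 40 values cannot form a (7, 4) array.
boxs = trt_outputs[1].reshape([-1, 4])                       # shape (10, 4), independent of num_detections
valid_boxes = boxs[:num_detections]                          # only the first num_detections rows are meaningful
print(valid_boxes.shape)                                     # (7, 4)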

jgocm commented 2 years ago

Checked here and that solves the problem!

With that, I think the issue is fully solved.

I am still having some problems when trying to convert ssd_mobilenet_v1_fpn_coco from the TensorFlow 1 Model Zoo using convert_onnxgs2trt.py. It gives some assertion errors during ONNX parsing at this line:

https://github.com/NobuoTsukamoto/tensorrt-examples/blob/6c1eb42f829b6640ba85fb32838be1a09514fc42/python/detection/convert_onnxgs2trt.py#L45

For now, I have commented out/changed some of the ASSERT checks in the plugin/tfliteNMSPlugin/tfliteNMSPlugin.cpp code and completed the conversion, but the generated model is too slow (~350 ms inference time) and could not detect any objects. This looks more like a new problem, though, so I will open another issue for it with more details.

Thank you for the support!