WongKinYiu / yolov7

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
GNU General Public License v3.0

TensorRT inference with C++ for yolov7 #95

Open linghu8812 opened 1 year ago

linghu8812 commented 1 year ago

Hello everyone. The repo that supports TensorRT inference with C++ for yolov4 (AlexeyAB/darknet#7002), scaled yolov4 (WongKinYiu/ScaledYOLOv4#56), yolov5 (ultralytics/yolov5#1597), and yolov6 (meituan/YOLOv6#122) now also supports yolov7 inference. All the yolov7 pretrained models can be converted to ONNX models and then to TensorRT engines.

1.Export ONNX Model

First download the yolov7 models to the folder weights, then use the following commands to export the ONNX model:

git clone https://github.com/linghu8812/yolov7.git
cd yolov7
python export.py --weights ./weights/yolov7.pt --simplify --grid 

If you want to export the ONNX model with a 1280 image size, add --img-size to the command:

python export.py --weights ./weights/yolov7-w6.pt --simplify --grid --img-size 1280
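
The two export commands above differ only in the weights file and the optional --img-size flag. As a minimal sketch, the argument list could be assembled in Python as shown here. The helper name build_export_command is hypothetical and not part of the repo; it only uses the flags that appear in the commands above.

```python
import shlex


def build_export_command(weights, img_size=None):
    """Return the argv list for export.py, using only flags shown in this thread.

    build_export_command is a hypothetical helper, not part of the repo.
    """
    cmd = ["python", "export.py", "--weights", weights, "--simplify", "--grid"]
    if img_size is not None:
        # --img-size is only added when a non-default size is requested
        cmd += ["--img-size", str(img_size)]
    return cmd


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try anywhere
    print(shlex.join(build_export_command("./weights/yolov7.pt")))
    print(shlex.join(build_export_command("./weights/yolov7-w6.pt", img_size=1280)))
```

The list form can be passed to subprocess.run unchanged if you want to script several exports in a row.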

2.Build yolov7_trt Project

mkdir build && cd build
cmake ..
make -j

3.Run yolov7_trt

4.Results:

image

WongKinYiu commented 1 year ago

Thanks.

philipp-schmidt commented 1 year ago

@linghu8812 Your changes to the export.py are very useful. Why not make this a PR? The onnx-simplify step is necessary for the ONNX to work correctly for the Detect() layer in many cases. So it's a good idea to have that in there anyway.

linghu8812 commented 1 year ago

@philipp-schmidt I have already made a PR: #114

BenRK-Work commented 1 year ago

@linghu8812 what version of onnxsim are you using? I get the following error when trying:

Simplifier failure: [ONNXRuntimeError] : 1 : FAIL : Node (Mul_390) Op (Mul) [ShapeInferenceError] Incompatible dimensions

linghu8812 commented 1 year ago

0.3.6
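
To compare a local environment against the version mentioned above, a small check like the following may help. This is only a sketch: it assumes the simplifier is installed under the PyPI distribution name onnx-simplifier, which may differ in your setup, and the helper names are made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Turn '0.3.6' into (0, 3, 6); only plain dotted numeric versions are handled."""
    return tuple(int(part) for part in version.split("."))


if __name__ == "__main__":
    reference = "0.3.6"  # version reported in this thread
    found = installed_version("onnx-simplifier")  # assumed distribution name
    if found is None:
        print("onnx-simplifier is not installed")
    elif found == reference:
        print(f"installed version {found} matches the one reported here")
    else:
        print(f"installed version {found} differs from {reference}")
```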

akashAD98 commented 1 year ago

@linghu8812 good work as always. Is there a good way to learn all this model optimization & quantization? Could you teach or mentor us? I really want to understand each and every point of model conversion. Thanks.

leeyunhome commented 1 year ago

Hello, @linghu8812

Thank you for your effort. Have you compared the performance difference with yolov5?

Thank you.

oralian commented 1 year ago

Hey, I'm getting:

Namespace(batch_size=2, device='0', dynamic=False, grid=False, img_size=[1024, 1024], simplify=True, weights='yolov7.pt')
YOLOR 🚀 v0.1-38-ge9f7c15 torch 1.10.0 CUDA:0 (Xavier, 7773.43359375MB)

Fusing layers... 
RepConv.fuse_repvgg_block
RepConv.fuse_repvgg_block
RepConv.fuse_repvgg_block
Model Summary: 306 layers, 36905341 parameters, 36905341 gradients
Killed

I'm using a Jetson NX and yolov7.pt. It seems to crash at y = model(img) in export.py, with RAM usage reaching the maximum. Any ideas on how I could get this to work?

oralian commented 1 year ago

I was able to solve my issue by generating the ONNX file on my desktop computer and copying it to the Jetson. I was then able to convert it to TensorRT. However, I'm getting 0.094 sec inference time with the yolov7.pt weights vs 0.058 sec with the yolov5m6.pt weights. Is it supposed to be like this? Are there any faster yolov7 models? Thanks!
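
For reference, the two timings above can be turned into throughput numbers with simple arithmetic. This sketch assumes both figures are per-image latencies measured under the same conditions, which the comment does not state.

```python
def fps(latency_s):
    """Convert a per-image latency in seconds to frames per second."""
    return 1.0 / latency_s


# Latencies quoted in the comment above
yolov7_s = 0.094
yolov5m6_s = 0.058

if __name__ == "__main__":
    print(f"yolov7:   {fps(yolov7_s):.1f} FPS")    # 10.6 FPS
    print(f"yolov5m6: {fps(yolov5m6_s):.1f} FPS")  # 17.2 FPS
    print(f"yolov7 takes {yolov7_s / yolov5m6_s:.2f}x as long per image")  # 1.62x
```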

mochechan commented 1 year ago

For my evaluation, step 1 was run on a computer with an RTX 2080 Ti, and it seems to be fine. Steps 2 and 3 were run on an NVIDIA Jetson Xavier NX with JetPack 4.5.1.

Step "3.Run yolov7_trt" produces the following error messages. How can I solve the problem?

$ ./yolov7_trt ../config.yaml ../samples
----------------------------------------------------------------
Input filename:   ../yolov7.onnx
ONNX IR version:  0.0.6
Opset version:    12
Producer name:    pytorch
Producer version: 1.10
Domain:           
Model version:    0
Doc string:       
----------------------------------------------------------------
[07/21/2022-09:32:58] [W] [TRT] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [W] [TRT] onnx2trt_utils.cpp:246: One or more weights outside the range of INT32 was clamped
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions or follow broadcast rules (input dimensions were [1,3,80,80,2] and [1,1,1,3,2]).
ERROR: builtin_op_importers.cpp:3040 In function importSlice:
[4] Assertion failed: -r <= axis && axis < r
[07/21/2022-09:32:58] [E] Failure while parsing ONNX file
start building engine
[07/21/2022-09:32:58] [E] [TRT] Network must have at least one output
[07/21/2022-09:32:58] [E] [TRT] Network validation failed.
build engine done
yolov7_trt: /home/a/tensorrt_inference/yolov7/../includes/common/common.hpp:138: void onnxToTRTModel(const string&, const string&, nvinfer1::ICudaEngine*&, const int&): Assertion `engine' failed.
Aborted (core dumped)
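
Since the log above repeats the same Mul_322 error many times, it can help to collapse it into unique messages with counts. The following is a rough sketch that assumes the line layout seen in this log (a bracketed timestamp, then [E] or [W], then an optional [TRT] tag); the function name summarize is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as "[07/21/2022-09:32:58] [E] [TRT] Mul_322: ..."
LINE_RE = re.compile(r"^\[[^\]]+\] \[(?P<level>[EW])\] (?:\[TRT\] )?(?P<msg>.*)$")


def summarize(log_text):
    """Count how often each (level, message) pair occurs in the log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines without the timestamp/level prefix are skipped
            counts[(match.group("level"), match.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions",
        "[07/21/2022-09:32:58] [E] [TRT] Mul_322: elementwise inputs must have same dimensions",
        "[07/21/2022-09:32:58] [E] Failure while parsing ONNX file",
        "start building engine",
    ])
    for (level, msg), n in summarize(sample).most_common():
        print(f"{n:3d}x [{level}] {msg}")
```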

linghu8812 commented 1 year ago

Use PyTorch 1.11 and onnx 1.12. The correct shape of the anchor is 1x3x1x1x2, not the 1x1x1x3x2 shown in the log.
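
To see why the anchor shape matters, here is a minimal sketch of the usual right-aligned broadcast rule applied to the shapes from the log. The function broadcast_shape is written for illustration only and is not taken from TensorRT or ONNX.

```python
def broadcast_shape(a, b):
    """Return the broadcast result shape, or None if the shapes are incompatible.

    Dimensions are compared right-aligned; two sizes are compatible when
    they are equal or when one of them is 1.
    """
    n = max(len(a), len(b))
    # Pad the shorter shape with leading 1s so both have the same rank
    a = (1,) * (n - len(a)) + tuple(a)
    b = (1,) * (n - len(b)) + tuple(b)
    result = []
    for x, y in zip(a, b):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            return None
    return tuple(result)


if __name__ == "__main__":
    grid = (1, 3, 80, 80, 2)
    # Shape reported in the error log: 80 vs 3 in the fourth axis cannot broadcast
    print(broadcast_shape(grid, (1, 1, 1, 3, 2)))  # None
    # Shape recommended above: every axis is either equal or 1
    print(broadcast_shape(grid, (1, 3, 1, 1, 2)))  # (1, 3, 80, 80, 2)
```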

Rohan-Python commented 1 year ago

Hey, can you please help me? When I run inference on my system GPU, the bounding boxes do not show up. They only show up when I use the CPU for inference (--device cpu). I trained the yolov7 model on Colab and am using the best.pt file as weights for inference.