cyrusbehr / YOLOv8-TensorRT-CPP

YOLOv8 TensorRT C++ Implementation
MIT License


YoloV8 TensorRT CPP

A C++ Implementation of YoloV8 using TensorRT
Supports object detection, semantic segmentation, and body pose estimation.


Looking for Maintainers 🚀

This project is actively seeking maintainers to help guide its growth and improvement. If you're passionate about this project and interested in contributing, I'd love to hear from you!

Please feel free to reach out via LinkedIn to discuss how you can get involved.

Getting Started

This project demonstrates how to use the TensorRT C++ API to run GPU inference for YoloV8. It uses my other project, tensorrt-cpp-api, to run inference behind the scenes, so make sure you are familiar with that project.

Prerequisites

Installation

Converting Model from PyTorch to ONNX

Building the Project

Running the Executables

INT8 Inference

Enabling INT8 precision can further speed up inference at the cost of some accuracy, because INT8 has a reduced dynamic range. For INT8 precision, you must supply calibration data that is representative of the real data the model will see. Using 1,000 or more calibration images is advised. To enable INT8 inference with the YoloV8 sanity check model, the following steps must be taken:
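As an aside on why INT8 trades accuracy for speed, here is a minimal sketch of symmetric 8-bit quantization with a scale taken from calibration samples. It is a simplified max-abs scheme for illustration only, not TensorRT's calibration algorithm, and the helper names `calibrateScale`, `quantize`, and `dequantize` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper: derive a symmetric scale from calibration samples
// using the max-abs rule (an assumption, not TensorRT's calibrator).
float calibrateScale(const std::vector<float>& calibration) {
    assert(!calibration.empty());
    float maxAbs = 0.0f;
    for (float v : calibration) {
        maxAbs = std::max(maxAbs, std::fabs(v));
    }
    // Fall back to a scale of 1 if every sample is zero.
    return maxAbs > 0.0f ? maxAbs / 127.0f : 1.0f;
}

// Map a float to the nearest representable step, saturating at +/-127.
std::int8_t quantize(float x, float scale) {
    float q = std::round(x / scale);
    q = std::max(-127.0f, std::min(127.0f, q));
    return static_cast<std::int8_t>(q);
}

// Recover an approximate float from the 8-bit code.
float dequantize(std::int8_t q, float scale) {
    return static_cast<float>(q) * scale;
}
```

With calibration samples `{-2.0, 0.5, 1.0}`, the scale is 2/127, so values inside the calibrated range are rounded to the nearest step, while an input of 5.0 saturates at 127. That saturation is one reason the calibration images should resemble the data the model will actually see.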

Benchmarking

Benchmarks were run on an NVIDIA GeForce RTX 3080 Laptop GPU and an Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz, using a 640x640 BGR image in GPU memory and FP16 precision (the second table lists the precision per row).

| Model | Total Time | Preprocess Time | Inference Time | Postprocess Time |
| --- | --- | --- | --- | --- |
| yolov8n | 3.613 ms | 0.081 ms | 1.703 ms | 1.829 ms |
| yolov8n-pose | 2.107 ms | 0.091 ms | 1.609 ms | 0.407 ms |
| yolov8n-seg | 15.194 ms | 0.109 ms | 2.732 ms | 12.353 ms |

| Model | Precision | Total Time | Preprocess Time | Inference Time | Postprocess Time |
| --- | --- | --- | --- | --- | --- |
| yolov8x | FP32 | 25.819 ms | 0.103 ms | 23.763 ms | 1.953 ms |
| yolov8x | FP16 | 10.147 ms | 0.083 ms | 7.677 ms | 2.387 ms |
| yolov8x | INT8 | 7.32 ms | 0.103 ms | 4.698 ms | 2.519 ms |
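To help read the numbers above, the following sketch derives a few quantities from them: the total as the sum of the three stages, an implied frame rate, the share of time spent in postprocessing, and the speedup between precisions. The struct and function names are hypothetical, and the frame rate assumes images are processed one at a time with no overlap between stages.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// One benchmark row, with times in milliseconds (values copied from the tables).
struct StageTimes {
    std::string model;
    double preprocessMs;
    double inferenceMs;
    double postprocessMs;
};

const StageTimes kYolov8nSeg{"yolov8n-seg", 0.109, 2.732, 12.353};
const StageTimes kYolov8n{"yolov8n", 0.081, 1.703, 1.829};
const StageTimes kYolov8xFp32{"yolov8x FP32", 0.103, 23.763, 1.953};
const StageTimes kYolov8xInt8{"yolov8x INT8", 0.103, 4.698, 2.519};

// Total time is the sum of the three stages.
double totalMs(const StageTimes& t) {
    return t.preprocessMs + t.inferenceMs + t.postprocessMs;
}

// Implied throughput, assuming strictly sequential single-image processing.
double framesPerSecond(const StageTimes& t) {
    return 1000.0 / totalMs(t);
}

// Fraction of the total spent in postprocessing.
double postprocessShare(const StageTimes& t) {
    return t.postprocessMs / totalMs(t);
}

// How many times faster the candidate is than the baseline.
double speedup(const StageTimes& baseline, const StageTimes& candidate) {
    return totalMs(baseline) / totalMs(candidate);
}
```

For example, `speedup(kYolov8xFp32, kYolov8xInt8)` comes out at roughly 3.5x, `framesPerSecond(kYolov8n)` at roughly 277, and `postprocessShare(kYolov8nSeg)` at about 81%, which shows that postprocessing dominates the segmentation model's total time.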

TODO: Improve postprocessing time using a CUDA kernel.
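To illustrate why this step is a candidate for a GPU kernel, here is a rough CPU-side sketch of one part of detection postprocessing: picking the best class per anchor and discarding low scores. It assumes the common YoloV8 detection output layout of `[4 + numClasses, numAnchors]` stored channel-major, with box values in the first four rows. The `Detection` struct and `decodeCandidates` function are hypothetical and are not taken from this project's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one decoded candidate box.
struct Detection {
    float cx, cy, w, h;  // box center and size, as stored in the output
    int classId;         // index of the highest-scoring class
    float score;         // that class's score
};

// Decode candidates from a channel-major buffer of shape
// [4 + numClasses, numAnchors] (an assumed layout).
std::vector<Detection> decodeCandidates(const std::vector<float>& output,
                                        std::size_t numClasses,
                                        std::size_t numAnchors,
                                        float scoreThreshold) {
    assert(numClasses >= 1);
    assert(output.size() == (4 + numClasses) * numAnchors);

    std::vector<Detection> detections;
    // Each anchor is processed independently of the others, so on a GPU
    // each iteration of this loop could map to one thread.
    for (std::size_t a = 0; a < numAnchors; ++a) {
        int bestClass = 0;
        float bestScore = output[4 * numAnchors + a];
        for (std::size_t c = 1; c < numClasses; ++c) {
            const float s = output[(4 + c) * numAnchors + a];
            if (s > bestScore) {
                bestScore = s;
                bestClass = static_cast<int>(c);
            }
        }
        if (bestScore >= scoreThreshold) {
            detections.push_back(Detection{output[a],
                                           output[numAnchors + a],
                                           output[2 * numAnchors + a],
                                           output[3 * numAnchors + a],
                                           bestClass, bestScore});
        }
    }
    return detections;
}
```

This sketch stops before non-maximum suppression and mask handling, which a full postprocessing pass would still need.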

How to debug

Show your appreciation

If this project was helpful to you, I would appreciate it if you could give it a star. That will encourage me to keep it up to date and resolve issues quickly.

Contributors

z3lx 💻
Loic Tetrel 💻
Shubham 💻