455670288 / rknn-yolov8s-multi-thread-inference

Inference deployment of yolov8s on the RK3588, using a multi-threaded pool to run NPU inference in parallel for acceleration

Project closed? #1

Open AshleyRoth opened 1 week ago

AshleyRoth commented 1 week ago

Hi!

Is the project unfinished? So far this is the only multithreaded project I have found. Unfortunately, CMakeLists.txt is incorrect, and the files it lists are missing.

455670288 commented 1 week ago

You can look at the src/compile.sh file and adapt the commands in it to compile with g++.

AshleyRoth commented 1 week ago

@455670288 Hi! Thank you!

Did you manage to run yolov8? What FPS did you get?

I was able to run the example successfully. However, only 1 NPU core is used, and I get 16-17 FPS. The coreNum.cc file specifies "const int RK3588 = 3".

UPD: I changed the value of threadNum in the main file, and all 3 NPU cores started working.
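To illustrate why the thread count can limit how many NPU cores are busy, here is a minimal C++ sketch, assuming each worker thread is assigned to a core round-robin by its index. This mapping is an assumption, not confirmed from this project's code; `kNpuCoreCount`, `coreForWorker` and `coresInUse` are hypothetical names. With the real RKNN runtime, the chosen index would typically be handed to a core-binding call such as `rknn_set_core_mask`, which is left out here.

```cpp
#include <cassert>
#include <set>

// Assumption: 3 NPU cores, matching the "const int RK3588 = 3" quoted above.
constexpr int kNpuCoreCount = 3;

// Hypothetical helper: round-robin choice of NPU core for a worker thread.
int coreForWorker(int workerIndex, int coreCount = kNpuCoreCount) {
    assert(workerIndex >= 0 && coreCount > 0);
    return workerIndex % coreCount;
}

// Hypothetical helper: how many distinct cores receive work for a pool size.
int coresInUse(int threadNum, int coreCount = kNpuCoreCount) {
    std::set<int> used;
    for (int i = 0; i < threadNum; ++i) {
        used.insert(coreForWorker(i, coreCount));
    }
    return static_cast<int>(used.size());
}
```

Under this assumption a pool with a single thread only ever touches core 0, while a pool of 3 or more threads reaches all three cores.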

AshleyRoth commented 6 days ago

@455670288 Hi. Thanks for the answer! Can you tell me more about s-p2 (how can I train my model like s-p2)? I tried to convert my own yolov8 model. My exported model works correctly in the rknn_model_zoo yolov8 example, but when I run it in your project, a large number of bboxes is produced and the NPUs show no load (that is, detection works completely incorrectly). Maybe your code is hard-coded for the 80 COCO classes?

455670288 commented 6 days ago

Training follows the official ultralytics process. The model config for 8s-p2: https://github.com/airockchip/ultralytics_yolov8/blob/main/ultralytics/cfg/models/v8/yolov8-p2.yaml

Compared with the original v8s, the s-p2 model has one additional output head, 160 × 160 at the default 640 × 640 input (the P2 level, stride 4), which helps with detecting smaller targets. The post-processing in this project has been modified to fit the four detection output heads; if you need to deploy the original version of v8s, refer to the post-processing in rknn model zoo.
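As a rough illustration of why the head count matters, here is a minimal C++ sketch, assuming a square input and the strides listed in yolov8-p2.yaml (P2/4, P3/8, P4/16, P5/32). `HeadGrid`, `headGrids` and `totalCells` are hypothetical names, not taken from this project.

```cpp
#include <cassert>
#include <vector>

// Grid of one detection head: its stride and the number of cells per side.
struct HeadGrid {
    int stride;
    int cells;
};

// Compute the grid size of each head for a square input of inputSize pixels.
std::vector<HeadGrid> headGrids(int inputSize, const std::vector<int>& strides) {
    std::vector<HeadGrid> grids;
    for (int s : strides) {
        assert(s > 0 && inputSize % s == 0);
        grids.push_back({s, inputSize / s});
    }
    return grids;
}

// Total number of grid cells (candidate boxes) over all heads.
int totalCells(const std::vector<HeadGrid>& grids) {
    int total = 0;
    for (const HeadGrid& g : grids) {
        total += g.cells * g.cells;
    }
    return total;
}
```

For a 640 × 640 input, strides {4, 8, 16, 32} give 160, 80, 40 and 20 cells per side (34000 cells in total), while the three-head layout {8, 16, 32} gives 8400. Post-processing written for one layout will therefore not line up with a model exported with the other.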

Set the number of classes for your model in the src/postprocess.h file.
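To show why the class count has to match the model, here is a minimal C++ sketch of picking the best class for one grid cell. It assumes the class scores of a head are stored channel-major (all cells of class 0, then all cells of class 1, and so on); that layout, and the names `kNumClasses`, `BestClass` and `bestClassAt`, are assumptions for illustration and may differ from what src/postprocess.h actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constant: set this to the number of classes your model was trained on.
constexpr int kNumClasses = 2;

struct BestClass {
    int id;
    float score;
};

// Find the highest-scoring class for one cell.
// scores is assumed to be laid out as [numClasses x cells], channel-major.
BestClass bestClassAt(const std::vector<float>& scores, int numClasses,
                      int cells, int cell) {
    assert(numClasses > 0 && cell >= 0 && cell < cells);
    assert(scores.size() == static_cast<std::size_t>(numClasses) * cells);
    BestClass best{0, scores[cell]};
    for (int c = 1; c < numClasses; ++c) {
        float s = scores[static_cast<std::size_t>(c) * cells + cell];
        if (s > best.score) {
            best = {c, s};
        }
    }
    return best;
}
```

If the class count used in post-processing differs from the one the model was exported with, the indexing above walks the tensor with the wrong extent, which is one plausible way to end up with many spurious boxes.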

AshleyRoth commented 5 days ago

@455670288 Thanks, I'll try training it with the s-p2 config.