Closed: lhj5426 closed this issue 1 month ago.
I trained a model using yolo11X.PT last night and then used the new version for inference. It seems the 11X model loaded during inference didn't utilize the GPU.
I'm running into this too: inference is fast when I run my own inference code, but much slower once the model is loaded as a custom model in the software, even though both use the GPU.
Yes, same for me now. I run inference directly from code to generate the TXT label files and then import them with the software; the inference results should be the same, right? I'm not sure, to be honest. In any case, running the yolo11X model this way is faster than inference through the software, about the same speed as 8X.
I've found the problem on my end: when the opened folder contains relatively few images, the speed is normal, but when it contains many (around 15k), it slows down a lot. I don't know where the software's overhead is.
Hey there! @lhj5426,
Thank you for bringing this to our attention. I've tried to reproduce the issue you described but everything seems to be working fine on my local environment. Could you please provide more details or steps to reproduce the problem? This would help us diagnose and resolve the issue more effectively.
I've found the problem on my end: when the opened folder contains relatively few images, the speed is normal, but when it contains many (around 15k), it slows down a lot. I don't know where the software's overhead is.
Indeed, there are currently some issues with the tool. Consequently, we recommend processing tasks in batches of no more than 5,000 images to prevent any performance degradation. This approach should enhance the speed and efficiency of the software.
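The batching recommended above is easy to script before opening folders in the tool. A minimal sketch, assuming the images live under a placeholder folder `dataset/images` with a `.jpg` extension (adjust both to your dataset):

```python
from pathlib import Path

def chunk(items, size=5000):
    """Split a sequence into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Placeholder folder of ~15k images; point this at your own dataset.
images = sorted(Path("dataset/images").glob("*.jpg"))
for i, batch in enumerate(chunk(images)):
    # Open each batch as a separate folder/session in X-AnyLabeling,
    # e.g. by copying or symlinking the batch into its own directory.
    print(f"batch {i}: {len(batch)} images")
```

Each batch stays under the 5,000-image threshold, so the per-folder slowdown should not kick in.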
Hello, Developer.
I'm not quite sure how to express this, but here’s an overview:
I've recorded a video and included the YOLO11X model file that I trained with YOLOv11. In the video I demonstrate loading the model and switching to GPU inference, as well as the different YOLO versions installed on my computer.
I used the source code of your software, which I just downloaded from GitHub, to show the performance in both YOLOv8 and YOLOv11 environments. The first run is slower, which seems normal, but from the second time the model is loaded, you can see that the inference speed is noticeably different between the two versions.
I also demonstrated running inference directly in the YOLOv11 environment, where you can visibly see the speed is much faster. However, in the same YOLOv11 environment, inference speed using the software is noticeably slow.
Please download the video and files from this OneDrive link to review and verify.
Thank you for your response.
Hello, @lhj5426
Thank you for your detailed explanation and for providing the video demonstration. Based on the information you've shared, it seems the performance discrepancy you're experiencing might be due to compatibility issues between the ONNX model version and your ONNXRUNTIME-GPU and CUDA versions.
Regarding your observations, I would suggest considering the use of YOLOv8 instead. It's important to note that newer models don't always guarantee better performance. In fact, YOLOv11 doesn't offer significant innovations over YOLOv8.
YOLOv8 is generally more stable, widely adopted, and likely to have better compatibility with current GPU acceleration libraries. It should provide good performance without the compatibility issues you're encountering.
If you continue to experience issues or have any questions about implementing YOLOv8, please don't hesitate to ask. We're here to help ensure you get the best performance for your specific use case.
Thank you for your feedback, as it helps us improve our software and support.
So, is there a way to fix the compatibility issue between ONNXRUNTIME-GPU and CUDA in this software?
Additionally, this issue has made me realize that YOLO itself can generate label TXT files. So, when working with a large number of images and files, I’m now improving the speed as shown in the video by using YOLO code to perform inference and generate the label TXT files. Then, I use the software’s import label feature to bring in the labels, and finally, I proceed with any necessary adjustments.
https://github.com/user-attachments/assets/c8be8976-b20f-4c58-a3d2-bd7916e04d97
Although it’s not an ideal solution, at least it saves time. Importing labels just requires a few more mouse clicks, which is still more efficient than taking several seconds per image.
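The label-export workflow described above can be sketched with the ultralytics Python API; `save_txt=True` is what makes YOLO write one YOLO-format label `.txt` per image. The weights path, source folder, and confidence threshold below are placeholder values, not taken from the thread:

```python
# Arguments for ultralytics' predict() call. save_txt=True writes one
# YOLO-format label .txt per image, which X-AnyLabeling can then import
# via its "import labels" feature.
PREDICT_KWARGS = dict(
    source="path/to/images",  # placeholder image folder
    save_txt=True,            # emit label .txt files alongside predictions
    conf=0.25,                # example confidence threshold
    device=0,                 # first CUDA GPU; use "cpu" to fall back
)

def run_batch_inference(weights="runs/detect/train11/weights/best.pt"):
    """Run YOLO inference and emit label .txt files (requires ultralytics)."""
    from ultralytics import YOLO  # lazy import so the sketch loads without it
    model = YOLO(weights)
    return model.predict(**PREDICT_KWARGS)
```

After the run, the generated `.txt` files can be imported into the software and adjusted by hand, as described above.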
You’re right; the latest version isn’t always the best. But, I’ve only been working with YOLO for about six months, and I spent every day in September labeling data. I planned to start serious training in October. Just when I was about to begin, YOLOv11 was released, so I thought, since a new version is out and I was going to retrain anyway, why not use the latest one? Haha.
Using YOLOv8 for inference works fine too, but I still hope the compatibility issue can be resolved someday. Thanks for your response!
The most important thing is that, at first, I didn’t realize YOLOv8 and YOLOv11 use the same labeling format. I posted the same question on the official GitHub as well: https://github.com/ultralytics/ultralytics/issues/16703. Someone there mentioned that YOLOv8 and YOLOv11 are indeed the same in this regard—that's how I found out.
After all, I only got interested in using YOLO to train a recognition model on a whim. I haven't studied it systematically; I just look things up or ask AI when I run into issues.
@lhj5426:
I understand your frustration with the compatibility issues between YOLO versions. It's commendable that you've put so much effort into labeling data and preparing for training. Here are some suggestions to help you move forward:
Version Compatibility: As you've discovered, YOLOv8 and YOLOv11 use the same labeling format. This is good news, as it means your labeling work isn't wasted. You can use your existing labels with either version.
If you're set on using YOLOv11, follow these steps to ensure proper GPU utilization:
```shell
yolo export model=yolo11x.pt format=onnx
```
Finally, load the exported model into X-AnyLabeling and run inference as usual.
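One way to confirm the exported model is actually running on the GPU is to check which execution providers ONNX Runtime selects for the session. A sketch, assuming the `onnxruntime-gpu` package is installed and `best.onnx` is a placeholder path:

```python
def is_gpu_active(providers):
    """True if ONNX Runtime selected the CUDA provider for the session."""
    return "CUDAExecutionProvider" in providers

def check_session(onnx_path="best.onnx"):
    """Load the model and report whether it fell back to CPU."""
    import onnxruntime as ort  # lazy import; needs onnxruntime-gpu for CUDA
    sess = ort.InferenceSession(
        onnx_path,
        providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
    )
    providers = sess.get_providers()
    # A silent fallback to CPU here usually points to an onnxruntime-gpu /
    # CUDA version mismatch rather than a problem with the model itself.
    return is_gpu_active(providers), providers
```

If `CUDAExecutionProvider` is missing from the returned provider list, inference is running on the CPU, which would explain a several-second-per-image speed.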
Remember, the most important thing is finding a workflow that's efficient and effective for your specific needs. Don't hesitate to experiment with different versions or approaches until you find what works best for you.
```
PyTorch: starting from 'J:\G\Desktop\V11\runs\detect\train11\weights\best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 10, 8400) (109.1 MB)
ONNX: starting export with onnx 1.16.1 opset 19...
ONNX: slimming with onnxslim 0.1.34...
ONNX: export success ✅ 39.4s, saved as 'J:\G\Desktop\V11\runs\detect\train11\weights\best.onnx' (217.1 MB)
```

I reinstalled a new environment with the following versions. This time the converted ONNX model is opset version 19, whereas previously it was opset 17, but there was no improvement in inference speed. The Python version has also been updated from 3.10 to 3.12.
I set up a new virtual environment with Python version 3.8, matching the version used in my previous YOLOv8 environment. Then, I aligned these two components with the developer’s recommendations.
The inference speed remains faster with YOLOv8 (version 8.2) than with YOLOv11 (version 8.3), specifically:
It's so embarrassing. I restarted my computer, and now the software's YOLO11X inference in the virtual environment created with Python 3.8 runs at the same speed as YOLO8X. Thank you to the developers for the help.
Sorry for the trouble I caused. I set up YOLO11 based on an online video that recommended using Python 3.10, which led to a series of issues later on. I didn't think about it in that direction. Thanks again!
Search before asking
Question
I have both YOLOv8 and V11 versions installed on my computer, with v11 being the latest. When I use the software from https://github.com/CVHub520/X-AnyLabeling to load my YOLO8X .PT model, utilizing the same version of the software, the inference speeds are completely different between the v8 and v11 environments. In the v8 environment, it processes several images per second, while in v11, it takes several seconds to process one image. What could be causing this?
https://github.com/user-attachments/assets/0553dc31-719b-4f38-bb03-071dd40e1644
https://github.com/user-attachments/assets/aba76e0f-4133-4055-b601-bec5d2a6cbda
As shown in the videos, the speed difference is not slight at all.
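When debugging a speed gap like this between two virtual environments, a quick first step is to print the relevant versions and GPU visibility in each and compare. A sketch in which the torch and ultralytics imports are optional, so it runs even where one of them is absent:

```python
import sys

def env_report():
    """Collect version and GPU info to compare the v8 and v11 environments."""
    info = {"python": sys.version.split()[0]}
    try:
        import torch
        info["torch"] = torch.__version__
        info["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        info["torch"] = None  # torch not installed in this environment
    try:
        import ultralytics
        info["ultralytics"] = ultralytics.__version__
    except ImportError:
        info["ultralytics"] = None
    return info

# Run this in both virtual environments and diff the output.
print(env_report())
```

Differences in Python, torch, or CUDA availability between the two environments are the usual suspects for one of them silently falling back to CPU inference.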