
yolov8n-pose.mlpackage detection time is unstable #6156

Closed sjz-hub closed 9 months ago

sjz-hub commented 12 months ago

Search before asking

Question

After converting the yolov8n-pose.pt model to a CoreML model for iOS with yolo export and embedding it in an iOS app, I found that its detection time is very unstable. What could be causing this? The screenshot below shows the fastest and slowest times differing by a factor of two. (screenshot)
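For context, the conversion described above is usually done along these lines; a minimal sketch assuming a recent ultralytics release (argument names can differ between versions):

```python
from ultralytics import YOLO

# Load the pretrained pose weights and export them to CoreML.
# The resulting yolov8n-pose.mlpackage can be added to an Xcode project.
model = YOLO("yolov8n-pose.pt")
model.export(format="coreml")
```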

Additional

No response

github-actions[bot] commented 12 months ago

👋 Hello @sjz-hub, thank you for your interest in YOLOv8 🚀! We recommend a visit to the YOLOv8 Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

  • Notebooks with free GPU (Google Colab, Kaggle)
  • Google Cloud Deep Learning VM
  • Amazon Deep Learning AMI
  • Docker Image

Status

Ultralytics CI

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

sjz-hub commented 12 months ago

When I use the CoreML model converted from yolov8n.pt, the detection time is very stable. (screenshot)

glenn-jocher commented 12 months ago

@sjz-hub thank you for using YOLOv8 and bringing up your concern. Inference time can be affected by several factors:

  1. Variations in Input Data: Differences in complexity of the input can lead to variations in the time required for detection. More complex images or scenes may require more computational resources and hence more time.
  2. Device Performance: If you're testing the model on a device like a smartphone, the performance can fluctuate depending on other processes running on the device at the same time.
  3. Thermal Throttling: Mobile devices often slow down the processor speed to prevent overheating when running demanding tasks for a long time.
  4. Model Size and Complexity: yolov8n-pose adds a keypoint head on top of the yolov8n detector, so it does more work per input, and its post-processing must handle keypoints in addition to boxes.

For more consistent results, you might consider running your tests under controlled conditions, such as ensuring a consistent device temperature and eliminating background processes that could be consuming resources. If you are running it in a production environment and speed is critical, you can also consider techniques like model quantization and pruning, which can help reduce the model size and thus potentially improve inference speed.
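As a minimal sketch of the quantization option mentioned above (assuming a recent ultralytics version; export flags vary between releases):

```python
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")

# int8=True requests INT8 quantization, which shrinks the weights and can
# speed up on-device inference at a small accuracy cost; half=True is a
# milder FP16 alternative.
model.export(format="coreml", int8=True)
```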

I hope this information is helpful.

Zephyr69 commented 12 months ago

Is this inference time only? Because with a model this small, NMS and/or other post-processing time becomes a significant part of the total execution time, and some of it (like NMS) grows with the number of candidate detections.

glenn-jocher commented 12 months ago

@Zephyr69 thank you for reaching out with your question regarding inference time.

You are correct in understanding that the inference time typically refers to the duration the model takes to process an input (like an image or video frame) and output the predictions. However, this is not the full picture when it comes to the actual runtime performance of object detection models such as YOLOv8.

Post-processing steps such as non-maximum suppression (NMS) play a crucial role in the total execution time, particularly with smaller models, where the relative cost of these steps can be significant compared to the inference itself. The computational load of NMS grows with the number of candidate detections prior to suppression, which can cause execution time to vary with the contents of the input data.

It is important to consider both the inference and post-processing when evaluating the performance of a model in a real-world application scenario. Optimizing the entire pipeline, including inference and post-processing, is key to achieving the best overall runtime performance.
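One way to see this split in practice: each Ultralytics Results object exposes a per-stage timing breakdown, so inference can be compared against pre- and post-processing directly (a sketch; "bus.jpg" stands in for any test image):

```python
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")
results = model("bus.jpg")  # any test image

# speed is a dict of per-stage times in milliseconds, e.g.
# {'preprocess': ..., 'inference': ..., 'postprocess': ...},
# so NMS/post-processing cost can be tracked separately from inference.
print(results[0].speed)
```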

AphroDatalyst commented 11 months ago

Hey! I am currently facing significant challenges with inference latency when deploying YOLOv8 large and xlarge models on an iPhone device through Xcode. The models are being tested in a real-time application, where the inference speed is crucial. This latency is causing noticeable lags in real-time performance.

Given these challenges, I am seeking advice on optimizing the models for better performance without compromising accuracy. Is there a way to reduce inference time, especially in the context of real-time applications on iOS devices?

glenn-jocher commented 11 months ago

@AphroDatalyst hello! The latency you're experiencing while running YOLOv8 large and xlarge models on iPhone indicates a need for optimization. Here are a few steps you can consider for improving inference times:

  1. Model Optimization:
    • Experiment with quantization to reduce the model size.
    • Investigate pruning to remove redundant parts of the neural network.
  2. Inference Efficiency:
    • Prioritize simpler models if their performance meets your requirements.
    • Adjust the NMS thresholds to balance speed and accuracy (see the sketch after this list).
  3. Application Optimization:
    • Evaluate and reduce the overhead of pre- and post-processing in your application.
    • Ensure that your application prioritizes the model inference thread.

Bear in mind that achieving real-time performance on mobile devices can be challenging with larger models. Consider conducting a trade-off analysis between model complexity and inference speed to meet your application’s real-time needs.
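For items 1 and 2 above, a minimal sketch (assuming a recent ultralytics release; flag support varies by version and export format):

```python
from ultralytics import YOLO

model = YOLO("yolov8l.pt")  # large variant, as in the question above

# half=True exports FP16 weights; nms=True asks the exporter to bake NMS
# into the CoreML pipeline so the app avoids separate post-processing.
model.export(format="coreml", half=True, nms=True)

# At predict time, a higher conf and lower iou threshold shrink the set of
# candidate boxes NMS must process, trading some recall for speed.
# model.predict("image.jpg", conf=0.5, iou=0.5)
```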

github-actions[bot] commented 10 months ago

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please visit the Ultralytics Docs at https://docs.ultralytics.com.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are always welcome!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐