ultralytics / yolo-flutter-app

A Flutter plugin for Ultralytics YOLO computer vision models
https://ultralytics.com
GNU Affero General Public License v3.0

Low FPS Issue with Camera Detection #35

Open congngc opened 1 week ago

congngc commented 1 week ago

Hello,

I have implemented the example from this GitHub repository: https://github.com/ultralytics/yolo-flutter-app/tree/main/example. However, I am experiencing low frame rates with the camera detection feature, which ranges only from 12 to 20 FPS. Could you please advise on how I might improve the FPS to achieve better performance?

Thank you for your assistance.

https://github.com/ultralytics/yolo-flutter-app/assets/171897012/23799dad-8d12-47cc-9402-22d76d2056c7

pderrenger commented 1 week ago

Hello,

Thank you for reaching out and providing details about the low FPS issue you're experiencing with the camera detection feature. To help improve the frame rate, here are a few suggestions:

  1. Model Quantization: Ensure that you are using a quantized model (FP16 or INT8) as these are optimized for performance on mobile devices. Quantization reduces the model size and computation requirements, leading to faster inference times. You can read more about this in our documentation.

  2. Delegate Selection: The performance can vary significantly depending on the delegate used for model inference. For instance, using the GPU delegate can provide a substantial performance boost on devices with powerful GPUs. Similarly, if your device supports it, leveraging the Hexagon DSP or NNAPI can also improve performance. You can find more details on delegates and their performance variability in our documentation.

  3. Device Specifications: The hardware capabilities of your device play a crucial role in performance. Ensure that your device has a capable processor and sufficient memory. Devices with Qualcomm Snapdragon processors, for example, can leverage the Hexagon DSP for better performance.

  4. Code Optimization: Make sure that your implementation is optimized. For example, reducing the input resolution can help increase the FPS, though it may affect detection accuracy. Here’s a snippet to adjust the input resolution:

    import cv2
    import numpy as np

    # Reduce input resolution
    def resize_frame(frame, width, height):
        # cv2.resize expects the target size as (width, height)
        return cv2.resize(frame, (width, height))

    # Example usage (a blank 640x480 image stands in for a real camera frame)
    frame = np.zeros((480, 640, 3), dtype=np.uint8)
    frame = resize_frame(frame, 320, 240)  # Adjust width and height as needed

  5. Latest Versions: Verify that you are using the latest versions of the Ultralytics packages and dependencies. Updates often include performance improvements and bug fixes.
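
To make the quantization point in item 1 concrete, here is a minimal sketch of affine INT8 quantization in plain Python. It illustrates the general idea only and is not how the Ultralytics exporter or TFLite implements it; the function names `quantize_int8` and `dequantize_int8` and the sample weights are made up for this example.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one scale and zero point for the whole list."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0     # avoid a zero scale for all-zero input
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize_int8(codes, scale, zero_point):
    """Recover approximate floats from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-0.82, -0.11, 0.0, 0.37, 1.25]  # hypothetical FP32 weights
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize_int8(codes, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(codes, scale, zero_point)
print(f"max round-trip error: {max_error:.5f}")
print(f"storage: {4 * len(weights)} bytes as FP32, {len(codes)} bytes as INT8")
```

Each weight shrinks from four bytes to one, at the cost of a small rounding error bounded by the scale. Whether that translates into higher FPS still depends on the delegate and hardware executing the integer operations.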

If the issue persists, could you please provide a minimum reproducible example? This will help us better understand the problem and provide more specific guidance. You can find more information on creating a reproducible example here.

Thank you for your cooperation, and I look forward to assisting you further!

congngc commented 1 week ago

Hi @pderrenger,

Thanks for your insights. I'd like to provide some details about my setup to further the discussion:

  1. Model Quantization: I am currently using the quantized INT8 model of YOLOv8n.

  2. Device Specifications: My device is a Samsung Galaxy S21. In your experience, do you think this device could handle the computational demands effectively?

  3. Code Optimization: I am looking for ways to optimize the UltralyticsYoloCameraController function. Any tips on specific aspects of the code that might benefit from fine-tuning?

  4. Latest Versions: I am using the latest version of Ultralytics software. Could there be any upcoming updates that might help in improving efficiency or accuracy?

Your expertise and suggestions would be greatly appreciated!

pderrenger commented 1 week ago

Hi @congngc,

Thank you for providing additional details about your setup. Let's dive into each point to help you optimize your camera detection performance:

  1. Model Quantization: Great to hear that you're using the INT8 quantized model. This should indeed help with performance.

  2. Device Specifications: The Samsung Galaxy S21 is a powerful device (it ships with either a Snapdragon 888 or an Exynos 2100 processor, depending on region), which should handle the computational demands effectively. Leveraging the GPU or NNAPI delegates can further enhance performance. You can switch delegates in your code to see which one offers the best performance on your device.

  3. Code Optimization: Optimizing the UltralyticsYoloCameraController function can significantly impact performance. Here are a few tips:

    • Reduce Input Resolution: Lowering the resolution of the input frames can reduce the computational load. For example, resizing the frames to 320x240 or 640x480 can help.
    • Batch Processing: If feasible, process frames in batches rather than individually to take advantage of parallel processing capabilities.
    • Delegate Selection: Experiment with different delegates (CPU, GPU, NNAPI) to find the optimal one for your device. Here’s a snippet showing how a delegate is loaded in Python:

    import tensorflow as tf

    # Example of loading a delegate. Note: 'libedgetpu.so.1' is the Coral Edge TPU
    # delegate library, not the GPU delegate; substitute the delegate library
    # that matches your hardware.
    delegate = tf.lite.experimental.load_delegate('libedgetpu.so.1')
    interpreter = tf.lite.Interpreter(model_path="model.tflite", experimental_delegates=[delegate])
    interpreter.allocate_tensors()

  4. Latest Versions: It's excellent that you're using the latest version of the Ultralytics software. The development team continuously works on updates that enhance performance and accuracy. Keep an eye on the Ultralytics GitHub repository for any new releases.
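
One way to compare delegates, as suggested above, is to time the same inference call under each option and keep the fastest. The sketch below is a generic timing harness in plain Python. `run_cpu` and `run_gpu` are hypothetical stand-ins for whatever function invokes your interpreter with a given delegate, and `benchmark` and `pick_fastest` are names invented here, not part of any Ultralytics or TensorFlow API.

```python
import time


def benchmark(run_inference, warmup=3, runs=20, clock=time.perf_counter):
    """Return the mean seconds per call of run_inference, after warm-up calls."""
    for _ in range(warmup):  # first calls often include one-off setup cost
        run_inference()
    start = clock()
    for _ in range(runs):
        run_inference()
    return (clock() - start) / runs


def pick_fastest(candidates, **kwargs):
    """Time each named callable and return (best_name, {name: mean_seconds})."""
    timings = {name: benchmark(fn, **kwargs) for name, fn in candidates.items()}
    return min(timings, key=timings.get), timings


# Hypothetical stand-ins: replace with calls that run your model per delegate.
def run_cpu():
    time.sleep(0.004)


def run_gpu():
    time.sleep(0.001)


best, timings = pick_fastest({"cpu": run_cpu, "gpu": run_gpu}, warmup=1, runs=5)
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1000:.1f} ms/frame, about {1 / seconds:.0f} FPS")
print("fastest:", best)
```

The warm-up calls matter because the first few inferences on a delegate can include initialization work that would otherwise skew the average.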

Additionally, if you encounter any specific issues or bugs, providing a minimum reproducible example can be incredibly helpful for us to diagnose and address the problem efficiently. You can find more information on creating a reproducible example here.

Feel free to reach out if you have any further questions or need more assistance. We're here to help! 😊

congngc commented 1 week ago

Hi @pderrenger, I have already implemented all the suggestions provided, including using a quantized model, selecting the appropriate delegate, optimizing my device settings, and updating the software. Despite these adjustments, I'm still experiencing low FPS. Could there be other factors affecting the performance? Any further assistance would be greatly appreciated.

pderrenger commented 1 week ago

Hi @congngc,

Thank you for your detailed follow-up and for implementing the suggestions provided. It's great to see your proactive approach! Given that you've already optimized the model, delegate, device settings, and software, let's explore a few additional factors that might be affecting the performance:

  1. Background Processes: Ensure that there are no other intensive applications or background processes running on your device, as these can consume resources and impact performance.

  2. Thermal Throttling: Extended use of the camera and intensive processing can cause the device to heat up, leading to thermal throttling. This can reduce the performance of the CPU and GPU. Try to keep the device cool and monitor its temperature during use.

  3. Camera Frame Rate: Check the camera settings to ensure that it is set to the highest possible frame rate. Sometimes, the camera itself might be limiting the FPS.

  4. Model Complexity: While you are using a quantized model, the complexity of the model (e.g., YOLOv8n) might still be a factor. Consider experimenting with even lighter models if available, or reducing the input image size further.

  5. Code Profiling: Profile your code to identify any bottlenecks. Tools like Android Studio Profiler can help you pinpoint where the most time is being spent during inference and frame processing.

  6. Thread Management: Ensure that the inference and camera processing are running on separate threads to avoid blocking the main UI thread. Here’s a basic example of how you might handle threading in Python:

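    import queue
    import threading

    frames = queue.Queue(maxsize=1)  # at most one frame waiting

    def process_frame(frame):
        # Your frame processing code here
        pass

    def worker():
        # One long-lived worker, rather than a new thread per frame
        while True:
            process_frame(frames.get())

    def camera_loop():
        while True:
            frame = get_camera_frame()  # placeholder for your capture call
            try:
                frames.put_nowait(frame)
            except queue.Full:
                pass  # worker still busy: drop this frame

    threading.Thread(target=worker, daemon=True).start()
    camera_loop()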
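
For the profiling suggestion in item 5, a lightweight complement to Android Studio Profiler is to time each stage of the frame pipeline yourself. The sketch below is plain Python with made-up stage names (`capture`, `preprocess`, `inference`) and `time.sleep` calls standing in for real work; `StageTimer` is a name invented for this example.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock time per named pipeline stage."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = self._clock()
        try:
            yield
        finally:
            self.totals[name] += self._clock() - start
            self.counts[name] += 1

    def report(self):
        """Return {stage: mean milliseconds per call}, slowest first."""
        means = {n: 1000 * self.totals[n] / self.counts[n] for n in self.totals}
        return dict(sorted(means.items(), key=lambda kv: kv[1], reverse=True))


timer = StageTimer()
for _ in range(5):  # five simulated frames
    with timer.stage("capture"):
        time.sleep(0.001)  # stand-in for grabbing a camera frame
    with timer.stage("preprocess"):
        time.sleep(0.001)  # stand-in for resize and format conversion
    with timer.stage("inference"):
        time.sleep(0.005)  # stand-in for running the model

for name, ms in timer.report().items():
    print(f"{name}: {ms:.2f} ms per frame")
```

If inference dominates the report, model size and delegate choice are the places to look; if capture or preprocessing dominates, the camera configuration or image conversion is the more likely bottleneck.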

If the issue persists, providing a minimum reproducible example would be incredibly helpful for us to diagnose the problem more effectively. You can find guidance on creating one here.

Thank you for your patience and cooperation. We're committed to helping you achieve the best performance possible. If you have any further questions or need more assistance, feel free to reach out! 😊

fransay commented 1 week ago

@congngc can you share the specs of the device you are testing on? I think one of the many ways we can increase FPS is to spawn new isolates/threads to handle camera inference, but this is a tricky process, since you want the boxes to appear on screen in real time, in sync with the inference engine's predictions. I tested on a fairly low-end device myself, an Honor X6a with 4 GB RAM and an octa-core CPU (MediaTek Helio). FPS is in the range of 1-3. I think it is an interesting challenge to look into; a slight improvement in algorithmic processes can win us some hardware magic. I will keep this issue updated on my work on FPS optimisation.
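
On the synchronisation concern above (boxes must match what is on screen when inference runs on another isolate or thread), one common approach is to tag each frame with a sequence number and ignore results that arrive late. The following is a minimal Python sketch of that idea only; in a Flutter app the same logic would live in Dart, and `LatestResult` and its methods are hypothetical names, not part of the plugin.

```python
import threading


class LatestResult:
    """Keep only the detections for the newest frame seen so far."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame_id = -1
        self._boxes = []

    def offer(self, frame_id, boxes):
        """Store boxes if they belong to a newer frame; return True if accepted."""
        with self._lock:
            if frame_id <= self._frame_id:
                return False  # stale or duplicate result: drop it
            self._frame_id = frame_id
            self._boxes = list(boxes)
            return True

    def current(self):
        """Return (frame_id, boxes) for the UI to draw."""
        with self._lock:
            return self._frame_id, list(self._boxes)


latest = LatestResult()
# Results may finish out of order when several workers run in parallel.
for frame_id, boxes in [(1, ["box-a"]), (3, ["box-c"]), (2, ["box-b"])]:
    accepted = latest.offer(frame_id, boxes)
    print(f"frame {frame_id}: {'shown' if accepted else 'dropped'}")
print(latest.current())
```

Dropping late results keeps the overlay from jumping back to an older frame, at the cost of occasionally skipping a detection update.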