ultralytics / ultralytics

NEW - YOLOv8 πŸš€ in PyTorch > ONNX > OpenVINO > CoreML > TFLite
https://docs.ultralytics.com
GNU Affero General Public License v3.0

.pt (25 ms), .onnx (255 ms), .engine (19 ms), .ncnn (180 ms) on Quadro T2000 with Max-Q with 4 GB VRAM: is this normal for the ONNX? #12962

Closed MohamedKHALILRouissi closed 1 day ago

MohamedKHALILRouissi commented 1 month ago

Search before asking

Question

After reading the documentation, I expected ONNX to run faster than the .pt model, but in my case the ONNX model is the slowest of them all. Chain of execution:

1) yolo export model=yolov8n.pt format=onnx int8=True dynamic=False simplify=True workspace=8 device=0 imgsz=640
2) run the code below (a Python-API sketch of the same export step is included after the script)

from ultralytics import YOLO
import cv2

# Load the exported YOLOv8 ONNX model
model = YOLO("yolov8n.onnx", task="detect")

cap = cv2.VideoCapture("video.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()
    if success:
        # Run YOLOv8 inference on the frame (person class only)
        results = model(frame, verbose=True, device=0, stream=True, classes=[0])
        # Visualize the results on the frame
        for res in results:
            annotated_frame = res.plot()
        # Display the annotated frame
        cv2.imshow("YOLOv8 Inference", annotated_frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
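
For reference, a minimal Python-API sketch of the export step in 1) above. The keyword arguments simply mirror the CLI flags I used; int8 and workspace mainly apply to other export formats (e.g. TensorRT), so they may be ignored for ONNX:

from ultralytics import YOLO

# Export the PyTorch checkpoint to ONNX (sketch mirroring the CLI flags above)
pt_model = YOLO("yolov8n.pt")
pt_model.export(
    format="onnx",
    int8=True,        # quantization flag from the CLI command; may not apply to ONNX
    dynamic=False,    # fixed input shape
    simplify=True,    # simplify the exported graph
    imgsz=640,
    device=0,
)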

For the ONNX model, the YOLO verbose output:

Loading yolov8n.onnx for ONNX Runtime inference...
/home/khalil/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names. Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
  warnings.warn(

0: 640x640 (no detections), 96.3ms Speed: 31.2ms preprocess, 96.3ms inference, 1038.2ms postprocess per image at shape (1, 3, 640, 640)

0: 640x640 (no detections), 154.0ms Speed: 6.6ms preprocess, 154.0ms inference, 2.7ms postprocess per image at shape (1, 3, 640, 640)

0: 640x640 (no detections), 205.3ms Speed: 5.6ms preprocess, 205.3ms inference, 3.2ms postprocess per image at shape (1, 3, 640, 640)

0: 640x640 (no detections), 245.0ms Speed: 6.3ms preprocess, 245.0ms inference, 3.7ms postprocess per image at shape (1, 3, 640, 640)

0: 640x640 (no detections), 232.1ms Speed: 7.7ms preprocess, 232.1ms inference, 3.6ms postprocess per image at shape (1, 3, 640, 640)

Subquestion: let's say I have multiple cameras, each with a different resolution. Do I have to preprocess and resize each frame myself before inference, like this:

    # Run YOLOv8 inference on the frame
    frame = cv2.resize(frame, (640, 384))
    results = model(frame, verbose=True, device=0, stream=True, classes=[0])

or does the model itself apply this step?

Additional

No response

glenn-jocher commented 1 month ago

Hello! It seems like your ONNX model is not utilizing the GPU, which is likely causing the slower inference times compared to the .pt and .engine formats. The warning about the 'CUDAExecutionProvider' not being available suggests that ONNX Runtime isn't configured to use CUDA on your system. You might need to ensure that the CUDA version of ONNX Runtime is properly installed.

Regarding your subquestion, yes, you will need to preprocess and resize the frames to match the input size expected by the model (640x640 in your case) if the incoming video streams have different resolutions. The model does not automatically resize input images, so this step is necessary to maintain consistency in input data format.
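
If you want to handle the resizing yourself while preserving aspect ratio, here is a minimal letterbox-style sketch using plain OpenCV; the function name, target size, and padding color are only illustrative:

import cv2

def letterbox(frame, new_shape=(640, 640), color=(114, 114, 114)):
    """Resize a BGR frame to new_shape, preserving aspect ratio and padding the rest."""
    h, w = frame.shape[:2]
    r = min(new_shape[0] / h, new_shape[1] / w)          # scale factor
    nh, nw = int(round(h * r)), int(round(w * r))        # resized (unpadded) size
    resized = cv2.resize(frame, (nw, nh), interpolation=cv2.INTER_LINEAR)
    top = (new_shape[0] - nh) // 2
    left = (new_shape[1] - nw) // 2
    padded = cv2.copyMakeBorder(
        resized, top, new_shape[0] - nh - top, left, new_shape[1] - nw - left,
        cv2.BORDER_CONSTANT, value=color,
    )
    return padded

# Usage sketch: resize every camera frame to the model's expected input size before inference
# frame_640 = letterbox(frame, (640, 640))
# results = model(frame_640, device=0, classes=[0])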

Here’s a quick check you can do to ensure ONNX Runtime is using CUDA:

import onnxruntime
print("Available providers:", onnxruntime.get_available_providers())

This command will list the available execution providers. Ensure that 'CUDAExecutionProvider' is listed. If it's not, you may need to reinstall ONNX Runtime with CUDA support. For more detailed guidance, you might find the ONNX Runtime documentation helpful.

MohamedKHALILRouissi commented 1 month ago

Available providers: ['AzureExecutionProvider', 'CPUExecutionProvider']
Loading compiled/yolov8n.onnx for ONNX Runtime inference...
UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names. Available providers: 'AzureExecutionProvider, CPUExecutionProvider'

Tue May 21 11:27:25 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08      CUDA Version: 12.3    |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                Persistence-M  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf          Pwr:Usage/Cap  |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Quadro T2000 with Max-Q ...    On  | 00000000:01:00.0 Off |                  N/A |
| N/A   61C    P8              3W /  20W  |      5MiB / 4096MiB  |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1899    G    /usr/lib/xorg/Xorg                            4MiB  |
+---------------------------------------------------------------------------------------+

I'm not sure what's wrong in my settings?

glenn-jocher commented 1 month ago

Hello! It looks like your ONNX Runtime isn't configured to use the GPU, as it's not listing 'CUDAExecutionProvider' among the available providers. This is likely why you're seeing slower inference times and the warning about the CUDA provider not being available.

To resolve this, you'll need to ensure that ONNX Runtime is installed with CUDA support. You can do this by installing the CUDA-enabled ONNX Runtime package. Here's a quick command to install it:

pip install onnxruntime-gpu

Make sure to uninstall any existing ONNX Runtime installations before doing this to avoid conflicts. After installation, you can verify that CUDA is enabled by checking the available providers again:

import onnxruntime
print(onnxruntime.get_available_providers())

You should see 'CUDAExecutionProvider' in the list. If the issue persists, ensure that your NVIDIA drivers and CUDA are correctly installed and up to date, as indicated by your nvidia-smi output. Hope this helps! πŸš€
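
As an additional check (a hedged sketch; the model path is illustrative), you can create an ONNX Runtime session directly and confirm which provider it actually selected:

import onnxruntime as ort

# Request the CUDA provider first, falling back to CPU if it is unavailable
session = ort.InferenceSession(
    "yolov8n.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

# The first entry shows which provider the session will actually use
print("Session providers:", session.get_providers())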

github-actions[bot] commented 1 week ago

πŸ‘‹ Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

For additional resources and information, please see the links below:

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO πŸš€ and Vision AI ⭐