ultralytics / ultralytics

Ultralytics YOLO11 πŸš€
https://docs.ultralytics.com
GNU Affero General Public License v3.0

Getting this error when using Yolov11 #19059

Open hariv0 opened 6 days ago

hariv0 commented 6 days ago

Search before asking

Ultralytics YOLO Component

No response

Bug

Getting this error while using YOLO11:

ERROR - CUDA error: an illegal instruction was encountered CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1 Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
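As the error message itself suggests, re-running with synchronous kernel launches makes the reported stack trace point at the call that actually failed. A minimal sketch, assuming a hypothetical `app.py` entry point for the service:

```shell
# Force synchronous CUDA kernel launches so the Python stack trace
# points at the kernel that actually failed, not a later API call.
export CUDA_LAUNCH_BLOCKING=1
# Device-side assertions require a PyTorch build compiled with
# TORCH_USE_CUDA_DSA; setting the variable alone is harmless otherwise.
export TORCH_USE_CUDA_DSA=1
# python app.py   # hypothetical entry point for the FastAPI service
```

With launches serialized this way, throughput drops, so it is a debugging setting, not a production fix.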

Environment

Environment Information :

Ultralytics 8.3.68 πŸš€ Python-3.12.8 torch-2.5.1+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB MIG 2g.10gb, 9984MiB)

Minimal Reproducible Example

Happening in cases of high load (high parallel inference requests).

Additional

No response

Are you willing to submit a PR?

UltralyticsAssistant commented 6 days ago

πŸ‘‹ Hello @hariv0, thank you for bringing this to our attention πŸš€! We recommend checking out the Docs for guidance, where you can explore examples on Python and CLI usage. These resources may help identify potential issues or solutions.

It looks like you are encountering a πŸ› bug. To help us diagnose and resolve this issue, could you please provide a minimum reproducible example (MRE)? This will greatly assist our team in debugging the problem effectively.

In the meantime, ensure you’re using the latest version of the ultralytics package. You can upgrade it using the following command:

pip install -U ultralytics

Also, verify that your environment meets the required specifications by reviewing the dependencies in our requirements file. The required Python version is 3.8 or higher, and PyTorch version is 1.8 or above.

For real-time community support, you can join our Discord channel 🎧. For detailed questions and discussions, check out Discourse or share your insights on our Subreddit.

Environments

You can experiment or test in recommended verified environments which come pre-configured with dependencies:

Status

Ultralytics CI

This badge reflects the current status of all Ultralytics CI tests, which validate compatibility across macOS, Windows, and Ubuntu on a 24-hour basis and with every commit.

This response is automated πŸ€–, but an Ultralytics engineer will review your issue and assist as soon as possible. Thank you for your patience and for using Ultralytics! πŸš€

Y-T-G commented 6 days ago

The provided information is incomplete. You need to provide the full error and stack trace, the code/command used, and the full environment details output by yolo checks.

hariv0 commented 5 days ago

I am using a YOLO11 model in a Python FastAPI service as a singleton object. The error only happens sometimes under high load and does not occur at other times for the same or higher loads. This is the yolo checks output:

[image attachment]

I am using predict with a YOLO11 TensorRT model for object detection.

glenn-jocher commented 3 days ago

@hariv0 it appears that running a singleton model in high-parallel FastAPI scenarios may lead to intermittent CUDA issues; please try serializing predictions (e.g., via a threading lock) or using separate model instances per request, and enable CUDA_LAUNCH_BLOCKING for more targeted diagnostics.
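The locking suggestion can be sketched as a thin wrapper that serializes access to one shared model. `DummyModel` below is a stand-in for the loaded YOLO engine (the pattern is independent of the actual model), so this is a sketch of the idea, not a drop-in fix:

```python
import threading


class DummyModel:
    """Stand-in for the loaded YOLO engine; assumed not thread-safe."""

    def predict(self, img):
        return [f"detections for {img}"]


class SerializedPredictor:
    """Wraps a shared model so only one request runs inference at a time."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()  # serializes access to the CUDA context

    def predict(self, img):
        with self._lock:  # only one thread inside the model at any moment
            return self._model.predict(img)


predictor = SerializedPredictor(DummyModel())
print(predictor.predict("frame0"))  # ['detections for frame0']
```

In a FastAPI app the `SerializedPredictor` would hold the singleton engine; requests queue on the lock instead of hitting the GPU concurrently, trading some latency under load for stability.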

hariv0 commented 3 days ago

@glenn-jocher Deploying separate models per request is not feasible for us. Is there any other way we can get around this error? This is the code we have used:

import os

import cv2

from ultralytics import YOLO

class Yolo:
    def __init__(self, model: str = "yolo11c", engine_model: str = "yolo11c.engine"):
        """
        Initializes a YOLO object detection model.
        If the engine model does not exist, it exports it first.
        """
        if not os.path.exists(engine_model):
            print(f"{engine_model} not found. Exporting model first...")
            self.ExportModel(model)  # Export before loading

        # Now load the exported TensorRT engine
        self.Model = YOLO(engine_model, task="detect")

    def ExportModel(self, model_path: str):
        """
        Exports the model to TensorRT engine format.
        """
        print(f"Exporting {model_path} to TensorRT engine...")
        Pt_Model = YOLO(model_path)
        Pt_Model.export(format="engine", device=0)  # Creates the '.engine' file

    def Predict(self, img, conf: float = 0.5):
        """
        Uses the preloaded model to perform prediction.
        """
        # stream=True returns a generator; materialize it into a list
        results = list(self.Model.predict(img, conf=conf, device=0, stream=True))
        labels, probs, boxes = [], [], []
        if results:  # an empty list is falsy; `is not None` would never catch it
            for box in results[0]:  # iterate per-detection Results objects
                result = box.cpu().boxes
                l = box.names[int(result.cls[0])]
                p = round(result.conf.tolist()[0] * 100, 2)
                b = result.data.tolist()[0][:-2]  # drop confidence and class columns
                labels.append(l)
                probs.append(p)
                boxes.append(b)
        return boxes, labels, probs

# if this script is the main script
if __name__ == "__main__":
    # Predict is a method of the Yolo wrapper, not of a raw YOLO model
    model = Yolo(model="/workspaces/ObjectDetectionPytorch/ObjectDetection/Models/Yolov9/yolov9c.pt")
    dirPath = r"E:\svnwc\__GitLabMaintainedProjects__\ObjectDetectionPytorch\Tests\TestImages"
    for filename in os.listdir(dirPath):
        filpath = os.path.join(dirPath, filename)
        print(f"Image name: {filename}")
        boxes, labels, scores = model.Predict(cv2.imread(filpath)[:, :, ::-1])  # BGR -> RGB
        print(f"Result: {list(zip(labels, scores))}")
        break

obj_detect = Yolo(model=_model_dir, engine_model=_engine_model_dir)  # for YOLO11

box, labels, scores = obj_detect.Predict(img, conf=0.5)
Y-T-G commented 3 days ago

You should use Triton Inference Server or LitServe
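Dedicated inference servers like these keep GPU work on a single worker and feed it from a request queue, so the model never sees concurrent callers. The core idea can be sketched in plain Python; `DummyModel` is a placeholder for the real engine, and this is only an illustration of the pattern those servers implement:

```python
import queue
import threading


class DummyModel:
    """Placeholder for the real TensorRT engine."""

    def predict(self, img):
        return f"result:{img}"


class WorkerInference:
    """A single worker thread consumes requests, so the model is never
    entered by two threads at once."""

    def __init__(self, model):
        self._model = model
        self._jobs = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def _loop(self):
        while True:
            img, done = self._jobs.get()  # blocks until a request arrives
            done["result"] = self._model.predict(img)
            done["event"].set()  # wake up the waiting caller

    def predict(self, img):
        done = {"event": threading.Event()}
        self._jobs.put((img, done))
        done["event"].wait()  # block this caller until the worker finishes
        return done["result"]


server = WorkerInference(DummyModel())
print(server.predict("frame0"))  # result:frame0
```

Triton and LitServe add batching, scheduling, and multi-model management on top of this, which is why they handle high parallel load better than a bare singleton behind FastAPI.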