Open · hariv0 opened 6 days ago
Hello @hariv0, thank you for bringing this to our attention! We recommend checking out the Docs for guidance, where you can explore examples of Python and CLI usage. These resources may help identify potential issues or solutions.
It looks like you are encountering a bug. To help us diagnose and resolve this issue, could you please provide a minimum reproducible example (MRE)? This will greatly assist our team in debugging the problem effectively.
In the meantime, ensure you're using the latest version of the ultralytics package. You can upgrade it with the following command:
pip install -U ultralytics
Also, verify that your environment meets the required specifications by reviewing the dependencies in our requirements file. The required Python version is 3.8 or higher, and the PyTorch version is 1.8 or above.
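As a quick, illustrative way to confirm these requirements locally (this snippet only prints version information and is not part of the Ultralytics package):

```python
# Quick sanity check of the requirements above (illustrative snippet)
import sys

import torch
import ultralytics

print("Python:", sys.version.split()[0])            # should be 3.8+
print("PyTorch:", torch.__version__)                # should be 1.8+
print("Ultralytics:", ultralytics.__version__)
print("CUDA available:", torch.cuda.is_available())
```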
For real-time community support, you can join our Discord channel. For detailed questions and discussions, check out Discourse or share your insights on our Subreddit.
You can also experiment or test in our recommended verified environments, which come pre-configured with dependencies.
The Ultralytics CI badge reflects the current status of all Ultralytics CI tests, which validate compatibility across macOS, Windows, and Ubuntu every 24 hours and on every commit.
This response is automated, but an Ultralytics engineer will review your issue and assist as soon as possible. Thank you for your patience and for using Ultralytics!
The provided information is incomplete. Please provide the full error and stack trace, the code or command used, and the full environment details output by yolo checks.
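For reference, the same environment report can also be generated from Python (assuming the checks helper exported by the ultralytics package):

```python
# Should print the same environment summary as the `yolo checks` CLI command
from ultralytics import checks

checks()
```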
I am using a YOLOv11 model in a Python FastAPI service as a singleton object, and the error only happens sometimes under high load and does not happen at other times for the same or higher loads. This is the yolo checks output:
I am using predict with a YOLOv11 TensorRT model for object detection.
@hariv0 it appears that running a singleton model in highly parallel FastAPI scenarios may lead to intermittent CUDA issues; please try serializing predictions (e.g., via a threading lock) or using separate model instances per request, and set CUDA_LAUNCH_BLOCKING=1 for more targeted diagnostics.
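A minimal sketch of that lock-based approach, assuming a single shared engine and an illustrative engine path (not the actual service code):

```python
# Illustrative sketch: serialize GPU access to one shared model with a lock.
import os

# Set before CUDA is initialized so kernel launches are synchronous and the
# stack trace points at the real failing call (debugging only; slows inference).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import threading

from ultralytics import YOLO

model = YOLO("yolo11n.engine", task="detect")  # hypothetical engine path
infer_lock = threading.Lock()


def safe_predict(img, conf=0.5):
    # Only one thread at a time runs inference on the shared TensorRT engine
    with infer_lock:
        return model.predict(img, conf=conf, device=0)
```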
@glenn-jocher Deploying separate models per request is not feasible for us. Are there any other ways we can get around this error? This is the code we have used:
import os

import cv2
from ultralytics import YOLO


class Yolo:
    def __init__(self, model: str = "yolo11c", engine_model: str = "yolo11c.engine"):
        """
        Initializes a YOLO object detection model.
        If the engine model does not exist, it exports it first.
        """
        if not os.path.exists(engine_model):
            print(f"{engine_model} not found. Exporting model first...")
            self.ExportModel(model)  # Export before loading
        # Now load the exported TensorRT engine
        self.Model = YOLO(engine_model, task="detect")

    def ExportModel(self, model_path: str):
        """
        Exports the model to TensorRT engine format.
        """
        print(f"Exporting {model_path} to TensorRT engine...")
        Pt_Model = YOLO(model_path)
        Pt_Model.export(format="engine", device=0)  # Writes '<model>.engine' next to the .pt

    def Predict(self, img, conf: float = 0.5):
        """
        Uses the preloaded model to perform prediction.
        """
        results = list(self.Model.predict(img, conf=conf, device=0, stream=True))
        labels, probs, boxes = [], [], []
        if results:  # empty list if nothing was returned
            for box in results[0]:  # one single-detection Results object per detected box
                result = box.cpu().boxes
                l = box.names[int(result.cls[0])]
                p = round(result.conf.tolist()[0] * 100, 2)
                b = result.data.tolist()[0][:-2]  # keep x1, y1, x2, y2; drop conf and cls
                labels.append(l)
                probs.append(p)
                boxes.append(b)
        return boxes, labels, probs


# if this script is the main script
if __name__ == "__main__":
    model = Yolo(
        model="/workspaces/ObjectDetectionPytorch/ObjectDetection/Models/Yolov9/yolov9c.pt",
        engine_model="/workspaces/ObjectDetectionPytorch/ObjectDetection/Models/Yolov9/yolov9c.engine",
    )
    dirPath = r"E:\svnwc\__GitLabMaintainedProjects__\ObjectDetectionPytorch\Tests\TestImages"
    for filename in os.listdir(dirPath):
        filepath = os.path.join(dirPath, filename)
        print(f"Image name: {filename}")
        boxes, labels, scores = model.Predict(cv2.imread(filepath)[:, :, ::-1])  # BGR -> RGB
        print(f"Result: {list(zip(labels, scores))}")
        break

# In the FastAPI service the exported engine is loaded once and shared as a singleton:
obj_detect = Yolo(model=_model_dir, engine_model=_engine_model_dir)  # for yolov11
box, labels, scores = obj_detect.Predict(img, conf=0.5)
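As a sketch of how the lock-based workaround could be applied around this wrapper inside FastAPI (route, field, and variable names are illustrative assumptions, not the actual service code), an asyncio lock plus a thread offload serializes GPU access without blocking the event loop:

```python
# Illustrative FastAPI wiring: one shared Yolo instance (class above), GPU access serialized.
import asyncio

import cv2
import numpy as np
from fastapi import FastAPI, UploadFile

app = FastAPI()
detector = Yolo(model="yolo11n.pt", engine_model="yolo11n.engine")  # hypothetical paths
gpu_lock = asyncio.Lock()


@app.post("/detect")
async def detect(file: UploadFile):
    # Decode the uploaded image into a BGR numpy array
    data = np.frombuffer(await file.read(), dtype=np.uint8)
    img = cv2.imdecode(data, cv2.IMREAD_COLOR)
    # Only one request at a time touches the TensorRT engine
    async with gpu_lock:
        boxes, labels, probs = await asyncio.to_thread(detector.Predict, img, 0.5)
    return {"boxes": boxes, "labels": labels, "probs": probs}
```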
You should use Triton Inference Server or LitServe for this kind of high-load serving.
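For example, a minimal LitServe sketch could look like the following (assuming LitServe's LitAPI/LitServer interface; the engine path, request format, and response fields are illustrative):

```python
# Minimal LitServe sketch: each worker owns its own model and processes queued requests.
import litserve as ls
from ultralytics import YOLO


class YOLOLitAPI(ls.LitAPI):
    def setup(self, device):
        # Load the TensorRT engine once per worker (hypothetical path)
        self.model = YOLO("yolo11n.engine", task="detect")

    def decode_request(self, request):
        # Assume the client sends a path or URL to the image
        return request["image"]

    def predict(self, x):
        return self.model.predict(x, conf=0.5, device=0)

    def encode_response(self, output):
        r = output[0]
        return {
            "boxes": r.boxes.xyxy.tolist(),
            "classes": r.boxes.cls.tolist(),
            "confs": r.boxes.conf.tolist(),
        }


if __name__ == "__main__":
    server = ls.LitServer(YOLOLitAPI(), accelerator="gpu")
    server.run(port=8000)
```

The idea is that each worker owns its own model copy and pulls requests from a queue, which avoids concurrent calls into a single TensorRT context.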
Search before asking
Ultralytics YOLO Component
No response
Bug
Getting this error while using YOLOv11:
ERROR - CUDA error: an illegal instruction was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Environment
Environment Information:
Ultralytics 8.3.68 Python-3.12.8 torch-2.5.1+cu124 CUDA:0 (NVIDIA A100-SXM4-40GB MIG 2g.10gb, 9984MiB)
Minimal Reproducible Example
Happening in cases of high load (highly parallel inference requests).
Additional
No response
Are you willing to submit a PR?