Closed M3nxudo closed 7 months ago
The line model_prediction = self.model.predict(input_image, iou=0.5, conf=self.conf) looks legit.
You can pass your image to predict() as a NumPy array in RGB channel order (not BGR, which is how OpenCV reads it).
It could be that you are simply getting no detections (maybe your confThreshold is too high?).
If there are no boxes, you would get an index error at bboxes_xyxy[0], which is expected.
A somewhat related issue where you can find a code snippet of printing all detections: https://github.com/Deci-AI/super-gradients/issues/1818
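To make the "no boxes" case explicit, here is a minimal sketch of a guard around the first-box access. The helper name first_box_or_none is made up for illustration and is not part of super-gradients; it only assumes that bboxes_xyxy behaves like a sequence of (x1, y1, x2, y2) rows, which holds for both a NumPy array of shape (N, 4) and a plain list.

```python
def first_box_or_none(bboxes_xyxy):
    """Return the first box as a tuple of ints, or None when nothing was detected.

    Hypothetical helper: it only relies on len() and row indexing, so it works
    on an (N, 4) NumPy array as well as on a list of 4-element rows.
    """
    if len(bboxes_xyxy) == 0:
        # Empty prediction: indexing [0] here would raise IndexError.
        return None
    x1, y1, x2, y2 = (int(v) for v in bboxes_xyxy[0])
    return x1, y1, x2, y2
```

Called as first_box_or_none(prediction.bboxes_xyxy), this returns None instead of raising when the prediction is empty, so the caller can decide how to handle a frame without detections.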
I double-checked that I'm feeding the correct image type, tested with a lower confidence (went from 0.7 down to 0.5), and even switched the loaded model to "yolo_nas_m" to make sure it wasn't the model itself, but I haven't been able to get any inference working. I'm checking the size of prediction.bboxes_xyxy before accessing it so the program doesn't crash, but every one of them is empty, so I'm still not sure what I'm doing wrong. I also checked issue #1818 as you suggested, but that one works with image files, not with images in memory.
Update: I seem to have narrowed down the rogue line of code that is causing all my headaches. If I instantiate the model on the CPU:
model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
and then run inference and access the predictions, everything works as expected.
If I instead instantiate the model with GPU acceleration (with the following modification):
model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco").cuda()
I always get empty predictions.
Additional info:
Any help on this matter would be appreciated @BloodAxe
I don't see how this can be happening. Please double-check everything on your end. If you put this code into Colab with a GPU and run it, you will get predictions as expected.
import cv2
import super_gradients
model_name = "yolo_nas_l"
model = super_gradients.training.models.get(model_name, pretrained_weights="coco").cuda()
image = cv2.imread("path/to/image.jpg")  # placeholder path: replace with any test image (OpenCV loads it as BGR)
image = image[:, :, ::-1]  # BGR -> RGB
model.predict(image).show()
Please share additional details: which GPU you have and which OS/Python version you are using. If possible, provide minimal yet complete code that reproduces your issue.
I'm providing some code that opens a VideoCapture on the webcam, extracts individual frames, and tries to perform inference on both the CPU and the GPU for comparison:
from super_gradients.training import models
from super_gradients.common.object_names import Models
import cv2
cap = cv2.VideoCapture(0)
cap.set(3, 640)
cap.set(4, 480)
modelcpu = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
modelgpu = models.get(Models.YOLO_NAS_L, pretrained_weights="coco").cuda()
cv2.namedWindow("CPU inference")
cv2.namedWindow("GPU inference")
cv2.moveWindow("CPU inference", 0, 20)
cv2.moveWindow("GPU inference", 700, 20)
# Acquisition loop
while True:
    success, img = cap.read()
    if not success:
        break
    output_image_cpu = img.copy()
    output_image_gpu = img.copy()
    resultscpu = modelcpu.predict(img)
    resultsgpu = modelgpu.predict(img)
    # CPU results
    boxes_cpu = resultscpu.prediction.bboxes_xyxy
    label_names_cpu = resultscpu.class_names
    labels_cpu = resultscpu.prediction.labels
    confidence_cpu = resultscpu.prediction.confidence
    # GPU results
    boxes_gpu = resultsgpu.prediction.bboxes_xyxy
    label_names_gpu = resultsgpu.class_names
    labels_gpu = resultsgpu.prediction.labels
    confidence_gpu = resultsgpu.prediction.confidence
    # CPU result filtering loop
    if labels_cpu.size < 1:
        text = "No detections on CPU"
        print(text)
        output_image_cpu = cv2.putText(output_image_cpu, text, (260, 20), cv2.FONT_HERSHEY_SIMPLEX,
                                       0.5, (0, 0, 255), 2)
    else:
        count = -1
        for lab in labels_cpu:
            count += 1
            # Extract info (int() keeps the class lookup valid if labels are stored as floats)
            local_label = label_names_cpu[int(labels_cpu[count])]
            local_conf = confidence_cpu[count]
            label = f"{local_label} ({local_conf:.2f})"
            x1 = boxes_cpu[count, 0]
            y1 = boxes_cpu[count, 1]
            x2 = boxes_cpu[count, 2]
            y2 = boxes_cpu[count, 3]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)  # convert to int values
            # Paint bboxes
            output_image_cpu = cv2.rectangle(output_image_cpu, (x1, y1), (x2, y2), (255, 0, 0), 2)
            output_image_cpu = cv2.putText(output_image_cpu, label, (x1 - 10, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX,
                                           0.5, (255, 0, 0), 2)
    # GPU result filtering loop
    if labels_gpu.size < 1:
        text = "No detections on GPU"
        print(text)
        output_image_gpu = cv2.putText(output_image_gpu, text, (260, 20), cv2.FONT_HERSHEY_SIMPLEX,
                                       0.5, (0, 0, 255), 2)
    else:
        count = -1
        for lab in labels_gpu:
            count += 1
            # Extract info (int() keeps the class lookup valid if labels are stored as floats)
            local_label = label_names_gpu[int(labels_gpu[count])]
            local_conf = confidence_gpu[count]
            label = f"{local_label} ({local_conf:.2f})"
            x1 = boxes_gpu[count, 0]
            y1 = boxes_gpu[count, 1]
            x2 = boxes_gpu[count, 2]
            y2 = boxes_gpu[count, 3]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)  # convert to int values
            # Paint bboxes
            output_image_gpu = cv2.rectangle(output_image_gpu, (x1, y1), (x2, y2), (255, 0, 0), 2)
            output_image_gpu = cv2.putText(output_image_gpu, label, (x1 - 10, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX,
                                           0.5, (255, 0, 0), 2)
    cv2.imshow("CPU inference", output_image_cpu)
    cv2.imshow("GPU inference", output_image_gpu)
    if cv2.waitKey(1) == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
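As a side note, the CPU and GPU branches of the snippet repeat the same extraction logic, so it could be factored into one helper. The following is only a sketch: format_detections is a hypothetical name, and it assumes boxes, labels, and confidences are parallel sequences (as bboxes_xyxy, labels, and confidence appear to be in the snippet) and that class_names is indexable by class id.

```python
def format_detections(boxes, labels, confidences, class_names):
    """Turn parallel detection sequences into (text, (x1, y1, x2, y2)) pairs.

    Hypothetical helper, not part of super-gradients. Coordinates are
    truncated to int so they can be passed to cv2.rectangle / cv2.putText.
    """
    formatted = []
    for box, label_id, conf in zip(boxes, labels, confidences):
        x1, y1, x2, y2 = (int(v) for v in box)
        # int() guards against labels stored as floats
        text = f"{class_names[int(label_id)]} ({float(conf):.2f})"
        formatted.append((text, (x1, y1, x2, y2)))
    return formatted
```

With a helper like this, each branch reduces to a loop over format_detections(boxes_cpu, labels_cpu, confidence_cpu, label_names_cpu) (or the GPU equivalents), and an empty prediction simply produces an empty list.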
Additional hardware information:
Package information (using a conda environment):
Hope that is enough to reproduce the behaviour @BloodAxe
EDIT: additional software info: OS: Windows 10 Enterprise LTSC; Python version: 3.10.13
Thanks for the detailed snippet to reproduce. Unfortunately I have not been able to reproduce the issue yet. On my 4090 it works fine and predictions on CPU & GPU are identical. I will try later on a 1070, which I happen to have, and will let you know how it goes.
Update: Code works well on both 4090 and 1070 🤷‍♂️
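For anyone who wants to run the same CPU-vs-GPU check programmatically rather than by eye, a rough sketch follows. detections_match is a hypothetical helper; it assumes both runs return boxes in the same order, which is not guaranteed in general, and the 1-pixel tolerance is an arbitrary choice.

```python
import math


def detections_match(boxes_a, boxes_b, tol=1.0):
    """Return True when both runs found the same number of boxes and every
    coordinate differs by at most `tol` pixels.

    Hypothetical helper: boxes are compared pairwise in the order given,
    so it assumes both runs list their detections in the same order.
    """
    if len(boxes_a) != len(boxes_b):
        return False
    return all(
        math.isclose(float(a), float(b), abs_tol=tol)
        for row_a, row_b in zip(boxes_a, boxes_b)
        for a, b in zip(row_a, row_b)
    )
```

In the reproduction snippet this could be called as detections_match(boxes_cpu, boxes_gpu); the failure described in this thread (detections on CPU, none on GPU) would show up as False because the box counts differ.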
OK, this is probably where it is all coming from: https://github.com/pytorch/pytorch/issues/58123
On our end we will introduce an fp16 argument that you can use to disable fp16 inference mode.
We have a workaround PR to disable the mixed precision used in model.predict(), which hopefully should fix your issue. This will land in the next release of SG.
But if you are really eager to try it out, you can install the development version of SG from the feature branch using this command:
pip install -U git+https://github.com/Deci-AI/super-gradients@feature/SG-000-introduce-fp16-flag-to-predict
By changing the GPU prediction line of the snippet (resultsgpu = modelgpu.predict(img)) to:
resultsgpu = modelgpu.predict(img, fp16=False)
everything is working as expected now (detections on the GPU are finally up and running).
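If the same code has to run on machines with and without this problem, one option is to retry in full precision only when the first attempt comes back empty. This is just a sketch under assumptions: predict_with_fp16_fallback is a hypothetical wrapper, and it relies on predict() accepting the fp16 keyword, which per this thread is only available in the feature branch or a later SG release.

```python
def predict_with_fp16_fallback(model, image, **kwargs):
    """Call model.predict(); if no boxes come back, retry with fp16=False.

    Hypothetical wrapper. Assumes model.predict accepts an `fp16` keyword
    (feature branch / later release per this thread) and that the result
    exposes .prediction.bboxes_xyxy, as in the snippet above.
    """
    result = model.predict(image, **kwargs)
    if len(result.prediction.bboxes_xyxy) > 0:
        return result
    # Empty result: could be a genuinely empty frame or the fp16 problem,
    # so pay for one extra full-precision pass to find out.
    return model.predict(image, fp16=False, **kwargs)
```

The trade-off is that frames that really contain no objects are inferred twice; on a GPU known to be affected, passing fp16=False directly, as above, is simpler.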
Thanks for the solution, awesome to see the Deci team continuously improving the super-gradients repo.
💡 Your Question
Is it possible to perform inference on an image already in memory (obtained through OpenCV VideoCapture)? I have tried to do so but get no detections when performing inference in the following way:
Defining a class that has the model info and a method for performing the detection and returns the predictions
In doing so and debugging the code, I get an error at the line print (prediction.bboxes_xyxy[0]) with the following text:
File "D:\source\repos\libMuinen_UI\nas_interface.py", line 26, in detect
print (prediction.bboxes_xyxy[0])
IndexError: index 0 is out of bounds for axis 0 with size 0
If I understand correctly, this means that the prediction results are empty and cannot be accessed, but from the input image I know I should be getting at least some detections. I would love to know if I'm using the predict method incorrectly, or to get any tips to make my code work.
Versions
No response