Hi @mriamnobody
dummy_input in the torch.onnx.export command is a tensor of the same size as your model's expected input. It's used to trace the operations performed on the input, and that trace is then used to construct the ONNX graph.
This dummy input doesn't contain any meaningful data; it only mimics the shape and data type of the actual inputs your model expects.
In the case of yolo_nas, the expected input is (3, 640, 640) images:
import torch
from super_gradients.common.object_names import Models
from super_gradients.training import models
model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
model.prep_model_for_conversion(input_size=(640, 640))
onnx_input = torch.zeros((3, 640, 640))
torch.onnx.export(model, onnx_input, f="yolo_nas_l.onnx")
Concerning your first question, we only support single stream prediction out of the box, but you can write your own script to support it:
import cv2
from super_gradients.common.object_names import Models
from super_gradients.training import models

# Note that currently only YoloX, PPYoloE and YOLO-NAS are supported.
model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")

# Here I am reading from the same camera twice, but you can change the sources.
cap1 = cv2.VideoCapture(cv2.CAP_ANY)
cap2 = cv2.VideoCapture(cv2.CAP_ANY)

while True:
    # Read frames from the cameras
    ret1, frame1 = cap1.read()
    ret2, frame2 = cap2.read()

    # Check if frames are successfully captured
    if not ret1 or not ret2:
        break

    # Running predict on the 2 sources at the same time will improve processing speed.
    predictions = model.predict([frame1, frame2])
    for i, predicted_frame in enumerate(predictions):
        cv2.imshow(f"Camera {i}", predicted_frame.draw())

    # Check for key press
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

# Release the video capture objects and close the windows
cap1.release()
cap2.release()
cv2.destroyAllWindows()
@mriamnobody does that answer your question? :)
Now it makes some sense to me. Thank you @Louis-Dupont for your guidance and help.
@Louis-Dupont, wow, the script is so intuitive, clean, and compact yet powerful. Does the line predictions = model.predict([frame1, frame2]) perform detection on the two streams concurrently? If I wanted to, could the model handle more than 2 IP cameras concurrently? And if multiple IP cameras are used, will it affect the quality/precision/accuracy of the detections?
model.predict runs on batches of up to 32 images at once, so it is faster to run model.predict([frame1, frame2]) than to run model.predict([frame1]) and then model.predict([frame2]) sequentially, especially on GPU.
That being said, running the 2 predicts together means that the fps of both streams will be the same.
The alternative would be to run 1 process per stream, in parallel. I think this would decrease the overall fps, but it might be worth trying.
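For reference, a rough sketch of the one-process-per-stream alternative (the stream URLs and window names below are placeholders; note that each process loads its own copy of the model, which also multiplies memory use):

import cv2
from multiprocessing import Process
from super_gradients.common.object_names import Models
from super_gradients.training import models

def run_stream(source, window_name):
    # Each process loads its own model and handles a single stream independently.
    model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
    cap = cv2.VideoCapture(source)
    while True:
        ret, frame = cap.read()
        if not ret:
            break
        prediction = model.predict([frame])[0]
        cv2.imshow(window_name, prediction.draw())
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    # Placeholder sources; replace with your actual camera URLs or device indices.
    processes = [
        Process(target=run_stream, args=("rtsp://camera1/stream", "Camera 1")),
        Process(target=run_stream, args=("rtsp://camera2/stream", "Camera 2")),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()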
@Louis-Dupont, the script ran successfully.
However, when I ran the ONNX export script, I got the following error:
Traceback (most recent call last):
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/test.py", line 143, in <module>
torch.onnx.export(model, onnx_input, f="yolo_nas_l.onnx")
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/onnx/utils.py", line 504, in export
_export(
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/onnx/utils.py", line 1529, in _export
graph, params_dict, torch_out = _model_to_graph(
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/onnx/utils.py", line 1111, in _model_to_graph
graph, params, torch_out, module = _create_jit_graph(model, args)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/onnx/utils.py", line 987, in _create_jit_graph
graph, torch_out = _trace_and_get_graph_from_model(model, args)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/onnx/utils.py", line 891, in _trace_and_get_graph_from_model
trace_graph, torch_out, inputs_states = torch.jit._get_trace_graph(
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/jit/_trace.py", line 1184, in _get_trace_graph
outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/jit/_trace.py", line 127, in forward
graph, out = torch._C._create_graph_by_tracing(
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/jit/_trace.py", line 118, in wrapper
outs.append(self.inner(*trace_inputs))
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/super_gradients/training/models/detection_models/customizable_detector.py", line 84, in forward
x = self.backbone(x)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/super_gradients/modules/detection_modules.py", line 80, in forward
x = getattr(self, layer)(x)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/super_gradients/training/models/detection_models/yolo_nas/yolo_stages.py", line 138, in forward
return self.conv(x)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/super_gradients/modules/qarepvgg_block.py", line 182, in forward
return self.se(self.nonlinearity(self.post_bn(self.rbr_reparam(inputs))))
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1182, in _slow_forward
result = self.forward(*input, **kwargs)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py", line 138, in forward
self._check_input_dim(input)
File "/mnt/c/Users/rosha/Downloads/Compressed/super-gradients-3.1.1/yolovnass/lib/python3.10/site-packages/torch/nn/modules/batchnorm.py", line 410, in _check_input_dim
raise ValueError("expected 4D input (got {}D input)".format(input.dim()))
ValueError: expected 4D input (got 3D input)
Is the prediction on the frames done in real time? I have observed that the prediction lags far behind the IP camera. For example, if the time on the IP camera is 12:28:30 (HH:MM:SS), the displayed detection frame shows 12:21:11 (HH:MM:SS), roughly 7 minutes behind.
@Louis-Dupont Is there a way to detect only humans in the frame?
The batch size was missing in onnx_input = torch.zeros((3, 640, 640)). As you mentioned, model.predict runs on batches of up to 32 images at once, so I used (32, 3, 640, 640) and the script successfully generated the .onnx file.
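For anyone following along, a minimal sketch of the export with the batch dimension added (the (32, 3, 640, 640) shape is the one mentioned above; a batch size of 1 should also be enough for tracing):

import torch
from super_gradients.common.object_names import Models
from super_gradients.training import models

model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
model.prep_model_for_conversion(input_size=(640, 640))

# 4D dummy input: (batch, channels, height, width)
onnx_input = torch.zeros((32, 3, 640, 640))
torch.onnx.export(model, onnx_input, f="yolo_nas_l.onnx")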
@Louis-Dupont, I have modified your script to add Telegram alerts, error handling, and logging. Please take a look; it may also be useful to anyone who needs to implement Telegram alerts. The only thing left to implement is looking only for humans/persons in the frame. Harpreet on Discord shared a link, https://github.com/Deci-AI/super-gradients/issues/892, which has some related information, but it is too complex for me to understand and implement. The linked issue also mentions: "Regarding filtering of classes - no, currently this is not supported."
I'd be grateful if you could help and look into the issue.
import cv2
import time
import asyncio
import logging
from super_gradients.training import models
from super_gradients.common.object_names import Models
from telegram import Bot
from telegram.error import TelegramError

# Set up logging
logging.basicConfig(filename='app.log',
                    filemode='w',
                    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
                    level=logging.INFO)
logger = logging.getLogger(__name__)

TOKEN = 'YOUR_BOT_TOKEN'
CHAT_ID = 'YOUR_CHAT_ID'
bot = Bot(token=TOKEN)

async def send_message(text):
    try:
        await bot.send_message(chat_id=CHAT_ID, text=text)
    except TelegramError as e:
        logger.error("Failed to send message through Telegram with error: %s", e)

# Create a loop to run the async function in
loop = asyncio.get_event_loop()
loop.run_until_complete(send_message("Detection Program Started"))

def release_resources():
    logger.info("Releasing video capture objects and closing windows")
    cap1.release()
    cap2.release()
    cap3.release()
    cap4.release()
    cap5.release()
    cv2.destroyAllWindows()

start_time = time.time()
logger.info("Script start time: %s", start_time)

try:
    logger.info("Starting to load model")
    model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco").cuda()
    logger.info("Detection Model loaded successfully")
    loop.run_until_complete(send_message("Detection Model loaded successfully"))
except Exception as e:
    logger.error("Detection Model loading failed with error: %s", e)
    loop.run_until_complete(send_message("Model loading failed with error: " + str(e)))

try:
    cap1 = cv2.VideoCapture('camstream1')
    cap2 = cv2.VideoCapture('camstream2')
    cap3 = cv2.VideoCapture('camstream3')
    cap4 = cv2.VideoCapture('camstream4')
    cap5 = cv2.VideoCapture('camstream5')
except Exception as e:
    logger.error("VideoCapture initialization failed with error: %s", e)
    loop.run_until_complete(send_message("VideoCapture initialization failed with error: " + str(e)))

while True:
    try:
        captures = [(cap1, 'cam_name1'), (cap2, 'cam_name2'), (cap3, 'cam_name3'), (cap4, 'cam_name4'), (cap5, 'cam_name5')]
        frames = []

        logger.info("Reading frames from cameras")
        for cap, camera_name in captures:
            ret, frame = cap.read()
            if not ret:
                loop.run_until_complete(send_message(f"Failed to read frames from camera {camera_name}"))
            else:
                frames.append(frame)

        logger.info("Predicting frames")
        predictions = model.predict(frames)
        for i, (predicted_frame, (_, camera_name)) in enumerate(zip(predictions, captures)):
            cv2.imshow(camera_name, predicted_frame.draw())

        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    except Exception as e:
        logger.error("An error occurred: %s", e)
        loop.run_until_complete(send_message("An error occurred: " + str(e)))
        time.sleep(5)  # optional: wait before trying again
        continue

release_resources()

end_time = time.time()
logger.info("Script end time: %s", end_time)
execution_time = end_time - start_time
logger.info("Script executed in: %s seconds", execution_time)
print(f"Script executed in: {execution_time} seconds")
loop.run_until_complete(send_message(f"Script executed in: {execution_time} seconds"))
I have also created and exported the model in .onnx format, but I'm not sure how to use it in the code. @Louis-Dupont @NatanBagrov
Hi @mriamnobody, as Harpreet said, we currently don't support it.
I invite you to have a look at our implementation of the draw() method: https://github.com/Deci-AI/super-gradients/blob/a30fa8fdc623533df785831f7457967066fb2ebe/src/super_gradients/training/models/prediction_results.py#L44-L85
All you need to do is create your own def draw_human(predicted_frame: ImageDetectionPrediction) -> np.ndarray function, which would iterate over predicted_frame.prediction the same way we do, just with a condition that predicted_frame.class_names[class_id] == "human" (make sure this is the exact name).
Hoping this helps.
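To illustrate the idea, here is a rough sketch of such a standalone function, reusing the same helpers the library's draw() uses (the "human" string is the part that still needs to be verified against the model's class_names):

import numpy as np
from super_gradients.training.models.prediction_results import ImageDetectionPrediction
from super_gradients.training.utils.visualization.detection import draw_bbox
from super_gradients.training.utils.visualization.utils import generate_color_mapping

def draw_human(predicted_frame: ImageDetectionPrediction) -> np.ndarray:
    """Draw only the boxes whose class name matches the target class; the original image is not modified."""
    image = predicted_frame.image.copy()
    color_mapping = generate_color_mapping(len(predicted_frame.class_names))
    for pred_i in range(len(predicted_frame.prediction)):
        class_id = int(predicted_frame.prediction.labels[pred_i])
        if predicted_frame.class_names[class_id] != "human":  # verify the exact class name
            continue
        image = draw_bbox(
            image=image,
            title=predicted_frame.class_names[class_id],
            color=color_mapping[class_id],
            box_thickness=2,
            x1=int(predicted_frame.prediction.bboxes_xyxy[pred_i, 0]),
            y1=int(predicted_frame.prediction.bboxes_xyxy[pred_i, 1]),
            x2=int(predicted_frame.prediction.bboxes_xyxy[pred_i, 2]),
            y2=int(predicted_frame.prediction.bboxes_xyxy[pred_i, 3]),
        )
    return image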
@Louis-Dupont, I created a small function, but now nothing is being detected (including humans):
def draw_human(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> np.ndarray:
    """Draw the predicted bboxes on the image for humans only.

    :param box_thickness:   Thickness of bounding boxes.
    :param show_confidence: Whether to show confidence scores on the image.
    :param color_mapping:   List of tuples representing the colors for each class.
                            Default is None, which generates a default color mapping based on the number of class names.
    :return:                Image with predicted bboxes. Note that this does not modify the original image.
    """
    image = self.image.copy()
    color_mapping = color_mapping or generate_color_mapping(len(self.class_names))

    for pred_i in range(len(self.prediction)):
        class_id = int(self.prediction.labels[pred_i])
        class_name = self.class_names[class_id]

        # Skip if detected class is not 'human'
        if class_name.lower() != 'human':
            continue

        score = "" if not show_confidence else str(round(self.prediction.confidence[pred_i], 2))
        image = draw_bbox(
            image=image,
            title=f"{class_name} {score}",
            color=color_mapping[class_id],
            box_thickness=box_thickness,
            x1=int(self.prediction.bboxes_xyxy[pred_i, 0]),
            y1=int(self.prediction.bboxes_xyxy[pred_i, 1]),
            x2=int(self.prediction.bboxes_xyxy[pred_i, 2]),
            y2=int(self.prediction.bboxes_xyxy[pred_i, 3]),
        )
    return image
The edited prediction_results.py is now:
import os
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple, Iterator
from dataclasses import dataclass
import numpy as np
from super_gradients.training.models.predictions import Prediction, DetectionPrediction
from super_gradients.training.utils.media.video import show_video_from_frames, save_video
from super_gradients.training.utils.media.image import show_image, save_image
from super_gradients.training.utils.visualization.utils import generate_color_mapping
from super_gradients.training.utils.visualization.detection import draw_bbox
@dataclass
class ImagePrediction(ABC):
    """Object wrapping an image and a model's prediction.

    :attr image:        Input image
    :attr predictions:  Predictions of the model
    :attr class_names:  List of the class names to predict
    """

    image: np.ndarray
    prediction: Prediction
    class_names: List[str]

    @abstractmethod
    def draw(self, *args, **kwargs) -> np.ndarray:
        """Draw the predictions on the image."""
        pass

    @abstractmethod
    def draw_human(self, *args, **kwargs) -> np.ndarray:
        """Draw the predictions on the image."""
        pass

    @abstractmethod
    def show(self, *args, **kwargs) -> None:
        """Display the predictions on the image."""
        pass

    @abstractmethod
    def save(self, *args, **kwargs) -> None:
        """Save the predictions on the image."""
        pass
@dataclass
class ImageDetectionPrediction(ImagePrediction):
    """Object wrapping an image and a detection model's prediction.

    :attr image:        Input image
    :attr predictions:  Predictions of the model
    :attr class_names:  List of the class names to predict
    """

    image: np.ndarray
    prediction: DetectionPrediction
    class_names: List[str]

    def draw(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> np.ndarray:
        """Draw the predicted bboxes on the image.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        :return:                Image with predicted bboxes. Note that this does not modify the original image.
        """
        image = self.image.copy()
        color_mapping = color_mapping or generate_color_mapping(len(self.class_names))

        for pred_i in range(len(self.prediction)):
            class_id = int(self.prediction.labels[pred_i])
            if self.class_names[class_id] != "human":
                continue

            score = "" if not show_confidence else str(round(self.prediction.confidence[pred_i], 2))
            image = draw_bbox(
                image=image,
                title=f"{self.class_names[class_id]} {score}",
                color=color_mapping[class_id],
                box_thickness=box_thickness,
                x1=int(self.prediction.bboxes_xyxy[pred_i, 0]),
                y1=int(self.prediction.bboxes_xyxy[pred_i, 1]),
                x2=int(self.prediction.bboxes_xyxy[pred_i, 2]),
                y2=int(self.prediction.bboxes_xyxy[pred_i, 3]),
            )
        return image

    def draw_human(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> np.ndarray:
        """Draw the predicted bboxes on the image for humans only.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        :return:                Image with predicted bboxes. Note that this does not modify the original image.
        """
        image = self.image.copy()
        color_mapping = color_mapping or generate_color_mapping(len(self.class_names))

        for pred_i in range(len(self.prediction)):
            class_id = int(self.prediction.labels[pred_i])
            class_name = self.class_names[class_id]

            # Skip if detected class is not 'human'
            if class_name.lower() != 'human':
                continue

            score = "" if not show_confidence else str(round(self.prediction.confidence[pred_i], 2))
            image = draw_bbox(
                image=image,
                title=f"{class_name} {score}",
                color=color_mapping[class_id],
                box_thickness=box_thickness,
                x1=int(self.prediction.bboxes_xyxy[pred_i, 0]),
                y1=int(self.prediction.bboxes_xyxy[pred_i, 1]),
                x2=int(self.prediction.bboxes_xyxy[pred_i, 2]),
                y2=int(self.prediction.bboxes_xyxy[pred_i, 3]),
            )
        return image

    def show(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> None:
        """Display the image with predicted bboxes.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        """
        image = self.draw(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)
        show_image(image)

    def save(self, output_path: str, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> None:
        """Save the predicted bboxes on the images.

        :param output_path:     Path to the output video file.
        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        """
        image = self.draw(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)
        save_image(image=image, path=output_path)
@dataclass
class ImagesPredictions(ABC):
    """Object wrapping the list of image predictions.

    :attr _images_prediction_lst: List of results of the run
    """

    _images_prediction_lst: List[ImagePrediction]

    def __len__(self) -> int:
        return len(self._images_prediction_lst)

    def __getitem__(self, index: int) -> ImagePrediction:
        return self._images_prediction_lst[index]

    def __iter__(self) -> Iterator[ImagePrediction]:
        return iter(self._images_prediction_lst)

    @abstractmethod
    def show(self, *args, **kwargs) -> None:
        """Display the predictions on the images."""
        pass

    @abstractmethod
    def save(self, *args, **kwargs) -> None:
        """Save the predictions on the images."""
        pass
@dataclass
class VideoPredictions(ImagesPredictions, ABC):
    """Object wrapping the list of image predictions as a Video.

    :attr _images_prediction_lst:   List of results of the run
    :att fps:                       Frames per second of the video
    """

    _images_prediction_lst: List[ImagePrediction]
    fps: float

    @abstractmethod
    def show(self, *args, **kwargs) -> None:
        """Display the predictions on the video."""
        pass

    @abstractmethod
    def save(self, *args, **kwargs) -> None:
        """Save the predictions on the video."""
        pass
@dataclass
class ImagesDetectionPrediction(ImagesPredictions):
    """Object wrapping the list of image detection predictions.

    :attr _images_prediction_lst: List of the predictions results
    """

    _images_prediction_lst: List[ImageDetectionPrediction]

    def show(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> None:
        """Display the predicted bboxes on the images.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        """
        for prediction in self._images_prediction_lst:
            prediction.show(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)

    def save(
        self, output_folder: str, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None
    ) -> None:
        """Save the predicted bboxes on the images.

        :param output_folder:   Folder path, where the images will be saved.
        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        """
        if output_folder:
            os.makedirs(output_folder, exist_ok=True)

        for i, prediction in enumerate(self._images_prediction_lst):
            image_output_path = os.path.join(output_folder, f"pred_{i}.jpg")
            prediction.save(output_path=image_output_path, box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)
@dataclass
class VideoDetectionPrediction(VideoPredictions):
    """Object wrapping the list of image detection predictions as a Video.

    :attr _images_prediction_lst:   List of the predictions results
    :att fps:                       Frames per second of the video
    """

    _images_prediction_lst: List[ImageDetectionPrediction]
    fps: int

    def draw(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> List[np.ndarray]:
        """Draw the predicted bboxes on the images.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        :return:                List of images with predicted bboxes. Note that this does not modify the original image.
        """
        frames_with_bbox = [
            result.draw(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping) for result in self._images_prediction_lst
        ]
        return frames_with_bbox

    def draw_human(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> List[np.ndarray]:
        """Draw the predicted bboxes on the images for humans only.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        :return:                List of images with predicted bboxes. Note that this does not modify the original image.
        """
        # Delegate to each frame's draw_human so that only the target class is drawn.
        frames_with_bbox = [
            result.draw_human(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)
            for result in self._images_prediction_lst
        ]
        return frames_with_bbox

    def show(self, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> None:
        """Display the predicted bboxes on the images.

        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        """
        frames = self.draw(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)
        show_video_from_frames(window_name="Detection", frames=frames, fps=self.fps)

    def save(self, output_path: str, box_thickness: int = 2, show_confidence: bool = True, color_mapping: Optional[List[Tuple[int, int, int]]] = None) -> None:
        """Save the predicted bboxes on the images.

        :param output_path:     Path to the output video file.
        :param box_thickness:   Thickness of bounding boxes.
        :param show_confidence: Whether to show confidence scores on the image.
        :param color_mapping:   List of tuples representing the colors for each class.
                                Default is None, which generates a default color mapping based on the number of class names.
        """
        frames = self.draw(box_thickness=box_thickness, show_confidence=show_confidence, color_mapping=color_mapping)
        save_video(output_path=output_path, frames=frames, fps=self.fps)
@Louis-Dupont
Isn't the class name person instead of human? Did you try to debug the function to see the predicted classes?
If you debug the function you will be able to see which classes are detected, and it could help you understand whether the issue is that humans are not detected, or that human is not a class name and it is instead person.
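For example, a quick check along these lines (a rough sketch using the fields shown in prediction_results.py above) would print the raw class names the model predicts:

predictions = model.predict([frame])  # any test image or frame
for image_prediction in predictions:
    for label in image_prediction.prediction.labels:
        # COCO-pretrained models use "person" rather than "human"
        print(image_prediction.class_names[int(label)])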
Closing due to inactivity
I recently found this excellent repository through an article on the internet. I tried to implement what I need but have failed. I have these issues:
1) How can I use multiple IP camera streams as a source?
2) Is there a way to use a pre-trained model in .onnx format?
Please forgive me for my noobiness.
Edit:
One more thing: the command to convert the pre-trained model to .onnx format uses the word dummy_input:
torch.onnx.export(model, dummy_input, "yolo_nas_m.onnx")
What is it?
Thank you.