Closed: drawntogetha closed this issue 2 years ago.
👋 Hello @drawntogetha, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we cannot help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email support@ultralytics.com.
Python>=3.7.0 is required, with all requirements.txt dependencies installed, including PyTorch>=1.7. To get started:
git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.
@drawntogetha 👋 Hello! Thanks for asking about cropping results with YOLOv5 🚀. Cropping bounding box detections can be useful for training classification models on box contents for example. This feature was added in PR https://github.com/ultralytics/yolov5/pull/2827. You can crop detections using either detect.py or YOLOv5 PyTorch Hub:
Crops will be saved under runs/detect/exp/crops, with a directory for each class detected:
python detect.py --save-crop
With PyTorch Hub, crops will be saved under runs/detect/exp/crops if save=True, and also returned as a dictionary with the crops as numpy arrays.
import torch
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # or yolov5m, yolov5l, yolov5x, custom
# Images
img = 'https://ultralytics.com/images/zidane.jpg' # or file, Path, PIL, OpenCV, numpy, list
# Inference
results = model(img)
# Results
crops = results.crop(save=True) # or .show(), .save(), .print(), .pandas(), etc.
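Once the crops dictionary is returned, each entry can be processed directly in Python. Below is a minimal sketch of filtering the returned crops, assuming (as the thread describes) that each entry carries the cropped pixels as a numpy array under an 'im' key; the helper name and the synthetic stand-in data are illustrative, not part of the YOLOv5 API:

```python
import numpy as np

def largest_crop(crops):
    """Return the crop entry covering the largest pixel area.

    Assumes each entry is a dict whose 'im' key holds the cropped
    region as an H x W x C numpy array, per the structure described
    above for results.crop().
    """
    return max(crops, key=lambda c: c['im'].shape[0] * c['im'].shape[1])

# Usage with synthetic stand-in data (no model download needed):
fake_crops = [
    {'label': 'person 0.87', 'im': np.zeros((660, 400, 3), dtype=np.uint8)},
    {'label': 'tie 0.29', 'im': np.zeros((116, 42, 3), dtype=np.uint8)},
]
print(largest_crop(fake_crops)['label'])  # person 0.87
```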
Good luck 🍀 and let us know if you have any other questions!
@glenn-jocher Thank you for the reply! I'll try this and re-open the issue if needed.
Hi again!
@glenn-jocher I have tried to utilise the save_one_box function from plots.py so that I can replace the detected objects with my own image. The problem is that I need to crop and replace the bounding box detections in a video/webcam feed. I've looked into inference with --save-crop, but I can't work out how to use it for real-time replacement of the object.
I believe that I need to define xyxy corners in my image and modify:
crop = im[int(xyxy[0, 1]):int(xyxy[0, 3]), int(xyxy[0, 0]):int(xyxy[0, 2]), ::(1 if BGR else -1)]
so that I am overwriting the detections with my own image.
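For the axis-aligned case (no perspective warp), overwriting the detection region can be done with plain numpy slicing: resize the replacement image to the box size, then assign it into the frame. A minimal sketch, using nearest-neighbour resampling in pure numpy so it needs no OpenCV; the function name and shapes are illustrative:

```python
import numpy as np

def paste_into_box(frame, xyxy, patch):
    """Overwrite frame[y1:y2, x1:x2] with `patch`, resized by
    nearest-neighbour sampling. `xyxy` is (x1, y1, x2, y2) in pixels.
    """
    x1, y1, x2, y2 = (int(v) for v in xyxy)
    h, w = y2 - y1, x2 - x1
    ph, pw = patch.shape[:2]
    # Map each target row/column back to a source row/column.
    rows = np.arange(h) * ph // h
    cols = np.arange(w) * pw // w
    frame[y1:y2, x1:x2] = patch[rows[:, None], cols[None, :]]
    return frame

# Usage: paste a white 100x50 stand-in image over a detection box.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
patch = np.full((100, 50, 3), 255, dtype=np.uint8)
paste_into_box(frame, (100, 50, 300, 250), patch)
```

Per-frame, this would run inside the detection loop with the frame in place of `frame` and the box coordinates taken from each detection.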
@drawntogetha 👋 Hello! Thanks for asking about handling inference results. YOLOv5 🚀 PyTorch Hub models allow for simple model loading and inference in a pure python environment without using detect.py.
This example loads a pretrained YOLOv5s model from PyTorch Hub as model and passes an image for inference. 'yolov5s' is the YOLOv5 'small' model. For details on all available models please see the README. Custom models can also be loaded, including custom trained PyTorch models and their exported variants, i.e. ONNX, TensorRT, TensorFlow, and OpenVINO YOLOv5 models.
import torch
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s') # or yolov5m, yolov5l, yolov5x, etc.
# model = torch.hub.load('ultralytics/yolov5', 'custom', 'path/to/best.pt') # custom trained model
# Images
im = 'https://ultralytics.com/images/zidane.jpg' # or file, Path, URL, PIL, OpenCV, numpy, list
# Inference
results = model(im)
# Results
results.print() # or .show(), .save(), .crop(), .pandas(), etc.
results.xyxy[0] # im predictions (tensor)
results.pandas().xyxy[0] # im predictions (pandas)
# xmin ymin xmax ymax confidence class name
# 0 749.50 43.50 1148.0 704.5 0.874023 0 person
# 2 114.75 195.75 1095.0 708.0 0.624512 0 person
# 3 986.00 304.00 1028.0 420.0 0.286865 27 tie
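Each prediction row above is ordered (xmin, ymin, xmax, ymax, confidence, class), so the four corner points of a box can be read straight off the first four values. A small sketch, assuming only that ordering (the helper name is illustrative):

```python
def box_corners(row):
    """Given one prediction row (xmin, ymin, xmax, ymax, conf, cls),
    return the four corners as integer (x, y) points, clockwise from
    the top-left. Works on any indexable row (tensor, list, pandas row).
    """
    x1, y1, x2, y2 = (int(v) for v in row[:4])
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]

# Row taken from the example output above:
print(box_corners([749.50, 43.50, 1148.0, 704.5, 0.874023, 0]))
# [(749, 43), (1148, 43), (1148, 704), (749, 704)]
```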
See YOLOv5 PyTorch Hub Tutorial for details.
Good luck 🍀 and let us know if you have any other questions!
Still no success :(
I've managed to overwrite --save-crop so that it saves my own image instead of the content of the bboxes. So now if I run detect.py --save-crop, it saves my own image instead of the cropped detection.
I've tried to change save_img with the code from save_one_box, where numpimg would be my own image. I am aware that this is ugly code reuse, but I am desperate to make it work:
if save_img or save_crop or view_img:  # Add bbox to image
    c = int(cls)  # integer class
    label = None if hide_labels else (names[c] if hide_conf else f'{xyxy[0]},{xyxy[1]}')  # {names[c]} {conf:.2f}
    xywh = (xyxy2xywh(torch.tensor(xyxy).view(1, 4)) / gn).view(-1).tolist()  # normalized xywh
    # b = xywh
    # b[:, 2:] = b[:, 2:].max(1)[0].unsqueeze(1)  # attempt rectangle to square
    # b[:, 2:] = b[:, 2:] * gn + 10  # box wh * gain + pad
    # xyxy = xywh2xyxy(b).long()
    xyxy = torch.tensor(xyxy).view(-1, 4)
    b = xyxy2xywh(xyxy)  # boxes
    # square = False
    # if square:
    #     b[:, 2:] = b[:, 2:].max(1)[0].unsqueeze(1)  # attempt rectangle to square
    gain = 1.02
    pad = 10
    b[:, 2:] = b[:, 2:] * gain + pad  # box wh * gain + pad
    clip_coords(xyxy, im.shape)
    crop = im[int(xyxy[0, 1]):int(xyxy[0, 3]), int(xyxy[0, 0]):int(xyxy[0, 2])]
    crop = numpimg
    annotator.box_label(crop, label, color=colors(c, True))  # xyxy to X
This yields me an error:
File "C:\Users\Drawntogetha\Desktop\yolorepository\yolov5\detect.py", line 206, in run
    crop = im[int(xyxy[0]):int(xyxy[1]), int(xyxy[2]):int(xyxy[3])]
ValueError: only one element tensors can be converted to Python scalars
I feel like I'm walking blind and the answer is somewhere right in front of me, but I am not seeing it :)
Just to make myself clear: my aim is to overlay a detected object with an image in the video feed.
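One likely cause of that ValueError: after `torch.tensor(xyxy).view(-1, 4)` the box is a (1, 4) tensor, so `xyxy[0]` is an entire four-element row, and `int()` on a multi-element tensor raises exactly that error. Indexing both dimensions (or flattening first) avoids it. A minimal numpy reproduction of the correct indexing, with illustrative coordinates:

```python
import numpy as np

# After reshaping, the box is a (1, 4) array, so box[0] is a whole
# row of four values -- int(box[0]) is what raises the ValueError.
box = np.array([[100.0, 50.0, 300.0, 250.0]])  # (x1, y1, x2, y2)

# Either index both dimensions ...
x1, y1 = int(box[0, 0]), int(box[0, 1])
x2, y2 = int(box[0, 2]), int(box[0, 3])

# ... or flatten back to four scalars first:
fx1, fy1, fx2, fy2 = (int(v) for v in box.reshape(-1))

frame = np.zeros((480, 640, 3), dtype=np.uint8)
crop = frame[y1:y2, x1:x2]  # note the (y, x) order for numpy images
```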
Or actually, I'm thinking I have to deal with im0 = annotator.result() as the working data.
I've found a Stack Overflow question which seems to deal with a similar problem: https://stackoverflow.com/questions/57262520/replacing-a-solid-green-region-with-another-image-with-opencv
In the code there are hardcoded coordinates of the target bounding box, and when I run it with those it places my image at that fixed location (a static image in the video feed). I can define the corners of my image and make a list of these points.
After that, I want to use cv.getPerspectiveTransform (https://docs.opencv.org/4.x/da/d54/group__imgproc__transform.html#gae66ba39ba2e47dd0750555c7e986ab85) to map my image onto the corners of the bounding box. The problem is: how do I get those corners? I know that they are stored in xyxy, but how do I retrieve each one of them?
Please take a look and tell me if my logic is sane:
im0 = annotator.result()
if view_img:
    pts_src = np.float32([[0, 0], [325, 0], [325, 472], [0, 472]])
    # pts_dst = im0.np([x1:]),[y1:],[x2:],[y2:])
Where pts_dst would be the corners of the YOLO bounding boxes.
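For reference, the matrix that cv.getPerspectiveTransform computes from pts_src and pts_dst can be sketched in pure numpy by solving the standard 8-equation linear system; this shows what the mapping does without needing OpenCV installed. The pts_dst values below are made-up illustrative bounding box corners, not from the thread:

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 homography H mapping four src points to four
    dst points -- the same matrix cv.getPerspectiveTransform returns.
    src/dst are (4, 2) arrays of (x, y) corners, with H[2, 2] fixed at 1.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

# Map the corners of a 325x472 source image onto a detected box:
pts_src = np.float32([[0, 0], [325, 0], [325, 472], [0, 472]])
pts_dst = np.float32([[100, 50], [300, 60], [310, 240], [95, 250]])
H = perspective_matrix(pts_src, pts_dst)

# Applying H to a src corner lands on the matching dst corner:
p = H @ np.array([325.0, 0.0, 1.0])
print(p[:2] / p[2])  # approximately (300, 60)
```

With OpenCV available, cv.warpPerspective(img, H, (w, h)) would then warp the replacement image into the frame using this matrix.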
I did a workaround using this guy's code: https://www.learnpythonwithrune.org/opencv-python-webcam-how-to-track-and-replace-object/. The result is rather ugly, but it accomplishes what I intended.
Search before asking
Question
Hello!
I have trained my own custom detector, and now I would like to put a mask on top of the bounding box around the detected objects during inference. To do so, I have tried to modify detect.py, namely the part about "Process predictions":
This yields an error, as the input arguments don't match: I have no clue what datatype im0 is supposed to be, while my mask is an RGB image.
I have seen that there is a way to crop and store the bounding boxes in utils.plots, so I thought that I could use or modify that function, but my skills limit me from doing that.
Please let me know how I can accomplish this task. I have searched the web, and I think the approach is to:
However, I am getting errors, as I don't quite know how to do this.
Additional
Also, I have tried to accomplish the image overlay in this part of the code (Process predictions, write the results).