Closed · xrstokes closed this issue 3 years ago
Hi, please format your code properly next time; that will make things easier. I guess something like this works for an image. Whether you need `cv2.cvtColor(source, cv2.COLOR_BGR2RGB)` depends on how you read `source` (e.g. `cv2.imread` returns BGR, while PIL returns RGB). I've simplified the loop a bit, to the point where only the bounding box is drawn on `im0s`. This code is not tested:
```python
color = (255, 0, 0)
thickness = 1

t0 = time.time()
r = []

# Run inference
if device.type != 'cpu':
    model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
t1 = time.time()

im0s = cv2.cvtColor(source, cv2.COLOR_BGR2RGB)
img = letterbox(im0s, imgsz, stride=stride)[0]
img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, to 3x416x416
img = np.ascontiguousarray(img)
img = torch.from_numpy(img).to(device)
img = img.float()
img /= 255.0  # 0 - 255 to 0.0 - 1.0
if img.ndimension() == 3:
    img = img.unsqueeze(0)

# Inference
t1 = time_synchronized()
pred = model(img, augment=opt.augment)[0]

# Apply NMS
pred = non_max_suppression(pred, new_conf_thres, new_iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms)
t2 = time_synchronized()

# Process detections
for i, det in enumerate(pred):
    if len(det):
        # Rescale boxes from img_size to im0s size
        det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0s.shape).round()
        for *xyxy, conf, cls in reversed(det):
            c1, c2 = (int(xyxy[0]), int(xyxy[1])), (int(xyxy[2]), int(xyxy[3]))
            im0s = cv2.rectangle(im0s, c1, c2, color, thickness)

cv2.imshow("yikes", im0s)
cv2.waitKey(1)
```
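On the `cv2.cvtColor` point: a quick illustration of the channel order you get from two common readers (`frame.jpg` is a hypothetical path):

```python
import cv2
import numpy as np
from PIL import Image

# OpenCV reads images in BGR channel order
source_bgr = cv2.imread('frame.jpg')

# PIL reads in RGB order; flip the last axis if the rest of the pipeline expects BGR
source_rgb = np.asarray(Image.open('frame.jpg'))
source_bgr = source_rgb[:, :, ::-1]
```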
Thanks for the help. I've got it working with minor tweaks; see below. The problem is that I only gained 25 ms on my NVIDIA Jetson Nano. The first steps (t0) take 90 ms and the actual inference takes another 90 ms, so the total time is 180 ms, or ~5 FPS. I thought the previous 90 ms was spent writing to the SD card, but apparently there is quite some time in prepping the image. Any tips? Is it possible to do away with the numpy steps? I assume that is the speed issue. Thanks in advance, and I'm sorry for not formatting the code; I couldn't find how.
Also, what does this line do: `img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, to 3x416x416`? Because my image size is 640.
```python
def yolo_detect(source, new_conf_thres, new_iou_thres):
    t0 = time.time()
    r = []

    # Run inference
    if device.type != 'cpu':
        model(torch.zeros(1, 3, imgsz, imgsz).to(device).type_as(next(model.parameters())))  # run once
    t1 = time.time()

    im0 = source
    img = letterbox_image(im0, imgsz)
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, to 3x416x416
    img = np.ascontiguousarray(img)
    img = torch.from_numpy(img).to(device)
    img = img.half() if half else img.float()  # uint8 to fp16/32
    img /= 255.0  # 0 - 255 to 0.0 - 1.0
    if img.ndimension() == 3:
        img = img.unsqueeze(0)

    # Inference
    t1 = time_synchronized()
    pred = model(img, augment=opt.augment)[0]

    # Apply NMS
    pred = non_max_suppression(pred, new_conf_thres, new_iou_thres, classes=opt.classes, agnostic=opt.agnostic_nms)
    t2 = time_synchronized()

    # Process detections
    for i, det in enumerate(pred):  # detections per image
        gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh
        imc = im0.copy() if opt.save_crop else im0  # for opt.save_crop
        if len(det):
            # Rescale boxes from img_size to im0 size
            det[:, :4] = scale_coords(img.shape[2:], det[:, :4], im0.shape).round()
            # Write results
            for *xyxy, conf, cls in reversed(det):
                r.append({'name': f'{names[int(cls)]}',
                          'conf': f'{conf:.2f}',
                          'x': int(xyxy[0]),
                          'y': int(xyxy[1]),
                          'w': int(xyxy[2]) - int(xyxy[0]),
                          'h': int(xyxy[3]) - int(xyxy[1]),
                          })

    print(f'Done. ({time.time() - t0:.3f}s)')
    print(f'Done. ({time.time() - t1:.3f}s)')
    return r
```
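For completeness, a hypothetical usage sketch, assuming the globals were initialized by your `yolo_model()` and that frames come from OpenCV (i.e. BGR, matching the `[:, :, ::-1]` flip inside `yolo_detect()`); the weights filename is illustrative:

```python
import cv2

yolo_model('yolov5s.pt')  # hypothetical: sets up the model globals first

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    detections = yolo_detect(frame, 0.25, 0.45)
    print(detections)  # list of dicts: name, conf, x, y, w, h
cap.release()
```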
The comment `3x416x416` was only copy-pasted from detect.py; you can ignore that. `img[:, :, ::-1]` expects your image to be in BGR format and converts it to RGB. `transpose(2, 0, 1)` needs to be done because PyTorch convolutions expect the channels to be the first dimension, so the image shape is transformed from (height, width, channels) to (channels, height, width).
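To make the shapes concrete, a small sketch with a dummy 640x640 frame (names are illustrative):

```python
import numpy as np

frame = np.zeros((640, 640, 3), dtype=np.uint8)  # (height, width, channels), BGR

rgb = frame[:, :, ::-1]       # reverse the last axis: BGR -> RGB, still (640, 640, 3)
chw = rgb.transpose(2, 0, 1)  # move channels to the front for PyTorch
print(chw.shape)              # (3, 640, 640)
```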
@xrstokes none of the operations you are showing here are required for YOLOv5 inference. I would simply load a YOLOv5 PyTorch Hub model and pass it your image; nothing else is required. See the PyTorch Hub tutorial to get started.
```python
import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')

# Image
img = 'https://ultralytics.com/images/zidane.jpg'

# Inference
results = model(img)
results.print()  # or .show(), .save()
```
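If you need the detections programmatically (as your `yolo_detect()` returns them) rather than drawn, the returned object also exposes the raw boxes; a short sketch based on the Hub `Detections` API:

```python
# One (n, 6) tensor per image: x1, y1, x2, y2, confidence, class
boxes = results.xyxy[0]

# Or as a pandas DataFrame with named columns:
# xmin, ymin, xmax, ymax, confidence, class, name
df = results.pandas().xyxy[0]
print(df)
```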
👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!
First I want to say thanks for all the great work. I'm just trying to use YOLOv5 in my own system and I'm having trouble: in my code I find I have to write a JPG to use it.
I can't make it work without writing a JPG to the file system and opening it again, and that takes longer than the inference itself. Can someone show me, or point me to, where I can find the answer? I've been trying all day, and this code works perfectly except that the writing and reading part is slow.
Thanks in advance.
```python
def yolo_detect(source, new_conf_thres, new_iou_thres):
```
more context.....
```python
def yolo_model(new_weights):
    global model, stride, imgsz, names, device, half, view_img, save_txt, webcam, save_dir, save_img

def yolo_detect(source, new_conf_thres, new_iou_thres):
```