deepcam-cn / yolov5-face

YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931), ECCV Workshops 2022
GNU General Public License v3.0
2.04k stars 495 forks

Unnecessary Rescaling in torch2trt/main.py #171

Open vjsrinivas opened 2 years ago

vjsrinivas commented 2 years ago

Hello. First off, thank you for your work and for making it open source. I was following the instructions in the README of the torch2trt folder, and the current visualization code passes the wrong input to show_results.

Essentially, the xywh is in normalized form and incorrectly shifted, so it reduces to all zeros when converted to integers for cv2.rectangle. It can be fixed by replacing this snippet in img_vis:

xywh = (xyxy2xywh(det[j, :4].view(1, 4)) / gn).view(-1).tolist()
conf = det[j, 4].cpu().numpy()
landmarks = (det[j, 5:15].view(1, 10) / gn_lks).view(-1).tolist()
class_num = det[j, 15].cpu().numpy()
orgimg = show_results(orgimg, xywh, conf, landmarks, class_num)

to

conf = det[j, 4].cpu().numpy()
landmarks = (det[j, 5:15].view(1, 10) / gn_lks).view(-1)
landmarks[[0,2,4,6,8]] *= orgimg.shape[1] # x
landmarks[[1,3,5,7,9]] *= orgimg.shape[0] # y
landmarks = landmarks.int().tolist()
class_num = det[j, 15].cpu().numpy()
xyxy = det[j, :4].int().tolist()
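
As a rough illustration of why the normalized values end up as zeros, here is a plain-Python sketch. It does not use the repository's tensors: the image size, the coordinate values, and the helper to_pixel_landmarks are all made up for the example, and the even/odd split simply mirrors the x/y indexing in the snippet above.

```python
# Illustrative only: plain-Python stand-ins for the tensors in img_vis.
# to_pixel_landmarks and all values below are hypothetical.

def to_pixel_landmarks(norm_landmarks, img_w, img_h):
    """Scale normalized (x, y) landmark pairs to integer pixel coordinates."""
    pixels = []
    for i, value in enumerate(norm_landmarks):
        # Even indices hold x (scaled by width), odd indices hold y (scaled by height).
        scale = img_w if i % 2 == 0 else img_h
        pixels.append(int(value * scale))
    return pixels


img_w, img_h = 640, 480  # assumed image size (orgimg.shape[1], orgimg.shape[0])

# Normalized xywh lies in [0, 1), so truncating to int collapses it to zeros.
norm_xywh = [0.5, 0.5, 0.25, 0.25]
truncated = [int(v) for v in norm_xywh]

# Pixel-space xyxy keeps its meaning after the same cast.
pixel_xyxy = [240.7, 180.2, 400.9, 300.5]
box = [int(v) for v in pixel_xyxy]

# Five normalized landmarks as x0, y0, x1, y1, ..., x4, y4.
norm_landmarks = [0.25, 0.5, 0.75, 0.5, 0.5, 0.625, 0.375, 0.75, 0.625, 0.75]
landmarks = to_pixel_landmarks(norm_landmarks, img_w, img_h)

print(truncated)  # every entry is 0, so cv2.rectangle would get a degenerate box
print(box)
print(landmarks)
```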

EdisionWew commented 2 years ago

@vjsrinivas Hi, I ran into this too. Your code works. Did you also see a single object getting three predicted boxes with different areas, as in the image below? (image)

vjsrinivas commented 2 years ago

I'm sorry. I don't understand the question. Is the code snippet I sent causing the screenshot you posted?

pengfeidip commented 1 year ago

Thanks, @vjsrinivas!