deepcam-cn / yolov5-face

YOLO5Face: Why Reinventing a Face Detector (https://arxiv.org/abs/2105.12931, ECCV Workshops 2022)
GNU General Public License v3.0
2.02k stars 493 forks

Multi_label will cause problem in non_max_suppression_face (Now have one solution) #137

Open ChangenicRM opened 2 years ago

ChangenicRM commented 2 years ago

In general.py, non_max_suppression_face builds the detections matrix like this:

    # Detections matrix nx6 (xyxy, conf, landmarks, cls)
    if multi_label:
        i, j = (x[:, 15:] > conf_thres).nonzero(as_tuple=False).T
        x = torch.cat((box[i], x[i, j + 15, None], x[:, 5:15], j[:, None].float()), 1)
    else:  # best class only
        conf, j = x[:, 15:].max(1, keepdim=True)
        x = torch.cat((box, conf, x[:, 5:15], j.float()), 1)[conf.view(-1) > conf_thres]

If nc > 1, then multi_label is true. After training for more than 20 epochs I started testing the model (which requires running NMS), and the line "x = torch.cat((box[i], x[i, j + 15, None], x[:, 5:15], j[:, None].float()), 1)" failed: the shape of x[:, 5:15] does not satisfy the requirements of torch.cat(). I guess this is because "i, j = (x[:, 15:] > conf_thres).nonzero(as_tuple=False).T" selects only the (row, class) entries whose score is higher than conf_thres, so the tensors indexed with i have a different number of rows than x[:, 5:15], which is not the case in the single-class branch.

My solution is to change x[:, 5:15] to x[i, 5:15] so that all of the tensors have the same number of rows. This may be a small mistake by the author, but I am not sure whether my fix is correct.
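The mismatch can be reproduced outside the repository. The following is a minimal sketch, not the repository's code: the helper name `select_multi_label`, the toy tensor sizes, and the hand-picked class scores are all assumptions made for illustration. It assumes the layout from the snippet above (columns 0-3 box, 4 objectness, 5-14 landmarks, 15 onward class scores) and skips the rest of NMS.

```python
import torch


def select_multi_label(x, box, conf_thres):
    """Hypothetical helper: multi-label selection with landmarks indexed by i."""
    # i holds the row of each (prediction, class) pair above the threshold,
    # j holds the class index of that pair.
    i, j = (x[:, 15:] > conf_thres).nonzero(as_tuple=False).T
    # Every tensor is indexed with i, so all of them have len(i) rows.
    return torch.cat(
        (box[i], x[i, j + 15, None], x[i, 5:15], j[:, None].float()), 1
    )


torch.manual_seed(0)
nc = 3                      # assumed number of classes
x = torch.rand(4, 15 + nc)  # 4 toy predictions
# Hand-picked class scores: 6 (row, class) pairs exceed 0.5.
x[:, 15:] = torch.tensor([[0.9, 0.8, 0.1],
                          [0.1, 0.1, 0.1],
                          [0.2, 0.7, 0.1],
                          [0.6, 0.6, 0.6]])
box = x[:, :4].clone()

# Original indexing: box[i] has 6 rows but x[:, 5:15] still has 4.
i, j = (x[:, 15:] > 0.5).nonzero(as_tuple=False).T
try:
    torch.cat((box[i], x[i, j + 15, None], x[:, 5:15], j[:, None].float()), 1)
    original_failed = False
except RuntimeError:
    original_failed = True

out = select_multi_label(x, box, conf_thres=0.5)
print(original_failed, tuple(out.shape))
```

With x[i, 5:15], each selected (row, class) pair carries the landmarks of the prediction it came from, which is the same pairing that box[i] already uses for the box coordinates.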

iamdoubleawesome commented 2 years ago

I did the same

litao-zhx commented 2 years ago

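Hello, which parts need to be modified for multi-class detection? When I changed the model to detect 3 classes, I got the following error: [image]. Has anyone else run into this?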

gatzf commented 1 year ago

Hello, I ran into a problem when changing the number of classes. The backward pass fails, and something seems to be wrong with this statement: lcls += BCEcls(ps[:, 15:], t)

Do you have any suggestions? Thank you very much.

yychentw commented 7 months ago

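> Hello, which parts need to be modified for multi-class detection? When I changed the model to detect 3 classes, I got the following error: [image]. Has anyone else run into this?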

Hi @litao-zhx , I faced the same issue while training YOLOv5-face for multi-class object detection. The problem arises because the optimizer loads the state_dict from a previous training session on a single-class detector, causing a mismatch in tensor sizes between the optimizer state tensors and the model's parameter tensors.

You can check if you are experiencing a similar situation.
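One way to avoid this kind of mismatch is to reuse only the weights whose shapes still match and to start the optimizer from scratch. The sketch below illustrates the idea on a toy model and is not the repository's training code: `build_model` and `load_matching_weights` are hypothetical names, and the two-layer network only stands in for a detector whose head has 15 + nc outputs.

```python
import torch
from torch import nn


def build_model(nc):
    # Toy stand-in: a shared "backbone" layer and a head with 15 + nc outputs.
    return nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 15 + nc))


def load_matching_weights(model, state_dict):
    """Copy only tensors whose name and shape match; return the skipped keys."""
    own = model.state_dict()
    matched = {k: v for k, v in state_dict.items()
               if k in own and v.shape == own[k].shape}
    model.load_state_dict(matched, strict=False)
    return sorted(set(state_dict) - set(matched))


old = build_model(nc=1)  # plays the role of the single-class checkpoint
new = build_model(nc=3)  # multi-class model being trained now

skipped = load_matching_weights(new, old.state_dict())

# Fresh optimizer: no state is restored from the single-class run.
optimizer = torch.optim.SGD(new.parameters(), lr=0.01, momentum=0.9)
print(skipped)
```

The skipped keys are the head parameters, whose shapes depend on nc, so they keep their fresh initialization while the shared layer is reused.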