xinntao / facexlib

FaceXlib aims at providing ready-to-use face-related functions based on current SOTA open-source methods.
MIT License
833 stars 146 forks

Possible bug in filtering small / angled faces #28

Closed: bfreskura closed this issue 1 year ago

bfreskura commented 1 year ago

The code in question is located in face_restoration_helper.py, line 142.

...
for bbox in bboxes:
    # remove faces with too small eye distance: side faces or too small faces
    eye_dist = np.linalg.norm([bbox[6] - bbox[8], bbox[7] - bbox[9]])
    if eye_dist_threshold is not None and (eye_dist < eye_dist_threshold):
        continue
...

If I'm reading the rest of the code right, each bbox array has the following contents (0-based indices 0 through 14):

[bbox_x1, bbox_y1, bbox_x2, bbox_y2, confidence_score, eye_1_x, eye_1_y, eye_2_x, eye_2_y, nose_x, nose_y, lip_1_x, lip_1_y, lip_2_x, lip_2_y]

To calculate the distance between eyes, we should instead use np.linalg.norm([bbox[5] - bbox[7], bbox[6] - bbox[8]]).
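
To make the difference concrete, here is a minimal sketch, assuming the layout listed above is correct. The sample values and the helper names eye_dist_current and eye_dist_proposed are made up for illustration and are not part of facexlib. It uses math.hypot so it runs without dependencies; for a two-element vector this gives the same value as np.linalg.norm.

```python
import math

# Hypothetical detection, using the layout assumed above:
# [x1, y1, x2, y2, score, eye1_x, eye1_y, eye2_x, eye2_y,
#  nose_x, nose_y, lip1_x, lip1_y, lip2_x, lip2_y]
bbox = [
    10.0, 20.0, 110.0, 140.0,  # box corners
    0.99,                      # confidence score
    40.0, 60.0,                # eye 1 (x, y)
    80.0, 60.0,                # eye 2 (x, y)
    60.0, 85.0,                # nose (x, y)
    45.0, 110.0,               # lip corner 1 (x, y)
    75.0, 110.0,               # lip corner 2 (x, y)
]


def eye_dist_current(b):
    # Indices used by the code in question: under the assumed layout this
    # mixes eye1_y, eye2_x, eye2_y and nose_x.
    return math.hypot(b[6] - b[8], b[7] - b[9])


def eye_dist_proposed(b):
    # Indices proposed in this issue: eye 1 at (5, 6) against eye 2 at (7, 8).
    return math.hypot(b[5] - b[7], b[6] - b[8])


print(eye_dist_current(bbox))   # 20.0
print(eye_dist_proposed(bbox))  # 40.0
```

With these sample values the eyes are 40 pixels apart, but the current indices report 20, so a face could be filtered out (or kept) for the wrong reason.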

Am I missing something?

woctezuma commented 1 year ago

If the content of the bbox array is as you wrote, then I think you are correct. It looks like a mix-up between 0-based and 1-based indexing.

https://github.com/xinntao/facexlib/blob/29d792eb2a48e4e87b2b57001a85d2297c439a6f/facexlib/utils/face_restoration_helper.py#L140-L144

bfreskura commented 1 year ago

This is where the bbox array is created. The array ends with the landmark data, and I believe the order of the landmarks is standardized as I've written above. I'm not sure how the RetinaFace model was trained, but it very likely follows the standard landmark order.

https://github.com/xinntao/facexlib/blob/29d792eb2a48e4e87b2b57001a85d2297c439a6f/facexlib/detection/retinaface.py#L213-L233

bfreskura commented 1 year ago

I just checked the repo used to train RetinaFace, and the WIDERFACE dataset it uses lists the landmarks in the order above.

To double-check, I downloaded the WIDERFACE dataset and plotted the labels on the images. They match the standard order.

https://github.com/biubug6/Pytorch_Retinaface/blob/b984b4b775b2c4dced95c1eadd195a5c7d32a60b/data/wider_face.py#L57-L66

woctezuma commented 1 year ago

There might be a difference.

In your original post, you listed the features as follows:

bbox_x1
bbox_y1
bbox_x2
bbox_y2
confidence_score <--- 🤔 
eye_1_x
eye_1_y
eye_2_x
eye_2_y
nose_x
nose_y
lip_1_x
lip_1_y
lip_2_x
lip_2_y

This is the code which you linked to from biubug6/Pytorch_Retinaface/blob/master/data/wider_face.py:

annotation = np.zeros((1, 15))
# bbox
annotation[0, 0] = label[0]  # x1
annotation[0, 1] = label[1]  # y1
annotation[0, 2] = label[0] + label[2]  # x2
annotation[0, 3] = label[1] + label[3]  # y2

# landmarks
annotation[0, 4] = label[4]    # l0_x
annotation[0, 5] = label[5]    # l0_y
annotation[0, 6] = label[7]    # l1_x
annotation[0, 7] = label[8]    # l1_y
annotation[0, 8] = label[10]   # l2_x
annotation[0, 9] = label[11]   # l2_y
annotation[0, 10] = label[13]  # l3_x
annotation[0, 11] = label[14]  # l3_y
annotation[0, 12] = label[16]  # l4_x
annotation[0, 13] = label[17]  # l4_y
if (annotation[0, 4]<0):
    annotation[0, 14] = -1
else:
    annotation[0, 14] = 1

I am not sure the confidence score is in the same position, or whether it is there at all. If it is not, the code could be correct as it is.

It looks to me like there are (x, y) coordinates everywhere except at the very end of the array. I am not sure what annotation[0, 14], which is set to -1 or 1, is for.

Edit: Based on Google's translation of the following issue, the 15th value encodes whether the face is occluded (-1) or not (1).

I am not sure I correctly understand the link to occlusion, but I can see how a negative x-coordinate for the first landmark (annotation[0, 4], the leftmost eye) would mean that the face is not completely inside the picture, which could be useful to know for further processing.
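
The flag logic at the end of that snippet can be isolated as below. This is only a sketch: landmark_flag is a hypothetical helper, the two annotation rows are made-up values, and it assumes the first landmark's x-coordinate sits at index 4, as in the snippet. It shows what the flag is computed from, not what the training code uses it for.

```python
def landmark_flag(annotation_row):
    """Return -1 if the first landmark x-coordinate is negative, else 1.

    Mirrors the annotation[0, 14] assignment in the quoted snippet.
    """
    return -1 if annotation_row[4] < 0 else 1


# Hypothetical rows: 4 box values followed by 5 (x, y) landmark pairs.
row_with_landmarks = [10, 20, 110, 140, 40, 60, 80, 60, 60, 85, 45, 110, 75, 110]
row_negative_first_x = [10, 20, 110, 140, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1]

print(landmark_flag(row_with_landmarks))    # 1
print(landmark_flag(row_negative_first_x))  # -1
```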

bfreskura commented 1 year ago

I only linked the training labels to check that the order of coordinates is eyes -> nose -> lips. The coordinates could have been in a different order.

The occlusion score doesn't exist in the bbox array. You can see in the code below that the 5th element is the scores variable, which is in fact the confidence.

https://github.com/xinntao/facexlib/blob/29d792eb2a48e4e87b2b57001a85d2297c439a6f/facexlib/detection/retinaface.py#L226
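
As a sketch of why the eye coordinates would start at index 5, the row can be pictured as the box, then the score, then the landmarks joined together. The function build_row and the sample values are hypothetical and written in plain Python; the actual construction is in the retinaface.py line linked above and is not reproduced here.

```python
def build_row(box, score, landmarks):
    """Join one detection into a flat 15-value row.

    Assumed layout: 4 box values, 1 confidence score, 10 landmark values.
    """
    if len(box) != 4 or len(landmarks) != 10:
        raise ValueError("expected 4 box values and 10 landmark values")
    return list(box) + [score] + list(landmarks)


# Made-up values for illustration only.
box = [10.0, 20.0, 110.0, 140.0]
score = 0.99
landmarks = [40.0, 60.0, 80.0, 60.0, 60.0, 85.0, 45.0, 110.0, 75.0, 110.0]

row = build_row(box, score, landmarks)
print(len(row))  # 15
print(row[4])    # 0.99, the confidence score
print(row[5:9])  # [40.0, 60.0, 80.0, 60.0], eye 1 (x, y) then eye 2 (x, y)
```

Because the score occupies index 4, every landmark index is shifted by one relative to the training annotation layout, where the landmarks start at index 4.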

woctezuma commented 1 year ago

Right. I agree.

I only appear as a contributor because of a single pull request I once made. 😅 Maybe you can submit one too, and xinntao may merge it when he notices.

xinntao commented 1 year ago

Thanks @woctezuma and @bfreskura

It had bugs. The fix has now been merged. Thanks 👍