ShiqiYu / libfacedetection.train

The training program for libfacedetection for face detection and 5-landmark detection.
Apache License 2.0

Train on rotated images to improve landmarks. #83

Open mcourteaux opened 1 year ago

mcourteaux commented 1 year ago

The library right now works great for detecting faces, but the landmarks are clearly poorly trained. They are by no means accurate and are often not even close to being in the right spot.

I had a look through the training code and didn't see any rotation-based augmentation. I was thinking that might be because of the question "How does one rotate a bounding box to generate a new ground-truth bounding box label?", but in the end that seems fairly easy to solve, given that faces are mostly ovals within the bounding box. One could rotate the oval inscribed in the original box and take the axis-aligned bounding box of the rotated oval as the new label.
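To make the oval idea concrete, here is a minimal sketch of how the labels could be transformed. It is not code from this repository: `rotate_point` and `rotate_face_label` are hypothetical names, boxes are assumed to be `(x1, y1, x2, y2)` in pixels, and the image itself would have to be warped with the same rotation about the same center. In image coordinates (y pointing down) a positive angle appears clockwise.

```python
import math

def rotate_point(x, y, cx, cy, theta):
    """Rotate (x, y) by theta radians about (cx, cy)."""
    c, s = math.cos(theta), math.sin(theta)
    dx, dy = x - cx, y - cy
    return cx + c * dx - s * dy, cy + s * dx + c * dy

def rotate_face_label(box, landmarks, center, theta):
    """Rotate one face label about `center` (hypothetical helper).

    The face is approximated by the ellipse inscribed in `box`; the new box
    is the axis-aligned bounding box of that ellipse after rotation.
    """
    x1, y1, x2, y2 = box
    a, b = (x2 - x1) / 2, (y2 - y1) / 2          # ellipse semi-axes
    ex, ey = rotate_point((x1 + x2) / 2, (y1 + y2) / 2,
                          center[0], center[1], theta)
    c, s = math.cos(theta), math.sin(theta)
    half_w = math.hypot(a * c, b * s)            # extent of rotated ellipse in x
    half_h = math.hypot(a * s, b * c)            # extent of rotated ellipse in y
    new_box = (ex - half_w, ey - half_h, ex + half_w, ey + half_h)
    new_landmarks = [rotate_point(x, y, center[0], center[1], theta)
                     for x, y in landmarks]
    return new_box, new_landmarks
```

The landmarks are rotated exactly, while the box is only an approximation that stays tighter than the bounding box of the four rotated corners.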

Anyway, I was wondering if there are plans to train this network architecture with rotations or other types of distortions to improve the landmark positions?

By the way: thanks for this cool work! It's another impressive demonstration of how NNs can be used in performance-critical applications, which I greatly appreciate from a research point of view.

CxyZyr commented 9 months ago

With such a small number of parameters, the model struggles to learn the keypoint branch effectively at a 320x320 input size. The model is designed primarily for detection, and I see the keypoint branch as an auxiliary component that helps convergence. To improve this situation, apart from introducing rotation, increasing the input size is essential. This is because faces in the face detection task are often quite small, and many do not provide sufficient information for training keypoints.
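As a rough illustration of the input-size point, the sketch below estimates how many pixels a face spans once the image is resized for training. The numbers are made up, `face_size_after_resize` is a hypothetical helper, and it assumes the long side of the image is scaled to the network input size, which may not match what the training pipeline actually does.

```python
def face_size_after_resize(face_px, image_long_side, input_size):
    """Pixel size of a face after the image's long side is scaled to input_size."""
    return face_px * input_size / image_long_side

# A 40 px face in an image whose long side is 1024 px (illustrative values).
for input_size in (320, 640):
    print(input_size, face_size_after_resize(40, 1024, input_size))
# prints:
# 320 12.5
# 640 25.0
```

At roughly 12 px across, the five landmarks sit only a few pixels apart, so small labeling or quantization errors are large relative to the face.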