Feature Request: Face transforms (boxes + keypoints)

albumentations-team / albumentations

Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125

https://albumentations.ai

MIT License


Feature Request: Face transforms (boxes + keypoints) #667

Open ternaus opened 4 years ago

ternaus commented 4 years ago

It would be nice to have a transform that can deal with faces.

A face is a set of keypoints, or a bounding box with a set of keypoints.

As input, it should take:

N faces and N*K keypoints (each face has K keypoints).
A symmetric mapping for the transform, i.e. which key points should swap within each face, for example the left eye and the right eye.

We want:

Symmetric flips

When we flip the image, bounding boxes flip, keypoints flip + symmetric key points within each face swap.
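
The flip rule above can be sketched in plain Python. This is only an illustration of the requested behavior, not an existing Albumentations API: the function name `hflip_faces`, the `swap_pairs` argument, the `(x_min, y_min, x_max, y_max)` pixel box format, and the continuous-coordinate mirroring `x -> image_width - x` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]              # assumed format: (x, y)


def hflip_faces(
    boxes: Sequence[Box],
    keypoints: Sequence[Sequence[Point]],
    swap_pairs: Sequence[Tuple[int, int]],
    image_width: float,
) -> Tuple[List[Box], List[List[Point]]]:
    """Horizontally flip N faces, each with one box and K keypoints.

    swap_pairs is the hypothetical symmetric mapping: index pairs such as
    (left_eye, right_eye) that trade places within each face after the flip.
    """
    flipped_boxes: List[Box] = []
    flipped_keypoints: List[List[Point]] = []
    for box, points in zip(boxes, keypoints):
        x_min, y_min, x_max, y_max = box
        # Mirroring swaps the roles of the left and right box edges.
        flipped_boxes.append((image_width - x_max, y_min, image_width - x_min, y_max))
        # Mirror every keypoint, then swap the symmetric ones so that
        # index 0 still means "left eye" in the flipped image.
        mirrored = [(image_width - x, y) for x, y in points]
        for i, j in swap_pairs:
            mirrored[i], mirrored[j] = mirrored[j], mirrored[i]
        flipped_keypoints.append(mirrored)
    return flipped_boxes, flipped_keypoints
```

With one face whose keypoints are ordered as left eye, right eye, nose, passing `swap_pairs=[(0, 1)]` keeps the eye labels consistent after the flip, while the nose is only mirrored.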

Crops

When we crop part of the image and a bounding box goes away, all key points in that box should go away too.
When part of the box is cropped, key points that got cut should be clipped to the side of the image.
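
A minimal sketch of the two crop rules, again in plain Python and not based on any existing Albumentations API. The name `crop_faces`, the `(x_min, y_min, x_max, y_max)` format for both boxes and the crop window, and the choice to treat a box with an empty intersection as "gone" are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x_min, y_min, x_max, y_max)
Point = Tuple[float, float]              # assumed format: (x, y)


def crop_faces(
    boxes: Sequence[Box],
    keypoints: Sequence[Sequence[Point]],
    crop: Box,
) -> Tuple[List[Box], List[List[Point]]]:
    """Crop N faces to the window `crop`, keeping boxes and keypoints in sync."""
    cx_min, cy_min, cx_max, cy_max = crop
    kept_boxes: List[Box] = []
    kept_keypoints: List[List[Point]] = []
    for box, points in zip(boxes, keypoints):
        x_min, y_min, x_max, y_max = box
        # Intersection of the face box with the crop window.
        ix_min, iy_min = max(x_min, cx_min), max(y_min, cy_min)
        ix_max, iy_max = min(x_max, cx_max), min(y_max, cy_max)
        if ix_min >= ix_max or iy_min >= iy_max:
            # Rule 1: the box is gone, so all of its keypoints go with it.
            continue
        # Shift the clipped box into the coordinate frame of the cropped image.
        kept_boxes.append((ix_min - cx_min, iy_min - cy_min, ix_max - cx_min, iy_max - cy_min))
        # Rule 2: keypoints that were cut off are clamped to the crop border.
        kept_keypoints.append(
            [
                (min(max(x, cx_min), cx_max) - cx_min, min(max(y, cy_min), cy_max) - cy_min)
                for x, y in points
            ]
        )
    return kept_boxes, kept_keypoints
```

Because each face keeps all K keypoints or loses all of them, the N*K layout stays valid after the crop. Note that this sketch also drops zero-area boxes, which may not be desirable for every use case.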

Dipet commented 4 years ago

> When part of the box is cropped, key points that got cut should be clipped to the side of the image.

Are you sure that this behavior is correct? To me, it looks strange in the example below.

ternaus commented 4 years ago

Yep. But I expect boxes to be cropped mostly at the side of the image, where this behavior is fine.

In the middle of the image, it looks strange indeed.

cannguyen275 commented 4 years ago

Any updates on this? This library seems to augment keypoints and bounding boxes separately! But in many tasks, keypoints and bounding boxes have to be augmented together for each object, e.g. RetinaFace, MTCNN. The image below shows a sample where the landmarks and the bounding box have to go along with each face.

Dipet commented 4 years ago

Hi. We are still thinking about how to implement this functionality using the library interface.

tengerye commented 2 years ago

@Dipet Hi, there is another use case in document analysis, where some token boxes are shrunk to a point to represent a special token. The current version treats such boxes as invalid.

May I ask how it is going now?