Multi-face detection - Githubissues

yinguobing / facial-landmark-detection-hrnet

A TensorFlow implementation of HRNet for facial landmark detection.

GNU General Public License v3.0

157 stars 40 forks

Multi-face detection #17

Open soans1994 opened 2 years ago

soans1994 commented 2 years ago

hello author,

Does this repo support multi-face keypoint detection? I also want to ask whether the COCO person dataset is specialized for multi-person. I have trained a simple model using an FCN for face keypoint regression. Can you please tell me how to get heatmaps of multiple faces with the bottom-up approach? My current model takes a 96x96x1 input image and outputs 96x96x15 heatmaps for 15 keypoints. I trained it on a Kaggle dataset of single-face images pre-cropped to 96x96. Do I need a dataset with multiple faces? And do I need bounding box or mask information too?

Please give me your advice, thank you.
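For readers unfamiliar with the setup described in the question, here is a minimal sketch of how a 96x96x15 heatmap label could be built from keypoint coordinates and decoded back. It assumes one Gaussian per keypoint and NumPy; the function names, the `sigma` value, and the argmax decoding are illustrative assumptions, not code from this repo or from the poster's model.

```python
import numpy as np

def keypoints_to_heatmaps(keypoints, size=96, sigma=2.0):
    """Render one Gaussian heatmap per (x, y) keypoint; returns (size, size, K)."""
    ys, xs = np.mgrid[0:size, 0:size]
    maps = [
        np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        for x, y in keypoints
    ]
    return np.stack(maps, axis=-1).astype(np.float32)

def heatmaps_to_keypoints(heatmaps):
    """Recover one integer (x, y) per channel from the location of its peak."""
    h, w, k = heatmaps.shape
    flat_index = heatmaps.reshape(h * w, k).argmax(axis=0)
    ys, xs = np.unravel_index(flat_index, (h, w))
    return np.stack([xs, ys], axis=-1)
```

Taking a single argmax per channel is exactly why such a model can only report one face: each channel yields one peak, so a second face in the image would be ignored.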

yinguobing commented 2 years ago

I have no experience with this kind of project, but you might want to check out this post: https://medium.com/neuromation-blog/neuronuggets-understanding-human-poses-in-real-time-b73cb74b3818

soans1994 commented 2 years ago

Thank you. Yes, I have to use pose estimation methods to detect multiple faces. I want to ask how to train on 96x96 face images and heatmap labels with a 256x256 network input size. Currently I use 96x96 images and heatmap labels of the same size with a 96x96x1 network input, and it can predict only one face, with an output of size 96x96x15, where the 15 maps correspond to the 15 keypoints. In your case the input image is larger than the face. How can I do that?

please help me

yinguobing commented 2 years ago

I guess this training process is similar to body keypoint detection, and this repo does not support that kind of training. Each training image in this repo contains only a single face sample, and the input image is naturally downsampled by the conv layers, while a training image for a bottom-up approach may contain multiple faces. The two are quite different.
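To illustrate the difference, here is a rough sketch of what a bottom-up style label might look like when several 96x96 faces are placed on a 256x256 canvas: each keypoint channel holds one Gaussian peak per face. This is an assumption about one possible labeling scheme, not something this repo implements; `multi_face_heatmaps`, the offsets, and `sigma` are hypothetical.

```python
import numpy as np

def multi_face_heatmaps(faces, canvas=256, sigma=2.0):
    """Build (canvas, canvas, K) heatmaps with one peak per face per keypoint.

    faces: list of (offset_x, offset_y, keypoints), where keypoints are
    (x, y) pairs in the face crop's own coordinates and the offsets give
    the crop's top-left corner on the canvas.
    """
    num_keypoints = len(faces[0][2])
    ys, xs = np.mgrid[0:canvas, 0:canvas]
    out = np.zeros((canvas, canvas, num_keypoints), dtype=np.float32)
    for offset_x, offset_y, keypoints in faces:
        for k, (x, y) in enumerate(keypoints):
            cx, cy = x + offset_x, y + offset_y  # canvas coordinates
            g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))
            # Element-wise max keeps every face's peak at full height.
            out[..., k] = np.maximum(out[..., k], g)
    return out
```

Heatmaps like these only say where keypoints of each type are. A bottom-up method still needs a separate grouping step to decide which peaks belong to the same face, which is the part the linked pose estimation post deals with.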

soans1994 commented 2 years ago

Oh I see, so you use the face bounding box information from your dataset to crop the images and feed them in for training? Are all images the same size after cropping? And what about the network input size and output size?

thank you

soans1994 commented 2 years ago

You use some face detector to get the bounding box, crop, and then feed the crop to the prediction network, right? That's why you can use images of any resolution when testing.

yinguobing commented 2 years ago

Exactly! You can find some details in the preprocessing module: https://github.com/yinguobing/facial-landmark-detection-hrnet/blob/master/preprocessing.py
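The detect, crop, and predict flow confirmed above can be sketched roughly as follows. This is not taken from preprocessing.py; it is a minimal illustration, assuming boxes given as (x_min, y_min, x_max, y_max), a 256 pixel network input, and a hypothetical `predict_on_crop` callable that crops, resizes, and runs the landmark model.

```python
def crop_to_image(landmarks, box, input_size):
    """Map (x, y) landmarks from the resized crop back to the original image."""
    x_min, y_min, x_max, y_max = box
    scale_x = (x_max - x_min) / input_size
    scale_y = (y_max - y_min) / input_size
    return [(x_min + x * scale_x, y_min + y * scale_y) for x, y in landmarks]

def landmarks_for_all_faces(boxes, predict_on_crop, input_size=256):
    """Run a single-face model once per detected box (top-down approach).

    predict_on_crop is a hypothetical callable: it takes a box and returns
    landmarks in the coordinates of the input_size x input_size crop.
    """
    results = []
    for box in boxes:
        crop_landmarks = predict_on_crop(box)
        results.append(crop_to_image(crop_landmarks, box, input_size))
    return results
```

Because every face is rescaled to the same input size before prediction, the original image can have any resolution, and multiple faces are handled simply by looping over the detector's boxes.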