Questions about training datasets

kpzhang93 / MTCNN_face_detection_alignment

Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks

MIT License


Questions about training datasets #10

Open sparrow0629 opened 7 years ago

sparrow0629 commented 7 years ago

Hello, thanks for sharing your code. I have some questions about training.

Which dataset did you use for training?
How did you select positive samples? For example, the WIDER FACE dataset is labeled with rectangular boxes, but in MTCNN the bounding box should be square. How did you adjust the ground truth? Thank you.

cdicle commented 7 years ago

Hi @sparrow0629,

Unfortunately, we haven't heard much from @kpzhang93 about training. I will try to answer your questions to the best of my knowledge and from my experience playing with the code.

He trains the networks on both WIDER and AFLW. WIDER is a nice, large dataset that is good for face detection, and AFLW has landmarks (WIDER does not).

The inputs must be square. At training time, I believe he turns the rectangle into a square, maybe the smallest enclosing square. He does not distort or rescale the rectangular region into a square one.
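To make the "smallest enclosing square" idea concrete, here is a minimal sketch. It assumes boxes are given as `(x1, y1, x2, y2)` corner coordinates and that the square is centered on the rectangle; the function name `to_square` is hypothetical, and this is only a guess at the approach, not the author's confirmed training code.

```python
def to_square(x1, y1, x2, y2):
    """Return the smallest square enclosing the box, centered on the box.

    Assumption: (x1, y1) is the top-left corner and (x2, y2) the
    bottom-right corner. The result may extend past the image border,
    so the caller would need to pad or clip the crop.
    """
    w = x2 - x1
    h = y2 - y1
    side = max(w, h)          # the longer edge sets the square's size
    cx = x1 + w / 2.0         # center of the original rectangle
    cy = y1 + h / 2.0
    half = side / 2.0
    return (cx - half, cy - half, cx + half, cy + half)


if __name__ == "__main__":
    # A tall 20x40 rectangle becomes a 40x40 square around the same center.
    print(to_square(10, 20, 30, 60))
```

Because the shorter side is extended rather than stretched, the face keeps its aspect ratio when the square crop is resized to the network input size.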

A box counts as a true (positive) box if its intersection-over-union (IoU) ratio with a ground-truth box is larger than 0.5. See Section III.A of the paper for details.
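For reference, the IoU test can be sketched as below. The `(x1, y1, x2, y2)` box layout and the helper names `iou` and `is_positive` are assumptions made for illustration, and the 0.5 default simply mirrors the figure quoted above, so check the paper for the exact thresholds before relying on it.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping region, if there is one.
    ix1 = max(a[0], b[0])
    iy1 = max(a[1], b[1])
    ix2 = min(a[2], b[2])
    iy2 = min(a[3], b[3])
    iw = max(0.0, ix2 - ix1)  # zero width or height means no overlap
    ih = max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_positive(candidate, ground_truth, threshold=0.5):
    """Label a candidate crop as positive when its IoU exceeds the threshold.

    The default threshold is the value quoted in this thread, not a
    verified setting from the original training code.
    """
    return iou(candidate, ground_truth) > threshold


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    print(iou(gt, (5, 0, 15, 10)))          # half-shifted box
    print(is_positive((2, 0, 12, 10), gt))  # small shift
```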

Best, Cha.

sparrow0629 commented 7 years ago

Thank you so much, @cdicle. This really helps me a lot.

daikankan commented 6 years ago

Hi @cdicle, I'm also trying to reproduce the MTCNN training process. I guess there are many details and tricks that I don't know, especially for landmark localization regression in multi-task training, and they are really hard for me to figure out. Still, it's a fantastic project. This is my attempt: github.com/daikankan/mtcnn. It is hard to reach the author's precision.

herleeyandi commented 6 years ago

I think we can't reach the author. To be honest, I still want to know how they trained it.

geoffzhang commented 6 years ago

@cdicle Why is PNet used to generate the training data for RNet? Can I use the same training data that was used for PNet to train RNet?