MarekKowalski / DeepAlignmentNetwork

A deep neural network for face alignment
MIT License
510 stars 136 forks

what are the roles of heatmap and feat map? especially feat map? #10

Open lx120 opened 7 years ago

lx120 commented 7 years ago

Hi, when I read your paper, I was confused about the roles of the heatmap and the featmap. What is the motivation for using them?

MarekKowalski commented 7 years ago

Hi,

The heatmap is a visual representation of the landmark locations estimated by the previous stage. Thanks to the heatmap the next stage can "understand" what the current estimates of the landmark locations are and thus it can estimate an update to those locations.
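For intuition, here is a minimal NumPy sketch of such a landmark heatmap, assuming an intensity that falls off with the distance to the nearest landmark, H(x, y) = 1 / (1 + min_i ||(x, y) - s_i||); the function name and image size are illustrative, not taken from the repo:

```python
import numpy as np

def landmark_heatmap(landmarks, img_size=112):
    # landmarks: (N, 2) array of (x, y) coordinates.
    # Pixel intensity is 1 / (1 + distance to the nearest landmark),
    # so it peaks at 1.0 exactly at each landmark location.
    ys, xs = np.mgrid[0:img_size, 0:img_size]
    coords = np.stack([xs, ys], axis=-1).astype(np.float64)      # (H, W, 2)
    dists = np.linalg.norm(
        coords[:, :, None, :] - landmarks[None, None, :, :], axis=-1
    )                                                            # (H, W, N)
    return 1.0 / (1.0 + dists.min(axis=-1))
```

The resulting image is fed to the next stage alongside the input face, so the convolutional layers can "see" where the current landmark estimates lie.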

The featmap is a dense layer connected to the penultimate layer of the previous stage. Its goal is to transfer information learned by the previous stage to the next stage in a visual manner. While this has not been verified, we believe it can contain information such as head pose or facial expression, which can be useful to the next stage.
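As a rough sketch, a dense layer reshaped into a single-channel "feature image" could look like this in NumPy (all shapes and names here are assumptions for illustration, not the repo's actual dimensions):

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_image(prev_features, W, b, out_size=56):
    # Dense projection of the previous stage's penultimate activations,
    # reshaped into a single-channel image that the next stage receives
    # as an extra input channel. W and b are learnable parameters.
    flat = prev_features @ W + b          # (out_size * out_size,)
    return flat.reshape(out_size, out_size)

# Illustrative shapes only: 256 penultimate features -> 56x56 image.
feats = rng.standard_normal(256)
W = rng.standard_normal((256, 56 * 56)) * 0.01
b = np.zeros(56 * 56)
img = feature_image(feats, W, b)
```

Because the projection is a plain dense layer, it is trained by backpropagation like any other fully connected layer.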

Let me know if you have more questions.

Marek

dedoogong commented 7 years ago

If both maps deliver information from the previous stage, do they need a backward/gradient pass (i.e., should they be learnable)? If not, I just need to implement the 5 custom layers in Caffe using only the Forward_cpu/gpu methods (I'm trying that).

MarekKowalski commented 7 years ago

Hi,

The feature image layer is essentially just a dense layer that is later reshaped; it is definitely learnable and requires a backward pass. The landmark image layer does not have learnable parameters; it is a direct result of the landmarks output by the previous stage. If you plan to train the stages separately (as we did in the article), the forward pass should be enough, since the preceding layers and input are "constant". Having said that, doesn't Caffe calculate the gradients/backward pass by itself?

Marek

lx120 commented 7 years ago

Thanks for your reply. I have another question: when I used your function bestFit(destination, source, returnTransform=False) in MATLAB to align my own landmarks (roll angle of 20 degrees) to the frontal mean shape, I found the result is not good.

Could you give a link to the derivation of the formula?

MarekKowalski commented 7 years ago

Hi,

Do the destination and source landmark sets use the same number of landmarks? Are the landmarks in the same order? The method is described in Appendix D of "Introduction to Active Shape Models", an early article on face alignment; you can view it here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.141.3089&rep=rep1&type=pdf
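For reference, the least-squares similarity alignment from that appendix can be sketched as follows (a simplified NumPy version, not the repo's bestFit; both point sets must have the same number of landmarks in the same order):

```python
import numpy as np

def best_fit(destination, source):
    # Least-squares similarity transform (scale, rotation, translation)
    # that maps `source` onto `destination`. Both are (N, 2) arrays of
    # corresponding points in the same order.
    dst_mean = destination.mean(axis=0)
    src_mean = source.mean(axis=0)
    dst = destination - dst_mean
    src = source - src_mean

    norm = (src ** 2).sum()
    a = (src * dst).sum() / norm                                      # s * cos(theta)
    b = (src[:, 0] * dst[:, 1] - src[:, 1] * dst[:, 0]).sum() / norm  # s * sin(theta)

    # Apply (x, y) -> (a*x - b*y, b*x + a*y), then restore translation.
    M = np.array([[a, b],
                  [-b, a]])
    return (source - src_mean) @ M + dst_mean
```

If the landmark counts or orderings differ between the two sets, the correspondences are wrong and the fit will look bad regardless of the roll angle.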

Marek

lx120 commented 6 years ago

Hi Marek:

Some of the detailed data processing confuses me, could you explain it?

  1. When augmenting the training data, the code generates the face bbox from the ground-truth landmarks and uses bestFitRect() to get the initial landmarks. But the validation data uses a detected face bbox to generate the initial landmarks. Which method did you use to get the face bboxes of the validation data, VJ or dlib? I used the same method for the training data and MTCNN face bboxes for the validation data; however, the loss on the validation data is always high. Could you tell me some details about this?

  2. When I read your demo code CameraDemo.py, I found the initial landmarks are set with "self.initLandmarks = f["initLandmarks"]", but I did not find any processing to best-fit them to the input face (or did I miss the code that does this?)

Thanks for sharing!

luoxu

MarekKowalski commented 6 years ago

Hi,

1) The face detection bounding boxes are provided in the training set (300-W dataset).

2) I don't really understand this question, could you elaborate?

Best regards,

Marek

lx120 commented 6 years ago

Hi Marek:

I am sorry I didn't express my question clearly.

  1. Yes, the face detection bounding boxes are provided in the training set (300-W dataset) in your code. I want to know which face detection method was used to get these bounding boxes, because I used MTCNN-detected face bounding boxes to generate the validation set (the training set is produced the same way as in your code), then trained the net, but the loss on the validation set is not good. My dataset is 300W-LP.

  2. In your code CameraDemo.py, I found that the initial landmarks mean shape is not resized and centered to the face bounding box. Maybe I missed the processing code, so I want to be sure: does the code do this? Or does it just read the initial landmarks S_initial from the file, directly regress the residual landmarks, add them to S_initial, and output the result as the detected landmarks?

MarekKowalski commented 6 years ago

Hi,

Thanks for the clarification, here are my answers:

1) I do not know what face detector the authors of the dataset used. All they specify is that "The face region that our detector was trained on is defined by the bounding box as computed by the landmark annotations". This basically means that it is a tight bounding box around the landmark locations. Look at the dataset page to see an example: https://ibug.doc.ic.ac.uk/resources/300-W/

2) The utils.bestFitRect function scales and moves the mean shape so that it is roughly aligned with the landmarks from the previous frame. Is that what you had in mind?
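In rough terms, the operation can be sketched like this (a simplified stand-in for utils.bestFitRect; the repo's version may compute the scale differently):

```python
import numpy as np

def best_fit_rect(mean_shape, box):
    # Scale the mean shape to roughly fill the detection box, then
    # center it on the box. `box` is (x_min, y_min, x_max, y_max) and
    # `mean_shape` is an (N, 2) array of (x, y) landmark coordinates.
    box_w = box[2] - box[0]
    box_h = box[3] - box[1]
    scale = min(box_w / np.ptp(mean_shape[:, 0]),
                box_h / np.ptp(mean_shape[:, 1]))

    scaled = mean_shape * scale
    box_center = np.array([(box[0] + box[2]) / 2.0,
                           (box[1] + box[3]) / 2.0])
    return scaled - scaled.mean(axis=0) + box_center
```

In tracking mode the "box" is effectively replaced by the extent of the landmarks from the previous frame, which gives the rough alignment described above.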

Marek

lx120 commented 6 years ago

Hi Marek:

Yes! I think that, before detecting face landmarks, the code should do the utils.bestFitRect operation for the initial landmarks. But I didn't find the operation that changes the value of self.initLandmarks in class FaceAlignment. Maybe it is my carelessness, I will check again. Thanks!

Luo xu

MarekKowalski commented 6 years ago

Hi,

The value of self.initLandmarks never changes; it is always the mean face shape located in the middle of a 112x112 image.
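In other words, something like this sketch (illustrative names, not the repo's code):

```python
import numpy as np

def center_mean_shape(mean_shape, img_size=112):
    # Translate the mean shape so its centroid sits at the center of
    # an img_size x img_size image; the face crop fed to the network
    # is normalized to this frame, so the fixed shape is enough.
    return mean_shape - mean_shape.mean(axis=0) + img_size / 2.0
```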

Marek