ishay2b / VanillaCNN

Implementation of the Vanilla CNN described in the paper: Yue Wu and Tal Hassner, "Facial Landmark Detection with Tweaked Convolutional Neural Networks", arXiv preprint arXiv:1511.04031, 12 Nov. 2015. See the project page for more information: http://www.openu.ac.il/home/hassner/projects/tcnn_landmarks/ Written by Ishay Tubi: ishay2b [at] gmail [dot] com

Some questions about the paper "Facial Landmark Detection with Tweaked Convolutional Neural Networks" #18

Open MrBeandou opened 6 years ago

MrBeandou commented 6 years ago

I am confused about some points in 3.1 (Tweaking by fine-tuning) and 3.2 (Alignment-sensitive data augmentation). In 3.1 the paper says "we fine-tune the remaining weights from fc5 to the output, separately for each cluster using only its images". I understand this as fine-tuning each cluster with its own images, but then why does it say "here early stopping is used to fine-tune each sub-network"? If I only have the images of a given cluster, why can't I simply fine-tune that cluster's own network, rather than treating it as a sub-network? Second, if it is treated as a sub-network, does that mean the losses of all clusters should be computed together?
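For what it's worth, "fine-tune each sub-network with early stopping" can be sketched as a small loop like the one below. The `train_step`/`val_loss` helpers and the per-cluster data split are placeholders, not from the paper's or the repo's code:

```python
def fine_tune_cluster(train_step, val_loss, patience=5, max_epochs=100):
    """Fine-tune one cluster-specific sub-network, stopping early once
    the held-out loss has not improved for `patience` epochs."""
    best, waited = float("inf"), 0
    for epoch in range(max_epochs):
        train_step(epoch)           # one pass over this cluster's images only
        loss = val_loss()           # held-out loss for this cluster
        if loss < best - 1e-6:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:  # few images per cluster -> overfits fast
                break
    return best
```

Each cluster can run this loop independently; the losses never need to be summed across clusters.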

MrBeandou commented 6 years ago

In 3.2 the paper says "if k != k' then the generated image does not belong in the same cluster as the two images used to produce it and is therefore rejected", but at the bottom of 3.1 it says "this process treats horizontally flipped versions of the same face differently" (i.e. by different tweaking processes). If such an image should be rejected, why is it routed through a different tweaking process?
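The rejection rule from 3.2 can be sketched roughly as follows; here `assign_cluster` stands in for whatever GMM/EM-based assignment is used, and all names are hypothetical, not taken from the repo:

```python
def augment_cluster(pairs, blend, assign_cluster, k):
    """Keep only augmented samples that the clustering still assigns to
    cluster k, the cluster of the two source images (rejection step of 3.2)."""
    kept = []
    for img_a, img_b in pairs:         # both source images come from cluster k
        new_img = blend(img_a, img_b)  # alignment-sensitive augmentation
        if assign_cluster(new_img) == k:
            kept.append(new_img)       # k' == k: sample stays in the cluster
        # else: k' != k -> reject, the sample may be misaligned for cluster k
    return kept
```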

MrBeandou commented 6 years ago

In 3.2 it says "Oversampling and mirroring both introduce misaligned images into each cluster, increase landmark position variability and so undermine the goal of our fine-tuning". What does it mean by "increase landmark position variability and so undermine the goal of our fine-tuning"?

MrBeandou commented 6 years ago

I also have some questions about implementing the paper. First, should I train the vanilla CNN until the loss stops decreasing, and only then do the tweaking?

MrBeandou commented 6 years ago

If I extract the fc5 features to do EM, does that mean I only fine-tune the fc6 layer and fix the parameters of fc5 and all the layers before it?
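In Caffe this would usually be done by setting `lr_mult: 0` on every layer up to and including fc5 in the train prototxt. A framework-agnostic sketch of the same idea (the layer names are assumed for illustration, not read from the repo's prototxt):

```python
def select_trainable(layers, tune=("fc6",)):
    """Mark only the listed layers as trainable; every other layer
    (the conv stack and fc5) keeps its weights frozen during tweaking."""
    return {name: (name in tune) for name in layers}
```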

MrBeandou commented 6 years ago

In 3.1 (Tweaking by fine-tuning), the paper says this process treats horizontally flipped versions of the same face differently. Does that mean that for a horizontally flipped image I get the result from a different tweaking branch, and that I should flip the predicted landmarks back horizontally and average them with the landmarks predicted on the original image?
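If one did want to fuse the two branches' outputs that way, the bookkeeping would look roughly like this. This is only a sketch, assuming x-coordinates normalized to [0, 1] and a 5-point layout with a left/right index swap; none of it is from the repo:

```python
# Mirror the flipped branch's landmarks back into the original image frame,
# then average with the unflipped branch's prediction.
SWAP = [1, 0, 2, 4, 3]  # swap left/right eyes and mouth corners; nose stays

def unflip(landmarks):
    """Undo a horizontal flip: mirror x, then re-order left/right points."""
    mirrored = [(1.0 - x, y) for x, y in landmarks]
    return [mirrored[i] for i in SWAP]

def fuse(pred_orig, pred_flipped):
    """Average the original prediction with the mirrored-back flipped one."""
    back = unflip(pred_flipped)
    return [((x1 + x2) / 2, (y1 + y2) / 2)
            for (x1, y1), (x2, y2) in zip(pred_orig, back)]
```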

MrBeandou commented 6 years ago

The augmentation in 3.2 mentions the non-reflective similarity transform H. Can you show its mathematical expression?
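In case it helps: a non-reflective similarity transform is a rotation plus uniform scale plus translation (no mirroring), i.e. H = [[a, -b, tx], [b, a, ty]] with a = s·cosθ and b = s·sinθ. A generic least-squares fit from point correspondences (a sketch of the standard construction, not the paper's code) could look like:

```python
import numpy as np

def fit_nonreflective_similarity(src, dst):
    """Solve for H = [[a, -b, tx], [b, a, ty]] mapping src -> dst in the
    least-squares sense (rotation + uniform scale + translation only)."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    n = len(src)
    A = np.zeros((2 * n, 4))
    # dst_x = a*x - b*y + tx  ->  row [x, -y, 1, 0]
    A[0::2] = np.c_[src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)]
    # dst_y = b*x + a*y + ty  ->  row [y,  x, 0, 1]
    A[1::2] = np.c_[src[:, 1], src[:, 0], np.zeros(n), np.ones(n)]
    a, b, tx, ty = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)[0]
    return np.array([[a, -b, tx],
                     [b,  a, ty]])
```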

MrBeandou commented 6 years ago

Sorry for asking so many questions, and thanks a lot!

ishay2b commented 6 years ago
  1. "here early stopping is used to fine-tune each sub-network" — early stopping is needed because of the very low number of images in each cluster.

  2. "why can't I just use its own images to fine-tune its own network, rather than a sub-network?" — the number of images per cluster only allows you to fine-tune the last fully connected layer for each cluster.

  3. "if k != k' then the generated image does not belong in the same cluster as the two images used to produce it and is therefore rejected" — after augmentation, if the image is not assigned to the expected cluster index, do not use it, as it might be corrupted.

"If I extract the fc5 features to do EM, does that mean I just fine-tune fc6" — yes. "and fix the parameters of fc5 and all the layers before it" — not sure what you mean here.

I will try to follow up on the other questions in my next time slot.

MrBeandou commented 6 years ago

I don't think I made my question about "early stopping" clear. I want to know whether I should compute all the cluster losses together, dropping a cluster once its loss stops decreasing, or whether I can compute one cluster's loss at a time, finishing that cluster before moving on to the next.

MrBeandou commented 6 years ago

And I want to know why I can't do oversampling, such as rotation and flipping, in the tweaking step?