Open MrBeandou opened 6 years ago
In Section 3.2, the paper says "if k != k' then the generated image does not belong in the same cluster as the two images used to produce it and is therefore rejected", but at the end of 3.1 it says that "this process treats horizontally flipped versions of the same face differently" - they go through different tweaked sub-networks. If such an image should be rejected, why route it through a different tweaked sub-network?
Also, 3.2 says "Oversampling and mirroring both introduce misaligned images into each cluster, increase landmark position variability and so undermine the goal of our fine-tuning". What does it mean by "increase landmark position variability and so undermine the goal of our fine-tuning"?
I also have some questions about implementing the paper. First, should I train the vanilla CNN until the loss stops decreasing, and only then do the tweaking?
If I extract the fc5 features to run EM, does that mean I fine-tune only the fc6 layer and fix the parameters of fc5 and all above layers?
In 3.1, tweaking by fine-tuning, the paper says this process treats horizontally flipped versions of the same face differently. Does that mean that for a horizontally flipped image I get a result from a different tweaking branch, and I should flip the predicted landmark points back and average them with the original image's landmark points?
The augmentation in 3.2 mentions a non-reflective similarity transform H. Can you show its mathematical expression?
Sorry for asking so many questions, and thank you a lot!
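For reference, a *non-reflective* similarity transform is usually taken to mean uniform scale + rotation + translation, with no mirroring. The paper does not spell out the matrix, so this is only my reading; the function names below are hypothetical:

```python
import numpy as np

def similarity_transform(scale, theta, tx, ty):
    """Non-reflective similarity in homogeneous coordinates:
    H = [[ s*cos(t), -s*sin(t), tx],
         [ s*sin(t),  s*cos(t), ty],
         [ 0,         0,        1 ]]
    A *reflective* similarity would additionally flip the sign of one
    column, mirroring the image -- which this form excludes."""
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0,  0,  1]])

def apply_transform(H, points):
    """Apply H to an (N, 2) array of landmark points."""
    pts = np.hstack([points, np.ones((len(points), 1))])
    return (H @ pts.T).T[:, :2]
```

Because the matrix has det > 0, it never swaps left/right landmarks, which is exactly why mirroring has to be handled separately.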
"here early stopping is used to fine-tune each sub-network" - early stopping is needed due to the very low number of images in each cluster.
"why can't I just use its own images to fine-tune its own network, rather than treating it as a sub-network?" - the number of images in each cluster only allows you to fine-tune the last fully connected layer per cluster.
"if k!=k' then the generated image does not belong in the same cluster with the two images used to produce it and is therefore rejected" - after augmentation, if the image is not assigned to the expected cluster index, do not use it, as it might be corrupted.
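As I understand this rejection rule, an augmented image is kept only if the (already fitted, frozen) clustering still assigns it to the same cluster k as its two source images. A hypothetical sketch, with stand-in names for the pairing and assignment steps:

```python
def augment_cluster(images_k, k, make_pair_image, assign_cluster):
    """images_k: images already belonging to cluster k.
    make_pair_image: produces a new image from two cluster-k images
                     (hypothetical stand-in for the paper's augmentation).
    assign_cluster: maps an image to its cluster index using the fixed
                    clustering over fc5 features."""
    kept = []
    for i, a in enumerate(images_k):
        for b in images_k[i + 1:]:
            new_img = make_pair_image(a, b)
            if assign_cluster(new_img) == k:   # k' == k: accept
                kept.append(new_img)
            # else: k' != k -- the image drifted out of the cluster, reject
    return kept
```

The point is that rejection is a filter on *generated* images; it says nothing about how genuine mirrored faces are routed at test time, which is why the two statements in 3.1 and 3.2 are not in conflict.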
"if I extract the fc5 feature to do EM is that mean I just fine-tune fc6" - yes. "layer and fix the parameter of fc5 and all above layers" - not sure what you mean.
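In case it helps, "fix the parameters" in this per-cluster setting can be sketched as an update step that simply skips frozen layers. This is only an illustration with hypothetical layer names, not the repo's actual training code:

```python
import numpy as np

def sgd_step(params, grads, frozen, lr=0.01):
    """One SGD step that leaves frozen layers untouched.
    params/grads: dicts mapping layer name -> np.ndarray.
    frozen: set of layer names whose weights must not change
            (e.g. everything up to and including fc5 during tweaking)."""
    return {name: (w if name in frozen else w - lr * grads[name])
            for name, w in params.items()}
```

So "fine-tune only fc6" just means every layer except fc6 sits in the frozen set, keeping its vanilla-CNN weights.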
Will try to follow up on the other questions in my next time slot.
Maybe I didn't make myself clear about "early stopping". I want to know whether I should compute all the clusters' losses together, dropping a cluster when its loss stops decreasing, or whether I can compute one cluster's loss at a time: when that cluster is finished, move on to the next?
And I want to know why I can't do oversampling, such as rotation and flipping, in the tweaking step?
I am confused about some parts of 3.1, Tweaking by fine-tuning, and 3.2, Alignment-sensitive data augmentation. In 3.1, the paper says "we fine-tune the remaining weights from fc5 to the output, separately for each cluster using only its images". I understand this as fine-tuning each cluster using only its own images, but then why say "here early stopping is used to fine-tune each sub-network"? If I only have the images of one cluster, why can't I just use its own images to fine-tune its own network, instead of treating it as a sub-network? Second, if it is treated as a sub-network, does that mean all the clusters' losses should be computed together?
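My reading of the answer above is that the sub-networks are trained independently, each with its own early-stopping criterion, and the per-cluster losses are never summed into one joint objective. A hypothetical sketch of that per-cluster loop:

```python
def early_stop_epoch(val_losses, patience=2):
    """Given a stream of per-epoch validation losses for ONE cluster's
    sub-network, return the epoch of the best loss seen before early
    stopping halts training (after `patience` epochs without improvement)."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break
    return best_epoch
```

Under this reading you would run the loop once per cluster, one cluster after another, rather than tracking all cluster losses in a single training run.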