Blade6570 / icface

ICface: Interpretable and Controllable Face Reenactment Using GANs

Question about training Neutral Image #16

Open Hsintien-Ng opened 4 years ago

Hsintien-Ng commented 4 years ago

I am interested in your recent work, ICface, which has shown great performance on face manipulation. I want to train your framework on my own datasets but cannot find a training file in the released code, so I reproduced the training code from the implementation details described in your paper. However, I have run into a problem: the neutral face images become all zero when training the network with the reconstruction loss, because the second term |GN(LS) - GA(LS)| drives the Neutralizer (GN) to output all-zero or constant images (no face in the generated images).

Could you please share more details about training your proposed framework? Do you use multi-stage training or adopt other constraints so that the generated images GN(LS) do not collapse to all zeros? In addition, I have added LightCNN to compare the features of GN(LS) and the source image, but GN still fails to generate face images.

I would greatly appreciate it if you could provide the training code or more details to help solve this problem.
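To make the failure mode above concrete, here is a minimal sketch (with hypothetical stand-in networks, not ICface's actual generators) of why the |G_N(L_s) - G_A(L_s)| term alone has a degenerate minimum: if the gradient of the same L1 term flows into BOTH networks, they can jointly collapse onto a shared (e.g. constant) output instead of preserving the face.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
g_n = nn.Linear(4, 4)   # stand-in for the Neutralizer G_N
g_a = nn.Linear(4, 4)   # stand-in for the Attribute generator G_A
opt = torch.optim.Adam(list(g_n.parameters()) + list(g_a.parameters()), lr=0.05)

x = torch.randn(16, 4)
initial = nn.functional.l1_loss(g_n(x), g_a(x)).item()
for _ in range(300):
    opt.zero_grad()
    loss = nn.functional.l1_loss(g_n(x), g_a(x))  # gradient reaches BOTH nets
    loss.backward()
    opt.step()
final = nn.functional.l1_loss(g_n(x), g_a(x)).item()
print(final < initial)  # the two outputs are driven toward each other
```

Nothing in this objective anchors the shared output to the input content, which matches the blank/constant images described above.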

Blade6570 commented 4 years ago

Hi, I am not sure when I can upload the training script, because I have not had time to clean it up and post it; it is kind of a code dump. If you email me, I am happy to send you the training code. Regarding the problem you mentioned with training G_N, I can add a few points. I have attached the loss/backward-pass calculation block for G_N; G_A training is straightforward, so I am skipping it. Please have a look at the comments in the code block. I hope that fixes the problem with G_N.

```python
def backward_GR(self):
    # self.fake_B: generated neutral image
    # self.real_B: final ground truth
    # self.tar:    pseudo GT for the neutral image, generated by G_A
    # self.netG:   G_A
    # self.Neut:   neutral action units [0.5, 0.5, 0.5, 0, 0, ..., 0]

    # Discriminator for G_N; predicts real/fake and regresses the AUs.
    pred_fake, E_con = self.netDA(self.fake_B)

    self.loss_G_GANE = self.criterionGAN(pred_fake, True)                    # GAN loss
    self.sal_loss2 = self.criterionL1(E_con, self.Neut) * self.opt.lambda_A  # AU regression loss

    with torch.no_grad():
        # Here self.netG is the G_A in the equation. I calculate the ground
        # truth for the neutral image as self.tar. Note that the gradient is
        # not passed to G_A; otherwise G_A would know what G_N is doing and
        # could generate everything by itself, making the neutral image blank.
        self.tar = self.netG(torch.cat([self.fake_B, self.AUN], dim=1))

    # Calculate |G_N(L_s) - G_A(L_s)|
    self.recon = self.criterionL1(self.fake_B, self.tar.detach()) * self.opt.lambda_B

    # Grayscale image for LightCNN
    self.fake_B_gray = (self.fake_B[:, 0, :, :] * 0.299
                        + self.fake_B[:, 1, :, :] * 0.587
                        + self.fake_B[:, 2, :, :] * 0.114)

    # LightCNN loss
    self.recon_Light = self.criterionLight(self.fake_B_gray.unsqueeze(1),
                                           self.real_A_gray.detach()) * 0.5

    AUR = self.param_B.view(self.param_B.size(0), self.param_B.size(1), 1, 1).expand(
        self.param_B.size(0), self.param_B.size(1), 128, 128)

    # Pass the neutral image back through G_A with the final AUs to generate
    # the reenacted output, which is compared against the actual GT.
    self.fake_B_re = self.netG(torch.cat([self.fake_B, AUR], dim=1))
    self.R = self.criterionL1(self.fake_B_re, self.real_B.detach()) * self.opt.lambda_B  # reconstruction loss

    self.loss_GR = self.loss_G_GANE + self.sal_loss2 + self.recon + self.recon_Light + self.R
    self.loss_GR.backward()
```
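To isolate the stop-gradient trick from the comments above, here is a minimal sketch (hypothetical stand-in modules, not the repo's networks): because the pseudo ground truth is produced under `torch.no_grad()` and detached, the L1 reconstruction term updates only G_N, never G_A.

```python
import torch
import torch.nn as nn

g_n = nn.Linear(8, 8)  # stand-in for G_N
g_a = nn.Linear(8, 8)  # stand-in for G_A

x = torch.randn(4, 8)
fake_b = g_n(x)                      # "neutral image", part of the graph
with torch.no_grad():
    tar = g_a(fake_b)                # pseudo GT; no graph is recorded here

loss = nn.functional.l1_loss(fake_b, tar.detach())
loss.backward()

print(g_n.weight.grad is not None)   # True  -> this term trains G_N
print(g_a.weight.grad is None)       # True  -> G_A receives no gradient
```

Without the `no_grad()`/`detach()`, G_A would also be optimized to match G_N's output, which is exactly the collapse described earlier in the thread.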
Hsintien-Ng commented 4 years ago

Hey, thank you for answering my doubts. I still have some questions about the training details. By the way, I did send an email to the address provided in the paper but got no reply. Here is my email (kegrant@163.com). I would greatly appreciate it if you could send me the training code.

Hsintien-Ng commented 3 years ago

Could you explain the training scheme of G_N and G_A? I tried your code combined with G_A training. It fixed the blank-image problem, but now G_A and G_N output the same images as the input.

Do you optimize G_N and G_A together or separately?
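For reference, here is a sketch of the two schedules being asked about, with hypothetical stand-in modules (this does not claim to match the authors' actual schedule):

```python
import torch
import torch.nn as nn

g_n, g_a = nn.Linear(4, 4), nn.Linear(4, 4)
x, y = torch.randn(8, 4), torch.randn(8, 4)

# (a) Together: one optimizer over both parameter sets, one backward pass.
opt_joint = torch.optim.Adam(list(g_n.parameters()) + list(g_a.parameters()), lr=1e-3)
opt_joint.zero_grad()
nn.functional.l1_loss(g_a(g_n(x)), y).backward()   # gradients reach G_N and G_A
opt_joint.step()

# (b) Separately: each network has its own optimizer, and the other
# network's output is detached so only one of them is updated per step.
opt_a = torch.optim.Adam(g_a.parameters(), lr=1e-3)
g_n.zero_grad(set_to_none=True)
opt_a.zero_grad()
nn.functional.l1_loss(g_a(g_n(x).detach()), y).backward()
opt_a.step()
print(g_n.weight.grad is None)  # True: the detached step never touched G_N
```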