XingangPan / GAN2Shape

Code for GAN2Shape (ICLR 2021 oral)
https://arxiv.org/abs/2011.00844
MIT License

The problem of producing the same projected samples #21

Open 30qwq opened 3 years ago

30qwq commented 3 years ago

The first (top-left) image is the input image, the following ten images are pseudo images, and the last ten images are projected images. I wonder why I got ten identical projected images. Theoretically, the projected images should look like the pseudo images, but they don't. According to the code in model.py, the projected images are produced from the latent code, and the offset affects the latent code. But after a few training loops, the offset becomes zero, so the projected images all look the same. How can I solve this problem? Thanks.

[image: input image, ten pseudo images, and ten projected images]
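The collapse described above reduces to this minimal sketch (names such as latent_w and offsets are illustrative, not the actual GAN2Shape code):

import torch

# Minimal sketch of the reported collapse (hypothetical names, not the
# actual GAN2Shape implementation): each projected sample is generated
# from a shared latent code plus a learned per-view offset.
latent_w = torch.randn(1, 512)         # shared latent code
offsets = 0.1 * torch.randn(10, 512)   # per-view offsets

# If training drives every offset to zero...
offsets = torch.zeros_like(offsets)

# ...then all ten codes are identical, so the generator produces the
# same projected image ten times.
codes = latent_w + offsets             # shape (10, 512); every row equal
assert torch.equal(codes[0], codes[1])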

XingangPan commented 3 years ago

@30qwq This is a bit odd. Have you revised the code? I notice several differences from my results: 1) The viewpoint changes of the pseudo images are too large; I suspect the viewpoint variance has been enlarged. 2) The projected samples look like a 'flip' of the input image; I wonder if an additional flip operation was accidentally introduced. I recommend checking these factors; a quick diagnostic for the flip is sketched below.
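One way to test the flip hypothesis is to compare each projected sample against both the input image and its horizontal mirror (a generic diagnostic sketch, not code from this repo):

import torch
import torch.nn.functional as F

def flip_check(projected, input_im):
    # Generic diagnostic (not part of the GAN2Shape codebase): if the
    # flipped error is consistently lower, an accidental horizontal flip
    # has likely crept into the pipeline somewhere.
    input_flipped = torch.flip(input_im, dims=[3])  # flip along width (NCHW)
    err_direct = F.mse_loss(projected, input_im.expand_as(projected))
    err_flipped = F.mse_loss(projected, input_flipped.expand_as(projected))
    return err_direct.item(), err_flipped.item()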

30qwq commented 3 years ago

Yes, I've revised the code. I planned to use the projected images as input data for the model in unsup3d (https://github.com/elliottwu/unsup3d) to enlarge the dataset (the original dataset has only 640 images). The cause of problem 1 is that I changed the value of "view_scale". I'm still struggling to solve problem 2. On another note, why do my pseudo images (above) look like 3D objects? Is that normal? The pseudo images in your paper look flat.

XingangPan commented 3 years ago

@30qwq It is normal that pseudo images look like 3D objects, as they are rendered via a 3D mesh renderer. The 3D effects look more obvious in your case because the value of "view_scale" is larger.
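As a rough illustration of the view_scale effect (hypothetical sampling code, not the actual implementation), scaling up the viewpoint range spreads the sampled pitch/yaw angles further from the canonical view, which exaggerates the rendered 3D effect:

import torch

# Hypothetical illustration (not the actual GAN2Shape code): a larger
# view-scale factor widens the sampled viewpoint range, so pseudo
# samples are rendered from more extreme pitch/yaw angles.
base_range = torch.tensor([20.0, 45.0])  # nominal [pitch, yaw] range in degrees
view_scale = 2.0                         # enlarging this widens the spread

# Sample 10 viewpoints uniformly within the scaled range:
views = (2 * torch.rand(10, 2) - 1) * base_range * view_scale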

30qwq commented 3 years ago

@XingangPan The image below is generated by the function below. I wonder why the images generated by gan_invert() (the last ten images) don't look similar to the input images of gan_invert() (the first ten images). The gan_invert() function is unchanged from your code. Thanks for your reply.

[image: pseudo images and their gan_invert() outputs]

def pseudo_sample(self, save_dir):
    os.makedirs(os.path.join(save_dir, 'pseudo2'), exist_ok=True)
    if self.count % 640 == 0:
        self.count = 0
    with torch.no_grad():
        depth, texture, view = self.canon_depth[0, None], self.canon_im[0, None], self.view[0, None]
        num_p, num_y = 3, 7  # number of pitch and yaw angles to sample
        max_y = 90
        maxr = [40, max_y]  # [pitch_angle, yaw_angle]

        # Render pseudo samples: the canonical image re-rendered under rotated views
        im_rotate2 = self.renderer.render_view(texture, depth, maxr=maxr, nsample=[num_p, num_y], grid_sample=True)[0]

        # Project the pseudo samples back through the GAN
        gan_rotate, _ = self.gan_invert(im_rotate2, batchify=10)

        # Save input image, pseudo samples, and projected samples in one grid
        output = torch.cat((self.input_im, im_rotate2, gan_rotate), 0)
        fname = str(self.count % 640) + '.png'
        filename = os.path.join(save_dir, 'pseudo2', fname)
        save_img(output / 2 + 0.5, filename)  # map from [-1, 1] to [0, 1]
        self.count += 1
        return gan_rotate

XingangPan commented 3 years ago

@30qwq I see two possible reasons: 1) In my original code, the backgrounds of the pseudo samples (im_rotate2) should be gray, as the grid_sample mode pads the background with zeros; in your results they are white, which might cause inconsistency. 2) The rotation range of the projected samples (gan_rotate) is constrained by the training data distribution, so the GAN cannot recover faces at large rotation angles. You may reduce the rotation angles of the pseudo samples in training and inference.
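Applied to the pseudo_sample function above, a minimal adjustment along these lines might look like this (illustrative values, not tuned recommendations):

# Illustrative changes to pseudo_sample, following the two points above:

# 1) Keep grid_sample's zero padding so pseudo-sample backgrounds stay
#    gray (zero in [-1, 1] space maps to 0.5 after `output / 2 + 0.5`);
#    avoid any post-processing that whitens the background.

# 2) Shrink the rotation range so projected samples stay within the
#    GAN's training distribution:
max_y = 45           # instead of 90
maxr = [20, max_y]   # instead of [40, 90]; [pitch_angle, yaw_angle]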