Zhengxinyang / SDF-StyleGAN


3D GAN Inversion #5

Closed bluestyle97 closed 1 year ago

bluestyle97 commented 1 year ago

Hi, thanks for your excellent work! I notice that you conduct GAN inversion experiments in the paper. Could you please share the code for SDF-StyleGAN inversion? I believe it would enable many more applications.

Zhengxinyang commented 1 year ago

The SDF-StyleGAN inversion code is not quite clean enough to share, but let me give you the code at its heart. My hope is that this is enough for you to write your own function with minimal effort.

For point cloud inversion:

from torch.optim import Adam

def optimize_code(pc, z_code, model, lr=1e-3, step=20):
    # z_code must be a leaf tensor with requires_grad=True so the
    # optimizer can update it in place
    opt = Adam([z_code], lr=lr, betas=(0.9, 0.999))
    for i in range(step):
        # Evaluate the generator's SDF at the query points pc
        sdf = model.forward_with_code_and_pc(
            z_code.unsqueeze(0), pc, space="Z", return_gradient=False)
        # loss_function measures how well the SDF explains the point cloud,
        # e.g. by penalizing nonzero SDF values at surface points
        loss = loss_function(sdf)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z_code
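
A minimal usage sketch, assuming pc is a tensor of surface points, model is a pretrained generator, and the latent dimension is 512; the loss_function here is just one plausible choice (surface points should evaluate to zero SDF), not necessarily the one used in the paper:

import torch

# Assumed surface loss: points sampled on the surface should have an SDF
# of zero, so penalize the mean absolute SDF value.
def loss_function(sdf):
    return sdf.abs().mean()

z_code = torch.randn(512, requires_grad=True)  # assumed latent dimension
z_code = optimize_code(pc, z_code, model, lr=1e-3, step=200)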

For SVR (single-view reconstruction), you can train a regression model like:

import torch

def forward(self, batch):
    # self.model: image encoder that predicts a latent code from the image
    img = batch['img'].cuda()
    code = self.model(img)
    batch_size = img.shape[0]
    # self.pts: (1, N, 3) query points, broadcast across the batch
    pts = self.pts.expand(batch_size, -1, -1)
    # Decode the predicted codes into SDF values with the frozen generator
    sdf = self.gan.forward_with_code_and_pc(
        code, pts, space=self.space, return_gradient=False)
    sdf_gt = batch['sdf'].cuda()
    # MSE between predicted and ground-truth SDF samples
    loss = torch.mean((sdf.view(-1) - sdf_gt.view(-1)) ** 2)
    return loss
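
A hedged sketch of the surrounding training loop, assuming a regressor module with this forward method, a frozen pretrained generator, and a dataloader yielding 'img' and 'sdf' tensors (all names here are placeholders, not the authors' code):

import torch

# Only the image encoder is trained; the SDF-StyleGAN generator stays frozen.
encoder_opt = torch.optim.Adam(regressor.model.parameters(), lr=1e-4)
for batch in dataloader:
    loss = regressor.forward(batch)
    encoder_opt.zero_grad()
    loss.backward()
    encoder_opt.step()
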
bluestyle97 commented 1 year ago

What about SDF grid inversion? Is it similar to point cloud inversion?

Zhengxinyang commented 1 year ago

Yes. If you set pc to the points of the grid, it is the same as point cloud inversion.
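
As a minimal sketch of building the grid query points, assuming the shape lives in the normalized cube [-1, 1]^3 (the resolution and coordinate range here are assumptions):

import torch

# Build a res^3 lattice of query points and flatten it into a point set
# that can be passed as pc to optimize_code above.
res = 64  # assumed resolution; memory grows cubically with res
axis = torch.linspace(-1.0, 1.0, res)
grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"), dim=-1)
pc = grid.reshape(1, -1, 3)  # (1, res**3, 3)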

bluestyle97 commented 1 year ago

I want to invert the SDF grids in the training dataset into latent codes. I implemented a simple optimization-based inversion pipeline that directly optimizes the latent code (in w space, initialized to the mean latent code) instead of using an encoder. The loss is computed as F.l1_loss(sdf_gen, sdf_gt), where sdf_gen and sdf_gt are the GAN-generated SDF grid and the ground-truth SDF grid from the training dataset, respectively (both at 128^3 resolution). However, I find that the loss does not decrease and the optimization does not converge. I have tried adjusting the learning rate, but it does not help. Do you have any suggestions? Would encoder-based inversion eliminate this problem?

Zhengxinyang commented 1 year ago

  1. You can try torch.optim.lr_scheduler.CosineAnnealingLR to jump out of local minima (see the sketch after this list).
  2. In our experience, the optimization is sensitive to the initial value, and using an encoder to get a better initial value helps a lot.
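
A minimal sketch combining both suggestions, assuming a pretrained encoder supplying the initial w, a frozen generator gan, grid query points pts, and a ground-truth grid sdf_gt (all of these names are placeholders):

import torch
import torch.nn.functional as F
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

steps = 500
# Encoder-based initialization (suggestion 2); detach so only w is optimized
w = encoder(img).detach().clone().requires_grad_(True)
opt = Adam([w], lr=1e-2)
# Cosine learning-rate schedule (suggestion 1) to help escape plateaus
sched = CosineAnnealingLR(opt, T_max=steps)
for i in range(steps):
    sdf_gen = gan.forward_with_code_and_pc(
        w, pts, space="W", return_gradient=False)
    loss = F.l1_loss(sdf_gen.view(-1), sdf_gt.view(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()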