barisgecer / GANFit

Project Page of 'GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction' [CVPR2019]
http://openaccess.thecvf.com/content_CVPR_2019/html/Gecer_GANFIT_Generative_Adversarial_Network_Fitting_for_High_Fidelity_3D_Face_CVPR_2019_paper.html
GNU General Public License v3.0

How to iteratively update the input vector Pt of PGGAN #13

Closed Hpjhpjhs closed 2 years ago

Hpjhpjhs commented 3 years ago

Thanks for your great work. I have some questions about the input parameter Pt of PGGAN. Do we first need to roughly update Pt by minimizing the per-pixel Manhattan distance between the UV texture of the input 2D image and the output UV texture of PGGAN, and then refine Pt through the model-fitting loss? In short, is a two-step adjustment of Pt needed? I would appreciate a reply.

barisgecer commented 3 years ago

We do not have the UV texture of the input image, so how could we do that?

No, it is not a two-step adjustment. Although you can find a better initialization point to start the optimization (such as https://arxiv.org/abs/2105.07474), fitting only through the rendered textured trimesh (via its distance to the input image) should be sufficient.
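For concreteness, here is a minimal sketch of what such a single-stage fitting loop could look like. This is not our released code; `texture_generator`, `differentiable_render`, and the mesh/camera/lighting arguments are placeholders for components you would supply.

```python
import torch

def fit_texture_latent(texture_generator, differentiable_render, mesh, camera,
                       lighting, input_image, latent_dim=512, steps=200, lr=0.01):
    """Optimize the PGGAN latent p_t so the rendered textured mesh matches the photo.

    All arguments except latent_dim/steps/lr are placeholders: a frozen PGGAN
    texture generator, a differentiable renderer, fitted shape/camera/lighting
    parameters, and the input image tensor.
    """
    p_t = torch.zeros(1, latent_dim, requires_grad=True)   # texture latent to be fitted
    optimizer = torch.optim.Adam([p_t], lr=lr)

    for _ in range(steps):
        uv_texture = texture_generator(p_t)                 # latent -> UV texture
        rendered = differentiable_render(mesh, uv_texture, camera, lighting)
        loss = (rendered - input_image).abs().mean()        # per-pixel L1 (Manhattan) distance
        optimizer.zero_grad()
        loss.backward()                                      # gradients flow through renderer
        optimizer.step()                                      # and generator back into p_t

    return p_t.detach()
```

The key point is that the loss is computed in image space after rendering, so the optimization never needs a ground-truth UV texture of the input photo.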

Regards, Baris

Hpjhpjhs commented 3 years ago

@barisgecer Thanks for your reply. Initially, I used 5000 UV textures to train PGGAN, then fixed the trained PGGAN as a texture generator and optimized its latent vector Pt according to the Manhattan distance between the rendered image and the input image. The purpose was to independently verify the effectiveness of PGGAN. However, after the fitting process finishes, the output UV texture of PGGAN does not match the identity of the input image, as shown below:

[image] The first row shows the input images and the UV textures generated by the PCA method; the second row shows the UV textures generated by the pretrained PGGAN.

Besides, if I only test the fitting ability of PGGAN, like a GAN inversion, the results are as follows:

[image] The left image is the original UV texture; the right one is the UV texture generated by the pretrained PGGAN.

Is my fitting method correct? Could you provide a more detailed texture fitting process?

Besides, "fitting with a generator network can be formulated as an optimization that minimizes per-pixel Manhattan distance between target texture in UV space Iuv and the network output G(pt) with respect to the latent parameter pt" in GANFIT paper, what's the 'target texture in UV space', the input real image or not? Can I say that the distance is between the input image and the rendered image corresponding to G(pt)?

Thank you for your great work again. I look forward to receiving your reply.

barisgecer commented 3 years ago

Well, your approach is completely different from ours. You assume that you can extract the UV texture from the input image and do the fitting on that texture (why would you need texture fitting if you already have the texture in the first place?). However, we believe that extracting pixels from the input image and completing the missing parts would bake a lot of scene illumination into the texture. GANFit does the fitting after rendering with a proper lighting model in order to disentangle illumination and texture. Since we have a texture model that is quite high resolution and has a consistent kind of illumination (which we call albedo; in fact it still has some highlights), we can then completely remove the lighting with methods such as AvatarMe. If we had a mix of lighting conditions, AvatarMe would not be that successful.

I believe you should first understand that our fitting approach is based on rendering the texture and comparing it directly with the input image through a combination of loss functions, which includes an identity loss as well (please see the paper). Yours, in contrast, assumes the target texture is available and computes a primitive pixel-to-pixel Manhattan distance, which does not take identity into account and therefore cannot have a global effect.
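For illustration only, the kind of combined objective I mean could be sketched like this. This is not the GANFit code; `face_embedder` stands in for a pretrained face recognition network and the weights are placeholders, so please refer to the paper for the actual loss terms and values.

```python
import torch
import torch.nn.functional as F

def combined_loss(rendered, input_image, face_embedder, w_pix=1.0, w_id=1.0):
    """Reconstruction + identity objective (illustrative weights only)."""
    # per-pixel term between the rendered face and the photo
    pixel_loss = (rendered - input_image).abs().mean()
    # identity term: cosine distance between face-recognition embeddings,
    # which constrains the fit globally to the person's identity
    emb_r = F.normalize(face_embedder(rendered), dim=-1)
    emb_i = F.normalize(face_embedder(input_image), dim=-1)
    identity_loss = 1.0 - (emb_r * emb_i).sum(dim=-1).mean()
    return w_pix * pixel_loss + w_id * identity_loss
```

Because the identity term compares embeddings of the whole rendered face against the input photo, it pushes the latent globally toward the right person rather than only matching pixels locally.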