barisgecer / GANFit

Project Page of 'GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction' [CVPR2019]
http://openaccess.thecvf.com/content_CVPR_2019/html/Gecer_GANFIT_Generative_Adversarial_Network_Fitting_for_High_Fidelity_3D_Face_CVPR_2019_paper.html
GNU General Public License v3.0

About the Texture GAN #9

Closed shoukna closed 4 years ago

shoukna commented 4 years ago

Hi @barisgecer, when I am reproducing the texture GAN, I encountered some problems.

The texture GAN's input is a 512-dimensional latent vector, and the Arcface output is also 512-dimensional. Does this mean that the output of Arcface is the input of the GAN? And is this output the p_t parameter?

What's more, once the texture GAN is trained, the output should be a [512, 512, 3] UV map, but in Figure 2 of the paper the output's height and width are not equal, so what is the output of the texture GAN? And how can the output UV maps be put in correspondence with the 3D vertices so that the colors can be reprojected?

Looking forward to your reply. Thank you so much!

VectXmy commented 4 years ago

I am also reproducing GANFIT. I think all params are optimized end-to-end.

shoukna commented 4 years ago

I am also reproducing GANFIT. I think all params are optimized end-to-end.

Thanks, I get it. Have you reproduced the texture GAN part? @VectXmy

VectXmy commented 4 years ago

It's just a PGGAN. The key issue is that a large amount of high-resolution texture data is difficult to obtain.

shoukna commented 4 years ago

Yes, I have found some UV maps and trained the GAN, but do you know how to map the GAN's output UV map onto the 3D vertices? I haven't found a solution for this.

What's more, the paper says 'render a secondary image with random expression, pose and illumination' and 'sample camera and illumination params from Gaussian distribution of 300W-3D dataset'. Did you get these params from the 300W-3D dataset?

shoukna commented 4 years ago

@VectXmy Could you please tell me whether you used tensorflow/pytorch or just numpy to reproduce this paper?

VectXmy commented 4 years ago

I just did some experiments using pytorch. Lack of data is a big problem.

shoukna commented 4 years ago

I am trying to use the BFM texture to implement the whole process first, but after computing the loss I can't optimize the params through backpropagation. Have you tried this part? @VectXmy

barisgecer commented 4 years ago

Sorry for my late reply, here are some answers:

The texture GAN's input is a 512-dimensional latent vector, and the Arcface output is also 512-dimensional. Does this mean that the output of Arcface is the input of the GAN? And is this output the p_t parameter?

No, it is just a coincidence that both vectors have the same size. The Arcface outputs for the target image and for the rendered reconstruction are compared and their difference is minimized. The input of the GAN is Gaussian noise (which is optimized during reconstruction).
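As a rough PyTorch sketch of what such a fitting loop could look like (not the authors' code; `texture_gan`, `renderer`, `arcface`, `target_image` and the parameter sizes are all placeholder assumptions, and the real pipeline also uses landmark and pixel losses):

```python
# Minimal sketch of latent optimization against an identity loss.
import torch

z = torch.randn(1, 512, requires_grad=True)              # GAN latent (p_t), initialized as Gaussian noise
shape_params = torch.zeros(1, 157, requires_grad=True)   # shape/expression coefficients (size is illustrative)
cam_params = torch.zeros(1, 6, requires_grad=True)       # camera pose (illustrative)

optimizer = torch.optim.Adam([z, shape_params, cam_params], lr=0.01)
target_emb = arcface(target_image).detach()              # 512-d identity embedding of the input photo

for step in range(200):
    optimizer.zero_grad()
    uv_texture = texture_gan(z)                          # e.g. 512x512x3 UV map
    rendering = renderer(shape_params, cam_params, uv_texture)
    # identity loss: make the rendering's embedding match the target's
    loss = 1.0 - torch.nn.functional.cosine_similarity(arcface(rendering), target_emb).mean()
    loss.backward()                                      # gradients flow back to z and the 3DMM params
    optimizer.step()
```

The point is that the latent code is treated as a free variable of the optimization, not predicted by Arcface.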

What's more, once the texture GAN is trained, the output should be a [512, 512, 3] UV map, but in Figure 2 of the paper the output's height and width are not equal, so what is the output of the texture GAN? And how can the output UV maps be put in correspondence with the 3D vertices so that the colors can be reprojected?

You are right, the UV topology that I used is not square; I made it square by cropping and zero-padding. After reconstruction, I reverse this preprocessing. That is why you see that weird coloring around the ears. In the ideal case, you should have a square UV map.
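A minimal sketch of this crop/zero-pad trick and its inverse (sizes and centering are illustrative, not the exact preprocessing used in the paper):

```python
import numpy as np

def pad_to_square(uv, size=512):
    """Zero-pad an H x W x 3 UV map (H, W <= size) to size x size for GAN training."""
    h, w, _ = uv.shape
    out = np.zeros((size, size, 3), dtype=uv.dtype)
    top, left = (size - h) // 2, (size - w) // 2
    out[top:top + h, left:left + w] = uv
    return out, (top, left, h, w)

def unpad(uv_square, meta):
    """Reverse the padding after reconstruction to recover the original UV topology."""
    top, left, h, w = meta
    return uv_square[top:top + h, left:left + w]
```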

As @VectXmy mentioned, the main problem is data; unfortunately, I am not authorized to share the data due to license issues. But you can train a GAN with other datasets or have a look at our other project, TBGAN, for a pretrained model release (we plan to release it soon).

Yes, I have found some UV maps and trained the GAN, but do you know how to map the GAN's output UV map onto the 3D vertices? I haven't found a solution for this.

Correspondence should be done via texture coordinates (t_coords), which are pre-defined within the topology.
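For instance, a per-vertex color lookup from the UV map could be sketched like this (nearest-neighbour sampling; the assumption that `tcoords` are per-vertex (u, v) values in [0, 1] and that v needs flipping follows the common convention and may differ for your topology):

```python
import numpy as np

def sample_uv(uv_map, tcoords):
    """uv_map: H x W x 3 texture; tcoords: N x 2 (u, v) in [0, 1]; returns N x 3 vertex colors."""
    h, w, _ = uv_map.shape
    u = np.clip((tcoords[:, 0] * (w - 1)).round().astype(int), 0, w - 1)
    v = np.clip(((1.0 - tcoords[:, 1]) * (h - 1)).round().astype(int), 0, h - 1)  # image rows grow downward
    return uv_map[v, u]
```

Renderers such as tf_mesh_renderer do this interpolation for you once the mesh carries texture coordinates, so in practice you only need the t_coords shipped with the topology.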

What's more, the paper says 'render a secondary image with random expression, pose and illumination' and 'sample camera and illumination params from Gaussian distribution of 300W-3D dataset'. Did you get these params from the 300W-3D dataset?

As far as I remember, 300W-3D contains those parameters in its ground truth.
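One way to build the Gaussian to sample from could look like the sketch below (the directory layout and the `Pose_Para`/`Illum_Para` field names are assumptions based on the dataset's .mat ground-truth files; check your own copy):

```python
import glob
import numpy as np
from scipy.io import loadmat

params = []
for path in glob.glob('300W-3D/*/*.mat'):                 # path pattern is illustrative
    gt = loadmat(path)
    params.append(np.concatenate([gt['Pose_Para'].ravel(), gt['Illum_Para'].ravel()]))
params = np.stack(params)

mean, cov = params.mean(axis=0), np.cov(params, rowvar=False)
random_sample = np.random.multivariate_normal(mean, cov)  # one random pose/illumination draw
```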

Let me know if you have any further questions.

shoukna commented 4 years ago

Thank you very much for your reply; it helped me solve many problems.

But I still have two questions. In Fig. 2, the 3D projection onto the 2D image contains only the face and no ears. Did you reconstruct the ears?

In addition, I have used BFM to implement the entire process, but it takes a long time to reconstruct a single image. You mentioned in the paper that 'The fitting converges in around 30 seconds on an Nvidia GTX 1080 TI GPU for a single image'; is there any other acceleration?

Best regards.

barisgecer commented 4 years ago

But I still have two questions. In Fig. 2, the 3D projection onto the 2D image contains only the face and no ears. Did you reconstruct the ears?

Ears are ignored during optimization due to misalignment problems. Also, LSFM does not model ears properly, so we chopped the ears off during the optimization. The resulting reconstruction still gives plausible ears thanks to PCA.

In addition, I have used BFM to implement the entire process, but it takes a long time to reconstruct a single image. You mentioned in the paper that 'The fitting converges in around 30 seconds on an Nvidia GTX 1080 TI GPU for a single image'; is there any other acceleration?

The time should not change with the face model. The code needs to be optimized properly. Are you using tf_mesh_renderer? Did you implement every operation in tensorflow? At what resolution are you rendering? How many iterations? Does the reconstruction improve over time? How long does it take for you, and are you satisfied with the reconstruction?

shoukna commented 4 years ago

Yes, I use tf_mesh_renderer, but I directly use the BFM texture instead of PGGAN, and the operations related to the secondary image are not implemented. Over time, the reconstruction first gets better and then worse. The best results are reached after a few hundred iterations, which takes about 3-5 minutes. I think it may be a problem in my implementation; I will check it later.

In addition, in the 300W-3D dataset I found some coefficients, but only part of Illum_Para and Pose_Para; there are no direct lighting-position or camera parameters. Could you please tell me which specific parameters you used?

Many thanks!

barisgecer commented 4 years ago

I think you are on the right path. It is expected that the reconstruction overfits after some iterations; that is why the secondary pose is useful. You may need to optimize your code, find the bottlenecks, and parallelize as much as possible. Keep the rendering resolution low and apply other code optimizations.

The 300W-3D camera and illumination parameters might be in a slightly different format, e.g. polar vs. Cartesian coordinates. I don't remember exactly, but I had to do some conversion. You can easily figure it out by looking at the renderings of a few samples from 300W-3D.
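For reference, a generic spherical-to-Cartesian conversion for a light direction could look like the sketch below; the exact angle convention in 300W-3D may differ, so verify against a few rendered samples as suggested above.

```python
import numpy as np

def spherical_to_cartesian(azimuth, elevation, radius=1.0):
    """Convert polar angles (radians) to an (x, y, z) direction, y-up convention assumed."""
    x = radius * np.cos(elevation) * np.sin(azimuth)
    y = radius * np.sin(elevation)
    z = radius * np.cos(elevation) * np.cos(azimuth)
    return np.array([x, y, z])
```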

shoukna commented 4 years ago

Okay, thank you very much for solving so many problems for me, and thank you for such great work. I hope I can successfully reproduce it.

Best regards.

hanyanghong86 commented 3 years ago

Yes, I have found some UV maps and trained the GAN, but do you know how to map the GAN's output UV map onto the 3D vertices? I haven't found a solution for this.

What's more, the paper says 'render a secondary image with random expression, pose and illumination' and 'sample camera and illumination params from Gaussian distribution of 300W-3D dataset'. Did you get these params from the 300W-3D dataset?

Hi, I want to ask how to define the texture coordinates that put the 3D vertices in correspondence with the UV texture map. Do you know of any learning materials on this topic?

RodneyPerhaps commented 3 years ago

@shoukna Hello! Have you succeeded in reproducing GANFIT? Could you share the code in a repository? I would really appreciate it!

VectXmy commented 3 years ago

I implemented GANFIT using pytorch. (link)

shoukna commented 3 years ago

@RodneyPerhaps Sorry, I have not successfully reproduced it, and I haven't worked on this for a long time.