harlanhong / CVPR2022-DaGAN

Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
https://harlanhong.github.io/publications/dagan.html

Depth map from paper not reproducible #20

Closed: mrokuss closed this issue 2 years ago

mrokuss commented 2 years ago

Hi

First, thank you for this awesome work. However, when I tried to reproduce the depth map from the paper using the demo.py script, the result was quite different from the one shown in Fig. 9 of the paper.
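For reference, here is roughly how I rendered the depth prediction into the image below (a minimal sketch; dumping the prediction to `depth.npy`, the normalization, and the colormap are my own choices, not part of demo.py):

```python
# Sketch of the depth visualization (assumption: the depth prediction was
# dumped from demo.py to "depth.npy" as an (H, W) float array).
import numpy as np
import matplotlib.pyplot as plt

depth = np.load("depth.npy")
# Normalize to [0, 1] so the colormap covers the full value range.
depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)

plt.imshow(depth, cmap="plasma")
plt.axis("off")
plt.savefig("depthmap_myrun.png", bbox_inches="tight")
```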

Result from the paper: depthmapDaGAN paper

Result running the script: depthmapDaGAN_myRun

Corresponding depth map as pointcloud: depthmapDaGAN_myRunPCD
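The point cloud above was produced by back-projecting the depth map through an assumed pinhole camera (a sketch using Open3D; the intrinsics fx, fy, cx, cy are placeholders, since the predicted depth is only defined up to scale):

```python
# Back-project an (H, W) depth map into a 3D point cloud.
# Assumption: a simple pinhole camera; the intrinsics below are placeholders.
import numpy as np
import open3d as o3d

depth = np.load("depth.npy")              # (H, W) predicted depth
h, w = depth.shape
fx = fy = max(h, w)                        # placeholder focal length
cx, cy = w / 2.0, h / 2.0                  # principal point at image center

u, v = np.meshgrid(np.arange(w), np.arange(h))
z = depth
x = (u - cx) * z / fx
y = (v - cy) * z / fy
xyz = np.stack([x, y, z], axis=-1).reshape(-1, 3)

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(xyz)
o3d.visualization.draw_geometries([pcd])   # interactive viewer
```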

The depth map I get is much smoother, and facial details like the nose and mouth are completely lost.
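To make "smoother" measurable rather than purely visual, one can compare the high-frequency content of the two depth maps, e.g. via Laplacian variance (a sketch; `paper_depth.npy` and `my_depth.npy` are hypothetical dumps):

```python
# Quantify how much fine structure (nose, mouth edges) a depth map retains:
# higher Laplacian variance = more high-frequency detail.
import numpy as np
import cv2

def laplacian_variance(depth: np.ndarray) -> float:
    d = depth.astype(np.float32)
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)  # normalize for a fair comparison
    return float(cv2.Laplacian(d, cv2.CV_32F).var())

for name in ["paper_depth.npy", "my_depth.npy"]:  # hypothetical file names
    print(name, laplacian_variance(np.load(name)))
```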

harlanhong commented 2 years ago

Hi @mrokuss,

A more detailed depth map can improve the performance. The released depth model is the one trained only on VoxCeleb1 in our experiments; the result in Figure 9 comes from a model trained on VoxCeleb2 during the rebuttal phase. Because of the camera-ready submission rules, we could not make major changes to the paper, so we only added Fig. 9 to show that our depth network can produce better, more detailed depth maps when trained on a larger amount of data. We have not yet used this depth model to train DaGAN. I will release the depth model first and then release the DaGAN checkpoint trained with it.

NikitaKononov commented 2 years ago

I will release the depth model first and then release the DaGAN checkpoint trained with it.

Hello! It would be very cool if you could release a model with 512x512 input/output size in the future. You have the GPU resources that such training requires; modest students like me don't have that opportunity :( It would then be possible to use your model for high-quality talking-head avatar generation, for example on educational platforms. Thanks a lot for your work!