harlanhong / CVPR2022-DaGAN

Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
https://harlanhong.github.io/publications/dagan.html

The generated face remains the same pose #13

Closed hallwaypzh closed 2 years ago

hallwaypzh commented 2 years ago

Thanks for your good work; however, when I tried running the demo, the generated video tends to keep the same pose as the source image, while in the paper (Figure 2) the generated results follow the driving frame's pose (this is also the case for the results in the README). Why is this the case?

https://user-images.githubusercontent.com/29053705/165462856-da97c242-b091-4609-b122-414c4216f492.mp4

harlanhong commented 2 years ago

This is the challenge to be solved. At the beginning, we find the best-aligned frame in the driving video and use it as an anchor, denoted A (S is the source image, and the other frames in the driving video are D_i). This problem occurs when the anchor A and S are not perfectly aligned. For example, none of the frames in the driving video you provided is perfectly aligned with the source image. You can see the same problem with FOMM.
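The anchor-selection step described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: `find_best_frame` is a hypothetical helper, and it assumes you already have 2D keypoints for the source image and every driving frame (e.g. from a landmark detector).

```python
import numpy as np

def find_best_frame(source_kp, driving_kps):
    """Pick the driving frame whose normalized keypoints are closest
    to the source image's keypoints, to use as the anchor A.

    source_kp:   (K, 2) array of keypoints for the source image S.
    driving_kps: list of (K, 2) arrays, one per driving frame D_i.
    Returns the index of the best-aligned frame.
    """
    def normalize(kp):
        # Remove translation and scale so only pose/expression
        # differences contribute to the distance.
        kp = kp - kp.mean(axis=0, keepdims=True)
        scale = np.hypot(*(kp.max(axis=0) - kp.min(axis=0)))
        return kp / scale

    src = normalize(source_kp)
    dists = [np.sum((normalize(kp) - src) ** 2) for kp in driving_kps]
    return int(np.argmin(dists))
```

If no driving frame is close to the source after normalization, the anchor A is poorly aligned with S and the generated video tends to inherit the source pose, which is the failure mode discussed in this thread.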

hallwaypzh commented 2 years ago

Thanks for your reply, that makes much more sense now; but what about the reenactment examples in the paper (Fig. 6, 7)? Were those also generated in this way?

harlanhong commented 2 years ago

> Thanks for your reply, that makes much more sense now; but what about the reenactment examples in the paper (Fig. 6, 7)? Were those also generated in this way?

No, the results reported in our paper use absolute coordinates, since we run our evaluation code in a frame-to-frame manner instead of frame-to-video. But I think the latter manner could produce better results.
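The distinction between the two modes can be sketched as below. This is an illustrative simplification, not the repository's implementation: `transfer_kp` is a hypothetical helper operating on raw keypoint arrays.

```python
import numpy as np

def transfer_kp(kp_source, kp_driving, kp_anchor, relative=True):
    """Sketch of the two motion-transfer modes discussed above.

    Absolute (relative=False): use the driving frame's keypoints
    directly, so the output follows the driving pose exactly
    (the frame-to-frame setting used for the paper's evaluation).

    Relative (relative=True): apply only the motion of the driving
    frame relative to the anchor A on top of the source keypoints,
    which preserves the source geometry but depends on A being
    well aligned with the source image S.
    """
    if not relative:
        return kp_driving
    return kp_source + (kp_driving - kp_anchor)
```

In relative mode, when the anchor equals the driving frame the output keypoints equal the source keypoints, which is why a poorly aligned anchor leaves the generated video stuck near the source pose.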

pfeducode commented 2 years ago

> Thanks for your excellent work; however, when I try to run the demo, the generated video tends to keep the same pose as the source image, while in the paper (Figure 2) the generated results have the driving frame's pose (this is also the case for the results in the README). Why is this happening?

result.mp4

Which dataset did you select the picture from?