harlanhong / CVPR2022-DaGAN

Official code for CVPR2022 paper: Depth-Aware Generative Adversarial Network for Talking Head Video Generation
https://harlanhong.github.io/publications/dagan.html

Question regarding output of DepthAwareAttention #14

Closed mdv3101 closed 2 years ago

mdv3101 commented 2 years ago

In the DepthAwareAttention module, the inputs are the depth_image and the output feature map generated by the occlusion map (line 195).

The depth_image is stored in 'source' while the output feature map is stored in 'feat'.

There is a variable gamma (line 66), which is initialized as a zero tensor: self.gamma = nn.Parameter(torch.zeros(1))

After doing all the operations in the forward pass, you get an output feature map. It is then multiplied by gamma, and feat is added (line 87): out = self.gamma*out + feat

That means everything computed during the forward pass is multiplied by zero and only the original features are returned. That seems to make the entire DepthAwareAttention useless, as the returned attention is also never used elsewhere in the code.

Can you please clarify on this?

harlanhong commented 2 years ago

Hi @mdv3101 ,

The "self.gamma" is a trainable parameter that serves as the input gate for the attention result. It similars to Highway networks. Our network can learn a proper value for gamma after training to control the input of the attention result.

mdv3101 commented 2 years ago

Hi @harlanhong , Thanks for clarifying.