vincent-thevenin / Realistic-Neural-Talking-Head-Models

My implementation of Few-Shot Adversarial Learning of Realistic Neural Talking Head Models (Egor Zakharov et al.).
GNU General Public License v3.0
828 stars 195 forks

about training data samplers #42

Closed maliho0803 closed 4 years ago

maliho0803 commented 4 years ago

As the original paper said: 'The generator G(y_i(t), ê_i; ψ, P) takes the landmark image y_i(t) for the video frame not seen by the embedder and the predicted video embedding ê_i, and outputs a synthesized video frame x̂_i(t).' However, in your dataloader, the image used by the generator also appears in the embedder's input?

vincent-thevenin commented 4 years ago

The dataloader loads all the data for the embedder, generator and discriminator at once. In the training loop each network then receives its proper input, e.g. x_hat = G(g_y, e_hat).
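To illustrate the wiring described above, here is a minimal sketch of a training step in which one batch carries the inputs for all three networks and each gets its own slice. The names (`emb_frames`, `emb_lmarks`, `g_y`, `E`, `G`, `D`, `training_step`) are illustrative assumptions, not the repo's actual identifiers.

```python
def training_step(batch, E, G, D):
    """Sketch: the dataloader returns one batch with everything;
    each network receives only its proper input inside the loop.
    All key and function names here are hypothetical."""
    e_hat = E(batch["emb_frames"], batch["emb_lmarks"])  # embedder -> video embedding
    x_hat = G(batch["g_y"], e_hat)                       # generator -> synthesized frame
    score = D(x_hat, batch["g_y"])                       # discriminator -> realism score
    return x_hat, score

# Toy stand-ins for the three networks, just to show the data flow:
batch = {"emb_frames": [1, 2], "emb_lmarks": [3, 4], "g_y": 5}
E = lambda frames, lmarks: sum(frames) + sum(lmarks)
G = lambda y, e: y + e
D = lambda x_hat, y: x_hat - y
x_hat, score = training_step(batch, E, G, D)
```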

Jarvisss commented 4 years ago

@vincent-thevenin @maliho0803 In the original paper, section 3.2, it says the frame given to the generator is not seen by the embedder:

[image: excerpt from section 3.2 of the paper]

But it seems that, in your dataloader, the input to the embedder is K images and K landmarks, while the landmark input to the generator is randomly picked from those same K samples.
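A sampler matching the paper's description would instead draw K + 1 distinct frames, use K for the embedder, and hold out the last one for the generator. This is a minimal sketch of that idea; `sample_video_frames` and its parameters are hypothetical, not the repo's actual dataloader API.

```python
import random

def sample_video_frames(num_frames, K, seed=None):
    """Sketch: draw K + 1 distinct frame indices from one video,
    give K to the embedder and hold out the last one for the
    generator, so the generator's target is never seen by the
    embedder (section 3.2 of the paper). Hypothetical helper."""
    rng = random.Random(seed)
    idx = rng.sample(range(num_frames), K + 1)  # K + 1 distinct indices
    embedder_idx, generator_idx = idx[:K], idx[K]
    return embedder_idx, generator_idx

emb_idx, gen_idx = sample_video_frames(num_frames=50, K=8, seed=0)
```

With this scheme the held-out index is guaranteed to be disjoint from the embedder's K frames, which is the property the current random pick from the same K samples does not provide.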