grey-eye / talking-heads

Our implementation of "Few-Shot Adversarial Learning of Realistic Neural Talking Head Models" (Egor Zakharov et al.)
GNU General Public License v3.0

VGG_Face weights #51

Open phquanta opened 3 years ago

phquanta commented 3 years ago

I've been investigating the controversies around Caffe <-> Torch weight conversions, and one thing I've noticed is that the original VGG_Face was trained in the MatConvNet framework and consumes plain RGB uint8 images (confirmed here as well: https://github.com/albanie/pytorch-benchmarks/blob/master/lfw_eval.py). I assume the Caffe and Torch versions are exact replicas of those weights. My question is: is your mean and std conversion applicable to VGG_Face?
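To make the distinction concrete, here is a minimal sketch of the two preprocessing conventions in question. The per-channel mean values below are the ones commonly quoted for the MatConvNet VGG-Face release and are an assumption here; verify them against the meta of whichever converted weights you actually load. The function names are hypothetical, not from this repo.

```python
import numpy as np

# Assumed per-channel means (RGB order) often quoted for MatConvNet VGG-Face;
# check these against your converted model's meta before relying on them.
VGG_FACE_MEAN_RGB = np.array([129.1863, 104.7624, 93.5940], dtype=np.float32)

def preprocess_matconvnet_style(img_uint8_rgb):
    """Caffe/MatConvNet-style: keep the 0-255 range and subtract a
    per-channel mean; there is no division by a std."""
    x = img_uint8_rgb.astype(np.float32)   # H x W x 3, values in [0, 255]
    x -= VGG_FACE_MEAN_RGB                 # mean subtraction only
    return x.transpose(2, 0, 1)            # C x H x W for PyTorch

def preprocess_torchvision_style(img_uint8_rgb):
    """torchvision-style: scale to [0, 1], then normalize with the
    ImageNet mean/std -- not what a MatConvNet-trained net expects."""
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    x = img_uint8_rgb.astype(np.float32) / 255.0
    x = (x - mean) / std
    return x.transpose(2, 0, 1)

img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
a = preprocess_matconvnet_style(img)
b = preprocess_torchvision_style(img)
print(a.shape, b.shape)  # both (3, 224, 224), but on very different scales
```

Feeding an input normalized one way into weights trained the other way leaves the activations off by roughly a factor of 255, which is exactly the kind of mismatch a Caffe <-> Torch conversion can silently introduce.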

I was also reading Jarviss's comment (https://github.com/vincent-thevenin/Realistic-Neural-Talking-Head-Models/issues/12) and the forked version of the code, and I noticed that lossG is quite high while lossD drops to zero immediately, which I don't understand at the moment; training still proceeds, but very slowly.