parth-shettiwar opened this issue 2 years ago
Thanks for your interest in our work! What learning rate are you using? It might be too low; please try 1e-3 or 1e-4. The whole dataset, with all emotions included, is already small, so by using only two emotions I would expect worse results. However, you should still be able to see some lip motion. You can try training with only the reconstruction loss to see if the lips move.
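The reconstruction-only debugging run suggested above could look like the sketch below. This is a minimal PyTorch sketch under assumptions: `generator`, `recon_only_step`, and the feature shapes are placeholders, not identifiers from this repo.

```python
import torch
import torch.nn as nn

# Hypothetical debugging setup: update the generator with the reconstruction
# (L1) loss only, with the adversarial terms disabled, to check whether the
# lips start moving at all.
recon_loss = nn.L1Loss()

def recon_only_step(generator, optimizer, audio_features, target_frames):
    """One generator update using the reconstruction loss alone (no GAN loss)."""
    optimizer.zero_grad()
    predicted_frames = generator(audio_features)
    loss = recon_loss(predicted_frames, target_frames)
    loss.backward()
    optimizer.step()
    return loss.item()

# The learning rates suggested above:
# optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)  # or lr=1e-4
```

If the loss decreases but the lips still do not move, the problem is more likely in the data preparation than in the loss weighting.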
Also, please check the dataset before training. You can animate the images with the audio to see if there are any issues with data preparation.
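As one quick automated check alongside the visual inspection above, a clip (or a generated video) whose frames never change can be flagged by looking at per-pixel variation over time. A minimal sketch, assuming the frames are available as a NumPy array of shape (T, H, W, C); the function name and tolerance are my own, not from the repo:

```python
import numpy as np

def frames_are_static(frames, tol=1e-3):
    """Return True when every frame in the clip is (nearly) identical,
    which indicates a frozen output video or a broken data clip.

    frames: array-like of shape (T, H, W, C).
    """
    f = np.asarray(frames, dtype=np.float32)
    # Per-pixel standard deviation over time, averaged over the whole image.
    return float(f.std(axis=0).mean()) < tol
```

Running this over a few prepared clips (and over the generated videos) separates "the model outputs a frozen frame" from "the training data itself is frozen".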
Best, Emre
Hi, we have been trying to run the code. The pretrained model you provided works perfectly fine; however, when we train the model from scratch, the generated output is always the same frame for the whole video, with the audio running in the background and no changes in facial or lip features. What could be the potential reasons for this? Did you notice anything similar during your own training?
The following are the changes we made to the code:
1) We trained the model on only 2 emotions, Happy and Sad; all other emotions were removed when creating the dataset. We also selected only a subset of the data for these 2 emotions (around 500 videos).
2) We pretrained the discriminator and generator for 5 epochs and performed joint training for 7 epochs.
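The emotion filtering in step 1 can be sketched as follows; the field names and the clip-dict layout are assumptions for illustration, not the repo's actual dataset format:

```python
# Hypothetical filtering step: keep only the Happy and Sad clips when
# building the training file list.
KEEP_EMOTIONS = {"happy", "sad"}

def filter_clips(clips):
    """clips: iterable of dicts like {'path': ..., 'emotion': ...}.
    Returns only the clips whose emotion label is in KEEP_EMOTIONS."""
    return [c for c in clips if c["emotion"].lower() in KEEP_EMOTIONS]
```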
Is this due to incorrect dataset preparation, the absence of other emotions (such as the Neutral face), insufficient training (too few epochs), or some other reason?
Thanks in advance