Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose" (Arxiv 2020) and "Predicting Personalized Head Movement From Short Video and Speech Signal" (TMM 2022)
I followed the Colab tutorial and got it running. I am now trying to do the same with a custom video and audio. To fine-tune the audio network, the notebook runs:
!cd Audio/code/; python train_19news_1.py 32 0
When I run this, I get:
32 lack frame0.mat
32 lack frame1.mat
32 lack frame2.mat
...
...
32 lack frame298.mat
32 lack frame299.mat
not all 300 frames are reconstructed successfully
My video is called 32.mp4, and I used ffmpeg to make sure it is 25 fps. Other than that, I haven't modified anything in the notebook. Where is it going wrong?
Also, if anyone has managed to reproduce this on a custom video, please share the notebook.
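Since the log reports every file from frame0.mat to frame299.mat as lacking, it may help to check whether the reconstruction step wrote any of them before fine-tuning was started. Below is a minimal sketch of such a check. It assumes only the `frame<i>.mat` naming and the count of 300 seen in the log; the function name `missing_frames` and the `coeff_dir` argument are hypothetical, and the directory to inspect is not shown in the output above, so it has to be passed in by hand.

```python
import os
import re
import sys


def missing_frames(coeff_dir, n_frames=300):
    """Return the indices i in [0, n_frames) with no frame<i>.mat in coeff_dir.

    coeff_dir is an assumption: pass whichever folder the reconstruction
    step is expected to write its per-frame .mat files to.
    """
    present = set()
    if os.path.isdir(coeff_dir):
        for name in os.listdir(coeff_dir):
            match = re.fullmatch(r"frame(\d+)\.mat", name)
            if match:
                present.add(int(match.group(1)))
    # A directory that does not exist is treated the same as an empty one.
    return [i for i in range(n_frames) if i not in present]


if __name__ == "__main__":
    # Usage: python check_frames.py <directory with frame*.mat files>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_frames(target)
    print(f"{len(missing)} of 300 frames missing in {target}")
```

If this reports all 300 frames missing, the `.mat` files were never produced for video 32, which would point at the earlier reconstruction step rather than at `train_19news_1.py` itself.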