acvictor / Obama-Lip-Sync

An implementation of ObamaNet: Photo-realistic lip-sync from text.

Mouth is very blurry, what am I doing wrong? #2

Closed dhowe closed 5 years ago

dhowe commented 5 years ago

[attached: two screenshots of output frames showing the blurry mouth]

acvictor commented 5 years ago

You're not doing anything wrong. The Pix2Pix model here is trained on only 5,000 image pairs, so its output is somewhat blurry. To get sharper output, I'd suggest training Pix2Pix on many more images, though that takes a huge amount of time and compute.

For the second frame, his head is tilted quite far to one side, while most of the training image pairs have a more or less upright head; I'd attribute the very noticeable edge blur to that.

dhowe commented 5 years ago

Thanks much. Can you tell me the command you used to create the Pix2Pix model, so that I can train a larger one?

Also, what is required to use a different audio input file? I passed a new audio file into run.py and ffmpeg, but the output video is now much longer than the new (shorter) sound clip.

acvictor commented 5 years ago

Check out the README of affinelayer's pix2pix-tensorflow repo for the training command.
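
For reference, the training invocation in that README looks roughly like the following; the directory names are placeholders and the exact flags may differ between versions of affinelayer/pix2pix-tensorflow:

```bash
# Rough example of the pix2pix-tensorflow training command.
# --input_dir should point at the combined A/B image pairs; the paths here are
# placeholders, and --which_direction depends on which side of each pair holds
# the keypoint sketches.
python pix2pix.py \
  --mode train \
  --output_dir pix2pix_model \
  --max_epochs 200 \
  --input_dir combined_images/train \
  --which_direction AtoB
```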

A mismatch between the video frame rate and the audio sampling rate may be causing this. Try tweaking the video frame samples at run time.
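
As a rough sketch of what tweaking the frame sampling could look like (the 30 fps value and the wav path below are assumptions, not values taken from run.py), you can derive the number of video frames from the new clip's duration so the generated video and the audio line up when muxed:

```python
# Hypothetical sketch: choose how many keypoint/video frames to generate from
# the new audio clip's duration, so the video is not longer than the audio.
# FPS and "new_audio.wav" are assumptions, not values read from run.py.
import wave

FPS = 30  # target video frame rate

with wave.open("new_audio.wav", "rb") as w:
    audio_duration = w.getnframes() / float(w.getframerate())  # seconds

num_video_frames = int(round(audio_duration * FPS))
# Generate or keep only num_video_frames frames before feeding them to Pix2Pix
# and muxing with ffmpeg.
print(num_video_frames)
```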

ak9250 commented 5 years ago

@dhowe were you able to fix this problem?