Rudrabha / LipGAN

This repository contains the codes for LipGAN. LipGAN was published as a part of the paper titled "Towards Automatic Face-to-Face Translation".
http://cvit.iiit.ac.in/research/projects/cvit-projects/facetoface-translation
MIT License

Sample Dataset for usage #2 #20

Closed: shikhar-scs closed this issue 4 years ago

shikhar-scs commented 4 years ago

Hi @Rudrabha, amazing work. I was working on generating video from an image plus audio, and it would be very helpful if you could post a sample image and audio file. I've been getting a different error every time I use a random image.

Thanks!

prajwalkr commented 4 years ago

Thank you very much for your interest.

Could you please share some of the errors? If it is a mistake in the code, then we can fix it for everyone.

shikhar-scs commented 4 years ago

Hey, I set it up locally (on a Mac) and it worked fine.

On the remote machine (GPUs), I got the following TensorFlow error:

[image: TensorFlow error output]

prajwalkr commented 4 years ago

The input size must be 96x96. Please ensure this.

If this doesn't solve it, please report the TensorFlow and Keras versions you have.
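For anyone who needs to collect those versions, here is a minimal sketch. `report_versions` is a name made up for this example, and the distribution names are assumptions: a GPU install may be registered as `tensorflow-gpu` rather than `tensorflow`. It relies on `importlib.metadata`, so it needs Python 3.8 or newer; on older interpreters, printing `tensorflow.__version__` and `keras.__version__` gives the same information.

```python
from importlib import metadata
import platform


def report_versions(packages=("tensorflow", "keras")):
    """Return one line per package with its installed version, if any."""
    lines = ["python " + platform.python_version()]
    for name in packages:
        try:
            lines.append(name + " " + metadata.version(name))
        except metadata.PackageNotFoundError:
            # The distribution may be installed under another name,
            # e.g. "tensorflow-gpu" (assumption, check your environment).
            lines.append(name + " not installed")
    return lines


if __name__ == "__main__":
    print("\n".join(report_versions()))
```

Pasting the printed lines into the issue is usually enough for a maintainer to spot a version mismatch.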

prajwalkr commented 4 years ago

I have also updated the repo code so that img_size is no longer accepted as a variable input parameter.
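A quick way to catch a wrong input size before it reaches the model is to check the array shape up front. This is only a sketch: `is_valid_face_input` and `IMG_SIZE` are hypothetical names, not part of the repository, and it assumes the input is laid out as (height, width) or (height, width, channels), as in a NumPy array's `.shape`.

```python
IMG_SIZE = 96  # input size stated in this thread


def is_valid_face_input(shape, img_size=IMG_SIZE):
    """Check that the first two dimensions are img_size x img_size.

    `shape` is a tuple such as a NumPy array's .shape, laid out as
    (height, width) or (height, width, channels) (assumed layout).
    """
    return len(shape) >= 2 and shape[0] == img_size and shape[1] == img_size


if __name__ == "__main__":
    print(is_valid_face_input((96, 96, 3)))
    print(is_valid_face_input((128, 128, 3)))
```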

shikhar-scs commented 4 years ago

Oh okay, thanks for the help!

shikhar-scs commented 4 years ago

Hey @prajwalkr, another thing.

Could you please specify the dataset structure needed for training the model? It would be helpful for a lot of people. Something like the way it's described here would be ideal: https://github.com/Hangz-nju-cuhk/Talking-Face-Generation-DAVS#preparing-training-data

prajwalkr commented 4 years ago

Done.

shikhar-scs commented 4 years ago

Hi @Rudrabha, I finally completed the end-to-end setup and, more importantly, everything is working now. There are just a few minor bugs in preprocess.py. I'm pointing them out so that future users don't have to spend extra time on them.

The split on lines 116 and 118 should probably be args.split: https://github.com/Rudrabha/LipGAN/blob/16ca935aa0689723fbf4bae9d85ce061b925c2cc/preprocess.py#L116-L118
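To illustrate the kind of fix meant here, this is a sketch with a made-up parser, not the actual preprocess.py code. The `--split` flag, its default, and the `output_subdir` helper are all assumptions for the example. The point is that a value parsed by argparse lives on the namespace object, so it has to be read as `args.split`.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical preprocessing options")
    # Hypothetical flag; the real script's flag name and default may differ.
    parser.add_argument("--split", default="train", help="dataset split to process")
    return parser


def output_subdir(args):
    # Read the value from the parsed namespace. A bare `split` here would
    # raise NameError unless an unrelated variable of that name existed.
    return "preprocessed_" + args.split


if __name__ == "__main__":
    args = build_parser().parse_args(["--split", "val"])
    print(output_subdir(args))
```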

Also, sr (the sampling rate) is undefined here. Is there a specific value that should be used? For now I've set it to None to preserve the file's native sampling rate. https://github.com/Rudrabha/LipGAN/blob/16ca935aa0689723fbf4bae9d85ce061b925c2cc/preprocess.py#L87

Will let you know if I come across any others. Thanks again for the work!

prajwalkr commented 4 years ago

Please use sr=16000. I will correct these two issues in preprocess.py right away. Thank you very much.
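If the audio is loaded through librosa, `librosa.load(path, sr=16000)` resamples to 16 kHz on load, whereas `sr=None` keeps whatever rate the file was recorded at. To check what rate a WAV file actually has, a standard-library sketch like the one below may help. `wav_sample_rate` and `has_expected_rate` are hypothetical helper names, and the code assumes an uncompressed PCM WAV file.

```python
import wave

EXPECTED_SR = 16000  # value given in this thread


def wav_sample_rate(path):
    """Return the sample rate stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def has_expected_rate(path, expected=EXPECTED_SR):
    """True if the file's native rate already matches the expected one."""
    return wav_sample_rate(path) == expected
```

A file that fails this check is not necessarily unusable; it just means the audio has to be resampled at load time rather than read at its native rate.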