Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For an HD commercial model, please try out Sync Labs
https://synclabs.so

How to Train new datasets #159

Closed gilbertk422 closed 3 years ago

gilbertk422 commented 3 years ago

I'd like to know if there is detailed documentation for training a new model from scratch.

While trying the GAN model from the repo, I haven't gotten perfect results yet, so I adjusted some configuration (pads, etc.). But there are still a few odd movements around the chin, and it doesn't work well when the person has a beard.

So I'd like to train a new model for research in this field. It would also be nice if you provided documentation on adapting an existing model of mine for use in Wav2Lip!

Thanks!

prajwalkr commented 3 years ago

Which dataset do you plan to train on? The documentation is available in the README for training on LRS2.

AlonDan commented 3 years ago

Unfortunately, even with documentation, it isn't always clear how to proceed.

It would be nice to have a tutorial (or step-by-step video tutorial) from somebody who has already figured out how to train a custom dataset, since the process isn't clear to everyone, especially non-programmers who like to experiment and share their knowledge with this wonderful community.

If anyone has already figured it out, sharing a YouTube video showing how it's done, explaining what we need and what to type in Anaconda on Windows, would be very helpful. More of us would then be able to train our own datasets and contribute pre-trained models, giving better results, higher quality, and, of course, an understanding of exactly how it's done for non-technical people.

Thanks in advance to anyone who helps with this. Keep up the good work, everyone!

Shriram-Coder99 commented 3 years ago

Which dataset do you plan to train on? The documentation is available in the README for training on LRS2.

How about AVSpeech?

prajwalkr commented 3 years ago

How about AVSpeech?

Create a filelist similar to LRS2, and once you organize your video/folder structure in that manner, it should be possible.

houdajun commented 3 years ago

I have a similar request: we have used the model for inference, and the results are pretty good already. However, we would like to do transfer learning from the trained model to a specific individual. Is it possible to do that?

roopeshn28 commented 3 years ago

It would be helpful if the folder structure of LRS2 were shared so we can replicate it. Since I'm trying to use LRS3, I'm finding it a bit tricky to get training started.

prajwalkr commented 3 years ago

It would be helpful if the folder structure of LRS2 were shared so we can replicate it.

Already there in the README: https://github.com/Rudrabha/Wav2Lip#lrs2-dataset-folder-structure
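For anyone finding this thread later, the structure documented at that link looks roughly like this (paraphrased from the README; the exact wording there is authoritative):

```
data_root (mvlrs_v1)
├── main, pretrain (only main is used in this work)
│   ├── list of folders
│   │   ├── five-digit numbered video IDs ending with .mp4

preprocessed_root (lrs2_preprocessed)
├── list of folders
│   ├── folders with five-digit numbered video IDs
│   │   ├── *.jpg (one image per frame)
│   │   ├── audio.wav
```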