TMElyralab / MuseTalk

MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting

initial data script #85

Open shounakb1 opened 1 month ago

shounakb1 commented 1 month ago

I have written the data creation script, though it is a rough implementation. It runs in two passes.

First, update test.yaml with the training video and its corresponding audio, then run:

python -m scripts.data --inference_config configs/inference/test.yaml --folder_name train

Next, update test.yaml with the test video and audio, and run again:

python -m scripts.data --inference_config configs/inference/test.yaml --folder_name test

This creates folders which contain the image frames and npy files.
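
The two passes above could be driven from one small wrapper. This is only a sketch: the `--inference_config` and `--folder_name` flags come from the commands above, while `build_data_command`, `run_passes`, and the idea of keeping a separate `train.yaml` next to `test.yaml` are hypothetical names used for illustration. The script as described reads a single test.yaml that is edited between runs.

```python
import subprocess
import sys


def build_data_command(config_path, folder_name):
    """Return the argv list for one pass of the data script (hypothetical helper)."""
    return [
        sys.executable, "-m", "scripts.data",
        "--inference_config", config_path,
        "--folder_name", folder_name,
    ]


def run_passes(passes, runner=subprocess.run):
    """Run one data-creation pass per (config_path, folder_name) pair."""
    for config_path, folder_name in passes:
        # check=True makes subprocess.run raise if a pass exits with an error,
        # so a failed train pass is not silently followed by the test pass.
        runner(build_data_command(config_path, folder_name), check=True)


# Assumed usage, with one config per split (train.yaml is a hypothetical file):
# run_passes([
#     ("configs/inference/train.yaml", "train"),
#     ("configs/inference/test.yaml", "test"),
# ])
```

Keeping one config per split would avoid editing test.yaml by hand between the two runs, but that is a design suggestion rather than how the script currently works.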

I also need to modify Dataloader.py in train_codes to make training work. I can make the necessary changes before you merge, @czk32611 @itechmusic.

https://github.com/TMElyralab/MuseTalk/assets/32771603/e3bba83f-93c3-4908-9213-9da596c39c49

I'm also attaching a sample to show the trained results. I can't upload more than 10 MB, so it's small.

Just guide me on what changes to make and I'll do it. Thanks for this, guys.

czk32611 commented 1 month ago

Can you also show how we should modify Dataloader.py?

shounakb1 commented 1 month ago

Hi @czk32611, @itechmusic, I have committed the changes needed to make training and inference (with the finetuned model) work. Please check the README.md in train_codes, and let me know if anything needs to be changed.

shounakb1 commented 1 month ago

@czk32611, @itechmusic, there is currently no way to compare the validation loss with the training loss in the train.py script. Do you think we should add that? I can add it if needed.
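
For illustration, the comparison could be as simple as averaging the same loss over a held-out set after each epoch. The sketch below is framework-agnostic and is not taken from train.py: `mean_loss`, `loss_gap`, and `loss_fn` are hypothetical names. In a PyTorch loop the validation pass would normally run under `torch.no_grad()` with the model in eval mode.

```python
def mean_loss(loss_fn, batches):
    """Average a per-batch loss over an iterable of batches."""
    total, count = 0.0, 0
    for batch in batches:
        total += float(loss_fn(batch))
        count += 1
    if count == 0:
        raise ValueError("no batches to evaluate")
    return total / count


def loss_gap(train_loss, val_loss):
    """Positive values mean the model does worse on held-out data."""
    return val_loss - train_loss


# Assumed usage at the end of an epoch (all names here are placeholders):
# train_loss = mean_loss(compute_loss, train_batches)
# val_loss = mean_loss(compute_loss, val_batches)
# print(f"train={train_loss:.4f} val={val_loss:.4f} gap={loss_gap(train_loss, val_loss):.4f}")
```

A gap that keeps growing while the training loss falls is the usual sign of overfitting, which matters with small datasets.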

paulovasconcellos-hotmart commented 3 weeks ago

Can I ask what data you trained your model on?

shounakb1 commented 3 weeks ago

The training data was YouTube videos.

paulovasconcellos-hotmart commented 3 weeks ago

Thanks for the quick reply! How many hours did your dataset have?

shounakb1 commented 3 weeks ago

Just 15 minutes for this one. I trained on 1 hour of data for better results.

paulovasconcellos-hotmart commented 3 weeks ago

@shounakb1 I'm trying to use your code to train on multiple videos, but it doesn't work when there are multiple files. The problem is in the part where the code picks a reference image: if the folder contains more than one video, it might pick a reference frame from a different video.

czk32611 commented 3 weeks ago

@shounakb1 paulovasconcellos-hotmart is right, the data from each video should be saved to its own folder. Could you modify this part?
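
A minimal sketch of that idea, assuming a layout of one sub-folder per video with frames stored as .png files (the layout, the file extension, and the names `frames_by_video` and `pick_reference` are assumptions for illustration, not the actual Dataloader.py code). The reference frame is only ever drawn from the same sub-folder as the target frame:

```python
import random
from pathlib import Path


def frames_by_video(root):
    """Map each video sub-folder name to its sorted list of frame files."""
    groups = {}
    for video_dir in sorted(Path(root).iterdir()):
        if video_dir.is_dir():
            frames = sorted(video_dir.glob("*.png"))
            if frames:
                groups[video_dir.name] = frames
    return groups


def pick_reference(target, groups, rng=random):
    """Pick a reference frame from the same video folder as `target`."""
    siblings = [f for f in groups[target.parent.name] if f != target]
    # Fall back to the target itself if the video has only one frame.
    return rng.choice(siblings) if siblings else target
```

Because the lookup is keyed on the parent folder name, adding more videos to the dataset cannot change which frames are eligible as references for a given target.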

shounakb1 commented 3 weeks ago

Yes, thanks for pointing it out, guys. I'll fix it. Sorry for the delay, my work schedule is a bit hectic right now.

shounakb1 commented 2 weeks ago

@czk32611 I've fixed the multiple-video data preparation problem. Please let me know if anything else needs to be changed.