Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
https://synclabs.so

Confusion about the training lip-sync Expert Discriminator part of the code #357

Open xiao-keeplearning opened 2 years ago

xiao-keeplearning commented 2 years ago

Thanks for your great work. I'm a little confused about the code in the lip-sync Expert Discriminator training section of your project. https://github.com/Rudrabha/Wav2Lip/blob/b9759a3467cb1b7519f1a3b91f5a84cb4bc1ae4a/color_syncnet_train.py#L69-L71

Why is a random number assigned to the variable idx here? It seems counterintuitive.

Mayur28 commented 2 years ago

Hi,

This is just to ensure that we pick a random video from the training set instead of processing videos sequentially in the order they occur in the dataset. You may also be wondering why we select a random index here when we already set shuffle = True on the dataloader. The reason lies in the fact that the line you are referring to is contained in a while loop. If the randomly picked video has some issue (such as containing fewer than 15 frames), or an error occurs when loading the image or audio window, we loop again, picking another random video until we no longer encounter any issues. Hope that makes sense.
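To illustrate the pattern being described, here is a minimal sketch (not the actual Wav2Lip code) of a dataset whose `__getitem__` ignores the incoming index and retries random videos until one yields a valid sample. The names `RandomRetryDataset`, `MIN_FRAMES`, and the plain-list video representation are hypothetical stand-ins for the real loading logic:

```python
import random

# Hypothetical sketch of the retry-with-random-index pattern discussed above.
# MIN_FRAMES mirrors the "fewer than 15 frames" check mentioned in the reply.
MIN_FRAMES = 15

class RandomRetryDataset:
    def __init__(self, videos):
        # videos: a list where each entry is a per-video list of frames,
        # or None to simulate a video that fails to load.
        self.videos = videos

    def __getitem__(self, _index):
        while True:
            # Pick a random video rather than using the requested index.
            idx = random.randint(0, len(self.videos) - 1)
            frames = self.videos[idx]
            if frames is None or len(frames) < MIN_FRAMES:
                continue  # bad video: loop again with a new random pick
            return frames  # valid sample found

    def __len__(self):
        return len(self.videos)
```

Because the loop only exits on a valid sample, every batch the dataloader assembles is usable, at the cost of the returned item being unrelated to the index the dataloader asked for.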