Rudrabha / Wav2Lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs
https://synclabs.so

Confusion about the training lip-sync Expert Discriminator part of the code #357

Open xiao-keeplearning opened 2 years ago

xiao-keeplearning commented 2 years ago

Thanks for your great work. I'm a little confused about the code in the lip-sync Expert Discriminator training section of your project. https://github.com/Rudrabha/Wav2Lip/blob/b9759a3467cb1b7519f1a3b91f5a84cb4bc1ae4a/color_syncnet_train.py#L69-L71

Why is a random number assigned to the variable idx here? It seems counterintuitive.

Mayur28 commented 2 years ago

Hi,

This is just to ensure that we pick a random video from the training set instead of processing videos sequentially in the order they occur in the dataset. You may also be wondering why we select a random index here when we already set shuffle = True on the dataloader. The reason lies in the fact that the line you are referring to is contained in a while loop. If the randomly picked video has some issue (such as containing fewer than 15 frames), or an error occurs when loading the image or audio window, we loop again, picking another random video until we no longer encounter any issues. Hope that makes sense.
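To illustrate the pattern being described, here is a minimal sketch (not the actual Wav2Lip code) of a dataset whose `__getitem__` ignores the incoming index and retries random videos until one yields a valid sample. The names `RandomRetryDataset`, `MIN_FRAMES`, and the plain-list video representation are hypothetical stand-ins for the real loading logic:

```python
import random

# Hypothetical sketch of the retry-with-random-index pattern discussed above.
# MIN_FRAMES mirrors the "fewer than 15 frames" check mentioned in the reply.
MIN_FRAMES = 15

class RandomRetryDataset:
    def __init__(self, videos):
        # videos: a list where each entry is a per-video list of frames,
        # or None to simulate a video that fails to load.
        self.videos = videos

    def __getitem__(self, _index):
        while True:
            # Pick a random video rather than using the requested index.
            idx = random.randint(0, len(self.videos) - 1)
            frames = self.videos[idx]
            if frames is None or len(frames) < MIN_FRAMES:
                continue  # bad video: loop again with a new random pick
            return frames  # valid sample found

    def __len__(self):
        return len(self.videos)
```

Because the loop only exits on a valid sample, every batch the dataloader assembles is usable, at the cost of the returned item being unrelated to the index the dataloader asked for.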