Hi, the paper says, "We randomly selected 12,000 audio clips for training, 100 for validation, and 500 for testing." Could you provide the filename lists for these three sets?
Hello @BakerBunker. I split the dataset following the method used in VITS, and I apologize for the misstatement in the preprint version. Thanks for pointing it out. :P