Audio format in dataset files

Plachtaa / FAcodec

Training code for FAcodec presented in NaturalSpeech3

Audio format in dataset files #17

Open r666ay opened 3 months ago

r666ay commented 3 months ago

Thanks for your great work on implementing FACodec! I found that the data file at https://github.com/Plachtaa/FAcodec/blob/master/data/val.txt contains some labels, such as speaker ID and phonemes. How can I get these labels? Will they be auto-generated during training?

Plachtaa commented 3 months ago

It came from the VCTK dataset and was used by the legacy implementation. For the current version in this repo, annotation is not required. Auto-generated labels are not saved during training.
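
For readers who still want a speaker label for VCTK audio, here is a minimal sketch of recovering it from the file name. It is not code from this repo and does not reproduce the layout of data/val.txt. It assumes the usual VCTK naming such as `p225_001.wav`, where the part before the first underscore is the speaker ID. The helper name `speaker_id_from_path` is hypothetical. Phoneme labels would need a separate grapheme-to-phoneme tool, which is not shown here.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: VCTK-style names like "p225_001.wav" or "p225_001_mic1.flac",
# i.e. "<speaker>_<utterance>[...]" with speaker IDs such as "p225" or "s5".
_VCTK_NAME = re.compile(r"^(?P<speaker>[ps]\d+)_(?P<utt>\d+)")


def speaker_id_from_path(path: str) -> Optional[str]:
    """Return the speaker ID parsed from a VCTK-style file name, or None."""
    match = _VCTK_NAME.match(Path(path).stem)
    return match.group("speaker") if match else None
```

Files that do not follow the assumed pattern return `None`, so they can be filtered out or handled separately rather than silently mislabeled.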

r666ay commented 3 months ago

Thanks for your reply. What models are used to generate these annotations? I want to export the auto-generated labels.