lifeiteng / vall-e

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech). Reproduced demo: https://lifeiteng.github.io/valle/index.html
Apache License 2.0

How to train a mixed dataset? #150

Closed. tom1997 closed this issue 1 year ago

tom1997 commented 1 year ago

I have manifests from several different datasets, and I join them with tokenizer.py. Does unique_text_tokens.k2symbols affect training? I ask because the symbol indices are not the same across the datasets.

lifeiteng commented 1 year ago

All you need to do is merge the symbols before training.
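One way to do the merge is sketched below. This is a minimal illustration, not the repository's own tooling. It assumes each unique_text_tokens.k2symbols file is plain text with one `symbol index` pair per line and that `<eps>` is reserved at index 0, as in k2-style symbol tables. The helper names `read_symbols`, `merge_symbols`, and `write_symbols`, and the paths in the usage comment, are hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable, Set

# Assumption: "<eps>" is the reserved symbol at index 0.
EPS = "<eps>"


def read_symbols(path: str) -> Set[str]:
    """Collect the symbols from one file, ignoring their old indices."""
    symbols = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.rsplit(None, 1)  # split off the trailing index
        if len(parts) != 2:
            raise ValueError(f"unexpected line in {path}: {line!r}")
        symbols.add(parts[0])
    return symbols


def merge_symbols(paths: Iterable[str]) -> Dict[str, int]:
    """Union the symbols of all files and assign fresh, consistent indices."""
    merged: Set[str] = set()
    for path in paths:
        merged |= read_symbols(path)
    merged.discard(EPS)
    table = {EPS: 0}
    # Sorting makes the new indices deterministic across runs.
    for index, symbol in enumerate(sorted(merged), start=1):
        table[symbol] = index
    return table


def write_symbols(table: Dict[str, int], path: str) -> None:
    """Write the merged table back out, one `symbol index` pair per line."""
    rows = sorted(table.items(), key=lambda item: item[1])
    text = "\n".join(f"{symbol} {index}" for symbol, index in rows) + "\n"
    Path(path).write_text(text, encoding="utf-8")


# Example usage (hypothetical paths):
# table = merge_symbols(["data_a/unique_text_tokens.k2symbols",
#                        "data_b/unique_text_tokens.k2symbols"])
# write_symbols(table, "data_mixed/unique_text_tokens.k2symbols")
```

The old per-dataset indices are discarded on purpose, so every dataset is mapped through the same table. If your manifests store token strings and the indices are looked up from the symbol file at training time, pointing training at the merged file should be enough. If indices were already written into the data, that data would need to be regenerated with the merged table.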