The released model is trained using character?

jaywalnut310 / vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

MIT License

6.72k stars 1.23k forks source link

Closed snsun closed 1 year ago

snsun commented 1 year ago

Hi, The paper said the model is trained using IPA sequences. But the released config (and model) seems to use character sequences as input?

snsun commented 1 year ago

Sorry, my mistake. It's IPA model.