What is the role of "tone" in the return value of "text_to_sequence"?

anonymous-pits / pits

PITS: Variational Pitch Inference for End-to-end Pitch-controllable TTS without External Pitch Predictor

https://anonymous-pits.github.io/pits/

MIT License


What is the role of "tone" in the return value of "text_to_sequence"? #9

Closed by isletennos 1 year ago

isletennos commented 1 year ago

Hello. In this implementation, the "tone" returned by "text_to_sequence" is embedded and added inside "TextEncoder". The VITS implementation does not do this. What is the intention behind it? Thank you in advance.

anonymous-pits commented 1 year ago

Hi isletennos! It comes from our group's VITS implementation for tonal languages, such as Mandarin Chinese.

Unfortunately, we are unable to release the other language options, for several reasons.

Thank you for your interest!
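For readers wondering what "embedded and added" can look like in practice, here is a minimal sketch in plain Python. It is not the repository's code: the table sizes, the names `symbol_table`, `tone_table`, and `encode_inputs`, and the convention that tone ID 0 means "no tone" are all assumptions made for illustration. A real model would use learned embeddings (for example `torch.nn.Embedding`) rather than random lists.

```python
import random

HIDDEN = 4       # toy embedding width (assumption)
N_SYMBOLS = 10   # toy symbol vocabulary size (assumption)
N_TONES = 6      # assumed: 0 = "no tone", 1-5 = Mandarin tones incl. neutral

rng = random.Random(0)  # fixed seed so the toy tables are reproducible


def make_table(rows, cols):
    """Stand-in for a learned embedding table (random values here)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


symbol_table = make_table(N_SYMBOLS, HIDDEN)
tone_table = make_table(N_TONES, HIDDEN)


def encode_inputs(symbol_ids, tone_ids):
    """Look up both embeddings per position and add them element-wise."""
    if len(symbol_ids) != len(tone_ids):
        raise ValueError("symbol_ids and tone_ids must have the same length")
    return [
        [s + t for s, t in zip(symbol_table[sid], tone_table[tid])]
        for sid, tid in zip(symbol_ids, tone_ids)
    ]


# Hypothetical front-end output: one tone ID aligned with each symbol ID.
symbol_ids = [3, 7, 3]
tone_ids = [1, 4, 2]
encoded = encode_inputs(symbol_ids, tone_ids)
```

In this sketch positions 0 and 2 share the same symbol ID but carry different tone IDs, so their summed vectors differ. That is the general reason a tone input matters for tonal languages: the same symbol sequence can correspond to different words depending on tone.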

isletennos commented 1 year ago

Okay, I understand. Thank you.