Open hdmjdp opened 2 years ago
Chinese is composed of syllables, each of which can be modeled with two phonemes, generally called the "initial" (shengmu, `sheng`) and the "final" (yunmu, `yun`). Roughly speaking, initials can be understood as consonants and finals as vowels. Based on this, I compute duration statistics for the two types of phonemes separately.
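To make the idea concrete, here is a minimal sketch of per-type duration statistics. It is not the code from `unetts.py`; the `INITIALS` set, the `duration_stats` helper, and the pinyin-style phoneme labels are all assumptions made for illustration, and the repo's actual phoneme inventory and statistics may differ.

```python
from statistics import mean, pstdev

# Assumed inventory of Mandarin initials (shengmu) in pinyin notation.
# Anything not in this set is treated as a final (yunmu).
INITIALS = {
    "b", "p", "m", "f", "d", "t", "n", "l", "g", "k", "h",
    "j", "q", "x", "zh", "ch", "sh", "r", "z", "c", "s",
}

def duration_stats(phonemes, durations):
    """Hypothetical helper: (mean, std) of durations for each phoneme type.

    phonemes:  list of phoneme labels, e.g. ["n", "i", "h", "ao"]
    durations: list of durations (e.g. in frames), same length as phonemes
    """
    groups = {"sheng": [], "yun": []}
    for ph, dur in zip(phonemes, durations):
        key = "sheng" if ph in INITIALS else "yun"
        groups[key].append(dur)
    # Population standard deviation; empty groups fall back to (0.0, 0.0).
    return {
        key: (mean(vals), pstdev(vals)) if vals else (0.0, 0.0)
        for key, vals in groups.items()
    }

stats = duration_stats(["n", "i", "h", "ao"], [3, 7, 4, 10])
print(stats)
```

With the example input, the `sheng` group collects the durations of `n` and `h`, and the `yun` group collects those of `i` and `ao`.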
https://github.com/CMsmartvoice/One-Shot-Voice-Cloning/blob/091bfefaa34427abad0653621d412a1886a71a58/TensorFlowTTS/tensorflow_tts/models/unetts.py#L69
What do `sheng` and `yun` mean here?