stefantaubert / tacotron

Command-line interface to train Tacotron 2 using .wav <=> .TextGrid pairs.
MIT License

Nice work! I have a question about duration markers. #4

Open MaxMax2016 opened 9 months ago

MaxMax2016 commented 9 months ago

For training, we can get the duration markers as follows:

Phoneme duration marker:

˘ -> [0, 20) percentile
ˑ -> [80, 90) percentile
ː -> [90, inf) percentile
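
To make the mapping above concrete, here is a minimal Python sketch of how a measured phoneme duration could be turned into one of these markers. It is not taken from the repository: `percentile_rank` and `duration_marker` are hypothetical helper names, the rank is assumed to be the share of observed durations strictly below the value, and durations in the [20, 80) range are assumed to get no marker.

```python
from bisect import bisect_left


def percentile_rank(sorted_durations, value):
    """Percentage of observed durations strictly below `value` (assumed definition)."""
    return 100.0 * bisect_left(sorted_durations, value) / len(sorted_durations)


def duration_marker(rank):
    """Map a percentile rank to a marker, using the ranges listed above."""
    if rank < 20:
        return "˘"  # [0, 20) percentile
    if rank >= 90:
        return "ː"  # [90, inf) percentile
    if rank >= 80:
        return "ˑ"  # [80, 90) percentile
    return ""  # assumption: no marker for the [20, 80) range


# Made-up durations in seconds for a single phoneme, sorted ascending.
durations = sorted([0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.12, 0.20])
marked = [(d, duration_marker(percentile_rank(durations, d))) for d in durations]
```

This only works where durations are known, i.e. for aligned training data. How the project itself groups durations (per phoneme, per speaker, or otherwise) is not stated here.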

But at inference time, how do we get the duration markers? Or are they only used during training?

stefantaubert commented 9 months ago

Depending on the model, you can use the associated pronunciation dictionary to get the transcriptions for inference (duration markers are supported in both training and inference), e.g.: