Open anh opened 3 years ago
Hi,
the whitespace collapse is intentional, mostly so that the forward model can control where the pauses are allocated. You can disable it by removing it from line 91 in data/text/tokenizer.py (return the line above instead). I would discourage that, though, unless you're running into problems.
For the numbers issue, you can add the missing phonemes (for instance 1, 2, 3, 4, 5, 6, 7, 8, 9, 0) to all_phonemes in data/text/symbols.py, like so:
all_phonemes = sorted(list(_phonemes) + list(_punctuations) + list('1234567890'))
I was not aware that some languages had numbers as phonemes.
TODO: Add optional extra phonemes string to data_config.yaml
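A minimal sketch of the suggestion above, for illustration only. The contents of `_phonemes` and `_punctuations` here are small stand-ins, not the real sets from data/text/symbols.py, and `_extra_symbols` and `symbol_to_id` are hypothetical names that do not come from the repository.

```python
# Stand-in values: the real sets live in data/text/symbols.py and are larger.
_phonemes = "aɜŋtzɕuːˈ"
_punctuations = "!,-.:;? "

# Hypothetical name for the extra symbols; espeak-ng emits digits for some languages.
_extra_symbols = "1234567890"

# Same construction as in the reply above, with set() added to drop duplicates
# in case an extra symbol is already part of the base sets.
all_phonemes = sorted(set(list(_phonemes) + list(_punctuations) + list(_extra_symbols)))

# Hypothetical lookup table: every symbol needs an id for the model to see it.
symbol_to_id = {symbol: index for index, symbol in enumerate(all_phonemes)}

print(all(digit in symbol_to_id for digit in _extra_symbols))
```

Note that changing the symbol set changes the size of the vocabulary, so a model trained with the old set would presumably need retraining.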
Thank you for the clarification. Making the phonemes configurable would be super helpful. I'll try your suggestion.
When using phonemizer (espeak-ng), the output contains digits that reflect the vowel/sound variants, like the following:
output:
Output with tokenizer._postprocess:
Outputs placed together:
My question: will the missing digits (here 7 and 1) and the missing spaces around punctuation such as the comma, as in
zˈaː,tɕˈuɜŋ tˈaː
instead of
zˈaː7 , tɕˈuɜŋ t̪ˈaː1
affect the alignment and the pauses between generated words?
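To make the observed difference concrete, here is a rough sketch of how such output could arise. It is not the repository's actual tokenizer._postprocess: the function `postprocess_sketch` and its two steps (dropping symbols outside the known set, then removing spaces around punctuation) are assumptions chosen only to reproduce the strings quoted in this thread.

```python
import re

def postprocess_sketch(text, known_symbols):
    """Hypothetical post-processing, NOT the repository's implementation."""
    # Step 1 (assumed): silently drop every character outside the symbol set.
    kept = "".join(ch for ch in text if ch in known_symbols)
    # Step 2 (assumed): remove whitespace around punctuation marks.
    kept = re.sub(r"\s*([,.;:!?])\s*", r"\1", kept)
    # Step 3 (assumed): collapse any remaining runs of whitespace.
    return re.sub(r"\s+", " ", kept).strip()

raw = "zˈaː7 , tɕˈuɜŋ t̪ˈaː1"

# Symbol set without the digits and without the combining mark under the "t".
base_symbols = set(raw) - set("71") - {"\u032a"}
without_digits = postprocess_sketch(raw, base_symbols)

# Same symbol set extended with the digits, as suggested earlier in the thread.
with_digits = postprocess_sketch(raw, base_symbols | set("1234567890"))

print(without_digits)
print(with_digits)
```

Under these assumptions, extending the symbol set brings the digits back, while the spaces around the comma stay collapsed because that step is independent of the symbol set.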