MycroftAI / mimic2

Text to Speech engine based on the Tacotron architecture, initially implemented by Keith Ito.
Apache License 2.0

Pronunciation might depend on word order #39

Closed: ChanceNCounter closed this issue 5 years ago

ChanceNCounter commented 5 years ago

I've been experimenting with the American male voice, and I've noticed a couple of instances where word order affects pronunciation.

First of all, it can pronounce the word "code," but when I asked it to say, "arbitrary code," it talked about arbitrary cod.

Also, "Oregon." I tried to nail down the correct pronunciation, and found that:

"Ora-ggn" sounds right, and "Ora-ggn State" sounds best, but "State of Ora-ggn" catches on the second syllable.

However, "Ora-gnn" sounds okay, "Ora-gnn State" sounds all wrong, and "State of Ora-gnn" sounds just right!

el-tocino commented 5 years ago

The models aren't built on individual words (for the most part), so they look at the structure of the whole sentence to generate the results. What you're seeing is how that surrounding context can change the pronunciation of the same word.

On local instances, it may help to add a comma or period at certain spots in the request, or even to alter the spelling of the words you want it to say.
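
As a rough illustration of the respelling workaround, here is a minimal Python sketch that rewrites the request text before it is sent to a local instance. Nothing here is part of mimic2 itself: the `RESPELLINGS` table and the `respell` helper are hypothetical names, and the entries are just the spellings reported earlier in this thread for the American male voice. Other voices or model checkpoints would likely need different entries.

```python
import re

# Hypothetical respelling table (an assumption, not a mimic2 feature).
# Keys are phrases as normally written; values are the spellings that
# reportedly sounded best in this thread. Phrase-level keys allow a
# different respelling depending on the surrounding words.
RESPELLINGS = {
    "State of Oregon": "State of Ora-gnn",
    "Oregon State": "Ora-ggn State",
    "Oregon": "Ora-ggn",
}


def respell(text, table=RESPELLINGS):
    """Replace whole words or phrases in text with their respellings."""
    if not table:
        return text
    lookup = {key.lower(): value for key, value in table.items()}
    # Try longer phrases first so "Oregon State" wins over plain "Oregon".
    keys = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(key) for key in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


print(respell("Welcome to Oregon State."))  # prints "Welcome to Ora-ggn State."
```

The returned string would then be passed to whatever synthesis request you already make. Keeping the table in one place also gives you a spot to collect fixes as you find them.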

ChanceNCounter commented 5 years ago

I should probably close this, but it might be helpful to start assembling common solutions someplace. "Tips for writing dialog" or something.