Closed richardbuckle closed 4 years ago
I've been pondering this problem off and on for some time. EDDI isn't really responsible for this; the TTS vendor is. But a lot of apps provide ways to tune pronunciation. ReadPlease did it with simple word replacement. Modern TTS engines support SSML for all sorts of fun things.
Caoimhe O'Shea is pronounced correctly only by CereProc's Caitlin voice. Every other voice mangles Irish names. I think the solution lies in SSML. https://developer.amazon.com/docs/custom-skills/speech-synthesis-markup-language-ssml-reference.html#phoneme provides a sample of how this can be used. My idea is that a user could have a lookup table. Perhaps I have "pecan" mapped to "pɪˈkɑːn" and you have it mapped to "ˈpi.kæn".
A UI dialog could have a list of words and some sort of wizard/helper to define the phoneme and test it. Perhaps it could also support super-simple replacements such as "ladyship" to "lady ship".
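To make the lookup-table idea concrete, here is a minimal sketch in Python (not EDDI's own code). It assumes a per-user table of word-to-IPA mappings plus a second table of plain replacements, and wraps matches in the standard SSML `<phoneme alphabet="ipa" ph="...">` element. The names `PHONEMES`, `REPLACEMENTS` and `to_ssml` are hypothetical, and whether a given voice honours the markup is a separate question.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical per-user tables; keys are lower-case words.
PHONEMES = {"pecan": "pɪˈkɑːn"}            # word -> IPA pronunciation
REPLACEMENTS = {"ladyship": "lady ship"}   # word -> plain respelling


def to_ssml(text, phonemes=PHONEMES, replacements=REPLACEMENTS):
    """Return SSML for `text` with the user's overrides applied word by word."""
    parts = []
    # Splitting on a capturing group alternates non-word and word tokens,
    # so every odd index holds a word.
    for index, token in enumerate(re.split(r"(\w+)", text)):
        key = token.lower()
        if index % 2 == 1 and key in phonemes:
            parts.append(
                f'<phoneme alphabet="ipa" ph={quoteattr(phonemes[key])}>'
                f"{escape(token)}</phoneme>"
            )
        elif index % 2 == 1 and key in replacements:
            parts.append(escape(replacements[key]))
        else:
            # Unmatched words and punctuation are passed through, XML-escaped.
            parts.append(escape(token))
    return "<speak>" + "".join(parts) + "</speak>"


print(to_ssml("Her ladyship likes pecan pie"))
```

Another user could pass their own table, for example `to_ssml("pecan", phonemes={"pecan": "ˈpi.kæn"})`, without touching the shared defaults.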
This is related to #140 and #98. I initially had high hopes for SSML lexicons as the solution for this but found that support for them is very patchy among TTS voices. That part of it needs more thought.
That said, let's not hijack this ticket, which is purely about providing UI for the user to provide an alternative IPA spelling of their commander name, nothing more.
Also, coming up with appropriate, efficient tests to use for phoneme selection would be very difficult without first acquiring much more expertise than we currently possess. 😉
Agree with VB... let's not expand or change the scope of this issue.
...I was discussing UI and UX.
I too would like to see this feature. In my mind, it could work similarly to the phonetic ship name overrides in the Ship Monitor tab. My own CMDR name is AWP3RATOR. I have always thought of and pronounced the 3 as an E: aw-per-a-tor (like operator) but TTS humorously pronounces it as "awp-three-rator".
It's possible to handle this (and many other special cases) via PR #1395 (provided that PR is acceptable).
Yes but there are also performance considerations. There is definite performance merit in keeping particular substitutions confined to their own use cases rather than burdening an ever-growing general lookup with them.
I for one am very leery of pushing everything into a general lookup.
Yes, there are performance considerations. If I had to estimate, all of the current IPA conversions in EDDI probably total a couple hundred. At the extreme if this PR were approved, someone might accumulate several thousand in a general lookup.
Do you expect the performance with the proposed method to differ significantly from the lexicon method that you were looking at previously (and if so, do you have suggestions for improving performance)? How large do you expect the list to grow, and how much of a performance gain do you expect from using separate lookups rather than a unified lookup? Are the performance gains from separation worth the added complexity and the reduction in flexibility?
You raise exactly the points that I'd like to see measured. :)
I had favoured writing SSML lexicons on the grounds that TTS voices might use them for internal optimisations, but the patchy support has led me to abandon that path.
Still, direct phonetic overrides for ship name and CMDR name are zero-cost and might as well stay so.
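For anyone who wants to measure rather than guess, the comparison could be sketched roughly as below. This is Python rather than EDDI's own code, and everything in it is an assumption for illustration: the function names, the 5,000-entry table, the sample sentence, and the plain respelling "operator" standing in for a real phonetic override. It only shows the shape of the measurement; real numbers would have to come from EDDI itself.

```python
import re
import timeit


def direct_override(name, override=None):
    """Use the override for a known field (e.g. the CMDR name) if one is set."""
    return override or name


def general_lookup(text, table):
    """Check every word of the text against a general substitution table."""
    return "".join(
        table.get(token.lower(), token) for token in re.split(r"(\w+)", text)
    )


def measure(entries=5000, repeats=1000):
    """Return (direct_seconds, lookup_seconds) for `repeats` calls of each."""
    # Hypothetical filler entries to simulate a large user-built table.
    table = {f"word{i}": f"w{i}" for i in range(entries)}
    table["awp3rator"] = "operator"
    text = "Welcome back, commander AWP3RATOR"
    direct = timeit.timeit(
        lambda: direct_override("AWP3RATOR", "operator"), number=repeats
    )
    lookup = timeit.timeit(lambda: general_lookup(text, table), number=repeats)
    return direct, lookup


if __name__ == "__main__":
    direct, lookup = measure()
    print(f"direct override: {direct:.6f}s, general lookup: {lookup:.6f}s")
```

Note that this sketch uses a hash table, so the per-word cost should not grow much with the number of entries; a lookup that scans the whole list for each phrase would behave differently, which is part of what would need measuring.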
EDDI version in which issue found
3.3.4-b1
Steps to reproduce
Expected
Name pronounced correctly.
Observed
TTS mangles the name.