does ssml language work for multilingual voices?

vortex1024 commented 1 month ago

For the online multilingual voices, Microsoft recommends using SSML to force the language when it is not detected correctly. Does your project support that? I could not make this work, for example, with the Brian multilingual voice selected in the TTS application supplied in the package, with "process XML" on:

It speaks French, not Romanian. I also tried enclosing the text in an element and setting its lang attribute, but no go. Thanks.

gexgd0419 commented 1 month ago

Unfortunately, Microsoft Edge online voices only support a very limited subset of SSML. <lang> tags are not supported.

Also, any unsupported SSML tag will make the server throw an "SSML is invalid" error and close the connection. So this engine has to filter out all SSML tags except a few supported ones, such as <prosody>, before sending the SSML to the Edge voice server.
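As a rough illustration of that filtering step (not the engine's actual code), here is a minimal Python sketch that drops every tag outside an allow-list while keeping the text inside it. The allow-list contains only `<prosody>` because that is the only tag named here; the real list may be longer. The names `ALLOWED_TAGS` and `filter_ssml` are made up for this example, and the regex is a simplification that assumes attribute values do not contain `<` or `>`.

```python
import re

# Assumption: only <prosody> is named in this thread as supported;
# the engine's real allow-list may contain other tags.
ALLOWED_TAGS = {"prosody"}

# Matches an opening, closing, or self-closing tag and captures its name.
_TAG_RE = re.compile(r"<\s*/?\s*([A-Za-z_][\w:.-]*)[^<>]*>")


def filter_ssml(ssml: str, allowed=ALLOWED_TAGS) -> str:
    """Remove tags whose name is not allowed, keeping their inner text."""

    def keep_or_drop(match: re.Match) -> str:
        name = match.group(1).lower()
        return match.group(0) if name in allowed else ""

    return _TAG_RE.sub(keep_or_drop, ssml)


if __name__ == "__main__":
    sample = '<lang xml:lang="ro-RO"><prosody rate="fast">Buna ziua</prosody></lang>'
    print(filter_ssml(sample))
```

With this approach a `<lang>` wrapper is silently removed before the request is sent, which would explain why it has no effect on the spoken language.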

The Edge voice server requires an xml:lang attribute on the root <speak> element. But changing it seems to do nothing.

So no, changing the language is not supported by Edge voices.

But if you have an Azure Speech subscription key, you can use the Azure voices, which support that feature.

Currently this engine does not enumerate Azure voices, so if you want to use an Azure voice, you will have to add it manually to the registry.

In registry editor, create a registry key under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Voices\Tokens. Then, create the following keys & values inside this key:

String value (Default): display name of the voice, e.g. Microsoft BrianMultilingual
String value CLSID: {013ab33b-ad1a-401c-8bee-f6e2b046a94e}
Subkey: Attributes
- String value Language: hexadecimal language ID of the voice, e.g. 409 for English (US)
Subkey: NaturalVoiceConfig
- String value Region: service region, e.g. japaneast
- String value Key: your subscription key
- String value Voice: voice name, e.g. en-US-BrianMultilingualNeural

Check this for a list of Edge online voice names ("ShortName").

vortex1024 commented 1 month ago

Thanks for the detailed explanation. It is a shame this does not work. The only way I can imagine making it work for free is to always prepend some unique string in that language, and then cut the corresponding audio from the resulting WAV.
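The cutting half of that idea could look roughly like the Python sketch below, which uses only the standard `wave` module. It is untested against real Edge voice output and assumes the result is an uncompressed PCM WAV and that the duration of the prepended phrase is already known; finding that duration reliably is the hard part and is not covered here. The function name `trim_leading` is made up for this example.

```python
import io
import wave


def trim_leading(wav_bytes: bytes, seconds: float) -> bytes:
    """Return a copy of the WAV data with the first `seconds` removed."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        # Never skip past the end of the audio.
        skip = min(int(seconds * src.getframerate()), src.getnframes())
        src.setpos(skip)
        frames = src.readframes(src.getnframes() - skip)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)
        dst.writeframes(frames)  # the header frame count is fixed up on close
    return out.getvalue()
```

For example, `trim_leading(data, 1.2)` would drop the first 1.2 seconds, where 1.2 stands in for the measured length of the prepended phrase.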

gexgd0419 / NaturalVoiceSAPIAdapter

does ssml language work for multilingual voices? #2