MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Encoding parameter is necessary for Spanish and other languages #97922

Closed almudenaftourne closed 2 years ago

almudenaftourne commented 2 years ago

While testing the docs code with Spanish, I noticed that the encoding parameter is necessary for voices to pronounce certain letters correctly (like 'ñ', which is very common in Spanish). Without it, the word is mispronounced completely.

There is a note in the "Use SSML to customize speech characteristics" section that explains how to read an SSML configuration from a file, but it only mentions encoding for this specific case:

If your ssml_string contains a byte order mark (BOM) at the beginning of the string, you need to strip off the BOM format or the service will return an error. You do this by setting the encoding parameter as follows: open("ssml.xml", "r", encoding="utf-8-sig").
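To illustrate what that note describes, here is a minimal sketch using only the Python standard library. The file name and the SSML content are made up for the example; it only shows how "utf-8" and "utf-8-sig" differ when a file starts with a BOM, and does not call the Speech service.

```python
import codecs
import os
import tempfile

# Hypothetical SSML content, used only for this illustration.
ssml = '<speak version="1.0" xml:lang="es-ES">España</speak>'

# Write the file with a leading UTF-8 byte order mark, as some editors do.
path = os.path.join(tempfile.mkdtemp(), "ssml.xml")
with open(path, "wb") as f:
    f.write(codecs.BOM_UTF8 + ssml.encode("utf-8"))

# Plain "utf-8" keeps the BOM as a leading U+FEFF character in the string.
with open(path, "r", encoding="utf-8") as f:
    with_bom = f.read()

# "utf-8-sig" strips the BOM when present (and is harmless when absent).
with open(path, "r", encoding="utf-8-sig") as f:
    without_bom = f.read()

print(repr(with_bom[0]))      # '\ufeff'
print(without_bom == ssml)    # True
```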

I was able to fix the Spanish pronunciation by adding the encoding specification:

with open("ssml-spanish-v2.xml", "r", encoding="utf-8") as f:
    ssml_string = f.read()
result = synthesizer.speak_ssml_async(ssml_string).get()

I think it would be helpful to clarify this in the docs, as most languages have specific characters like 'ñ' or '¨' that otherwise won't be pronounced correctly.
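A likely explanation, offered as an assumption since the thread does not state the platform: when open() is called without an encoding argument, Python falls back to the locale's preferred encoding, which on many Windows setups is cp1252 rather than UTF-8. The sketch below simulates that mismatch on a made-up Spanish sentence, without touching any file or the Speech SDK.

```python
# Hypothetical sample sentence containing 'ñ' and 'ü'.
text = "El niño sueña con la cigüeña."

# The bytes as they would be stored in a UTF-8 encoded file.
raw = text.encode("utf-8")

# Decoding with a mismatched codec (cp1252 is assumed here as the
# locale default) turns each accented letter into two wrong characters.
wrong = raw.decode("cp1252")

# Decoding with the codec the file was actually written in restores the text.
right = raw.decode("utf-8")

print(wrong)
print(right == text)
```

If the garbled string is what reaches the synthesizer, it would plausibly read those characters aloud instead of 'ñ', which matches the behavior described above. Passing encoding explicitly avoids depending on the machine's locale.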



YutongTie-MSFT commented 2 years ago

Thanks for the feedback! We are currently investigating and will update you shortly.

YutongTie-MSFT commented 2 years ago

@almudenaftourne Thanks a lot for pointing this out. I have reached out to the author to make this clear in the docs.

eric-urban commented 2 years ago

@almudenaftourne - Thank you for taking the time to point this out! I've opened an internal work item to track this to completion. We can add more details to the SSML document and/or update the current how-to page on speech synthesis.

eric-urban commented 2 years ago

please-close

yulin-li commented 1 year ago

Hi @almudenaftourne, what's the encoding of your ssml-spanish-v2.xml file? Could you share this file with us for further investigation?