Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

The multilingual model cannot correctly distinguish between Chinese and Japanese (Text of Date) #2357

Closed connermo closed 6 months ago

connermo commented 6 months ago

Description

Running the official sample Python script to synthesize a Chinese date string through the text-to-speech API, with the voice name "zh-CN-XiaoxiaoMultilingualNeural", produces Japanese speech instead of Chinese.

Steps to Reproduce

Run the official sample python script as follows:

'''
  For more samples please visit https://github.com/Azure-Samples/cognitive-services-speech-sdk 
'''

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
speech_key = "xxx"
service_region = "xxx"

speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
# Note: the voice setting will not overwrite the voice element in input SSML.
speech_config.speech_synthesis_voice_name = "zh-CN-XiaoxiaoMultilingualNeural"
speech_config.speech_synthesis_language = "zh-CN"
speech_config.speech_recognition_language = "zh-CN"

text = "2024年9月24日"

# use the default speaker as audio output.
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

result = speech_synthesizer.speak_text_async(text).get()
# Check result
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized for text [{}]".format(text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

Kerry-LinZhang commented 6 months ago

Hi @connermo, could you please try adding a lang tag: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-speaking-languages
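For illustration, the lang-tag suggestion could look roughly like the sketch below. It builds an SSML document that wraps the date text in a `<lang xml:lang="zh-CN">` element and passes it to `speak_ssml_async` instead of `speak_text_async`. The helper names `build_lang_ssml` and `speak_ssml` are hypothetical and not part of the SDK, and whether this resolves the Japanese reading was not confirmed in this thread.

```python
from xml.sax.saxutils import escape, quoteattr


def build_lang_ssml(text, voice="zh-CN-XiaoxiaoMultilingualNeural", locale="zh-CN"):
    """Hypothetical helper: wrap `text` in a <lang> element for `locale`."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(locale)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>'
        '</voice></speak>'
    )


def speak_ssml(ssml, speech_key, service_region):
    """Hypothetical helper: synthesize SSML to the default speaker."""
    # Imported lazily so the SSML builder works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    # speak_ssml_async takes a full SSML document rather than plain text.
    return synthesizer.speak_ssml_async(ssml).get()


ssml = build_lang_ssml("2024年9月24日")
print(ssml)
# To synthesize (requires a valid key and region):
# result = speak_ssml(ssml, "xxx", "xxx")
```

The result returned by `speak_ssml` can be checked with the same `result.reason` handling as in the original script.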

Or, if the issue is with how the date is read, you can also add a say-as date tag as a quick fix: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-pronunciation#say-as-element
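A minimal sketch of the say-as variant, assuming the date is marked up with `interpret-as="date"` and `format="ymd"` (year, month, day). The helper name `build_date_ssml` is hypothetical, and it is an assumption that the service accepts the text "2024年9月24日" inside say-as; a numeric form such as "2024-09-24" may be needed instead. The resulting string would be passed to `speech_synthesizer.speak_ssml_async(ssml).get()` in place of the `speak_text_async` call in the original script.

```python
from xml.sax.saxutils import escape, quoteattr


def build_date_ssml(date_text, date_format="ymd",
                    voice="zh-CN-XiaoxiaoMultilingualNeural", locale="zh-CN"):
    """Hypothetical helper: mark `date_text` as a date via <say-as>."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang={quoteattr(locale)}>'
        f'<voice name={quoteattr(voice)}>'
        # format="ymd" is an assumption about the order of the date fields.
        f'<say-as interpret-as="date" format={quoteattr(date_format)}>'
        f'{escape(date_text)}</say-as>'
        '</voice></speak>'
    )


ssml = build_date_ssml("2024年9月24日")
print(ssml)
```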

pankopon commented 6 months ago

Closed since there have been no further updates; presumed resolved by the configuration instructions given.