The multilingual model cannot correctly distinguish between Chinese and Japanese (Text of Date)

connermo commented 6 months ago

Description

Running the official sample Python script to synthesize a Chinese date string through the text-to-speech API with the voice "zh-CN-XiaoxiaoMultilingualNeural" produces Japanese speech instead of Chinese.

Steps to Reproduce

Run the official sample Python script as follows:

'''
  For more samples please visit https://github.com/Azure-Samples/cognitive-services-speech-sdk 
'''

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
speech_key = "xxx"
service_region = "xxx"

speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
# Note: the voice setting will not overwrite the voice element in input SSML.
speech_config.speech_synthesis_voice_name = "zh-CN-XiaoxiaoMultilingualNeural"
speech_config.speech_synthesis_language = "zh-CN"
speech_config.speech_recognition_language = "zh-CN"

text = "2024年9月24日"

# use the default speaker as audio output.
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

result = speech_synthesizer.speak_text_async(text).get()
# Check result
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized for text [{}]".format(text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

Kerry-LinZhang commented 6 months ago

Hi @connermo, could you please try adding a lang tag: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice#adjust-speaking-languages

Or, if the issue is with how the date is read, you can also add a say-as date tag as a quick fix: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-pronunciation#say-as-element
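
For reference, the two suggestions above might look roughly like this when applied to the reporter's script. This is a minimal sketch, not something confirmed in the thread: `build_ssml` and `synthesize` are hypothetical helper names, the `date_format="ymd"` value is an assumption about which say-as format suits this date, and whether either tag actually fixes the Japanese reading has not been verified here. The text is wrapped in SSML and sent with `speak_ssml_async` instead of `speak_text_async`.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice="zh-CN-XiaoxiaoMultilingualNeural",
               lang="zh-CN", date_format=None):
    """Hypothetical helper: wrap text in <voice> and <lang> elements.

    If date_format is given (for example "ymd"), the text is additionally
    wrapped in <say-as interpret-as="date">.
    """
    body = escape(text)
    if date_format is not None:
        body = '<say-as interpret-as="date" format={}>{}</say-as>'.format(
            quoteattr(date_format), body)
    return (
        '<speak version="1.0" xmlns="{ns}" xml:lang={lang}>'
        '<voice name={voice}>'
        '<lang xml:lang={lang}>{body}</lang>'
        '</voice></speak>'
    ).format(ns=SSML_NS, lang=quoteattr(lang),
             voice=quoteattr(voice), body=body)


def synthesize(ssml, speech_key, service_region):
    """Hypothetical helper: send SSML to the service via the Speech SDK."""
    # Imported here so the SSML builder can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=speech_key, region=service_region)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
    return synthesizer.speak_ssml_async(ssml).get()


# Option 1: only pin the speaking language with the lang tag.
ssml_lang_only = build_ssml("2024年9月24日")

# Option 2: also mark the text as a date (the format value is an assumption).
ssml_with_date = build_ssml("2024-09-24", date_format="ymd")

# Sanity check that both strings are well-formed XML before sending them.
ET.fromstring(ssml_lang_only)
ET.fromstring(ssml_with_date)
```

To try it, pass either string to `synthesize(ssml_lang_only, speech_key, service_region)` and check `result.reason` the same way the original script does.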

pankopon commented 6 months ago

Closed since no further updates, presumed resolved with the configuration instructions given.

Azure-Samples / cognitive-services-speech-sdk

The multilingual model cannot correctly distinguish between Chinese and Japanese (Text of Date) #2357
