Azure-Samples / cognitive-services-speech-sdk

Sample code for the Microsoft Cognitive Services Speech SDK
MIT License

iOS 16 - SPXSpeechSynthesizer error #1613

Closed danigutierrezayuso closed 1 year ago

danigutierrezayuso commented 2 years ago

When using the speech synthesizer 'speakSsml' function on iOS 16 an error is returned:

[AudioConverter] CompositeAudioConverter.cpp:1082 kAudio_ParamError: packet description 1 of 2: range 5662468096-5662468098, 576 data bytes

We are using the latest 1.23 version of the framework.

This was working fine on iOS 15 and now it's returning 0 bytes of audio.

ralph-msft commented 2 years ago

In order to be able to debug this issue, could you please share the following:

danigutierrezayuso commented 2 years ago

Hi Ralph,

Here is the log swift.log

I guess the key is the error in this line:

[960550]: 16351ms SPX_TRACE_ERROR: synthesizer.cpp:289 ExecuteSynthesis: Codec decoding error: Error decoding audio stream, error code: -50

And this is the configuration that we are using:

// Write the synthesized audio to wavefile.wav in the app's Documents directory.
let filePath = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).map(\.path)[0]
let fileName = "/wavefile.wav"
let fileAtPath = filePath + fileName
if !FileManager.default.fileExists(atPath: fileAtPath) {
    FileManager.default.createFile(atPath: fileAtPath, contents: nil, attributes: nil)
}
// audioConfig, speechConfig, azureToken and ssml are declared elsewhere in the app.
audioConfig = try! SPXAudioConfiguration(wavFileOutput: fileAtPath)
speechConfig = try! SPXSpeechConfiguration(authorizationToken: azureToken, region: "westeurope")
let synthesizer = try! SPXSpeechSynthesizer(speechConfiguration: speechConfig, audioConfiguration: audioConfig)
let result = try! synthesizer.speakSsml(ssml)

Thank you for helping us!

ralph-msft commented 2 years ago

Thanks for the additional information. We are investigating this issue and will post an update here once we know more.

yulin-li commented 2 years ago

It looks like the issue is in audio decoding. The SDK automatically requests a compressed audio format (MP3) from the service and decodes it to PCM using Apple's API. Since this broke on iOS 16, there may have been an API change in the new iOS version.

We will investigate this issue further.

yulin-li commented 2 years ago

BTW, could you try setting SpeechServiceConnection_SynthEnableCompressedAudioTransmission to false to disable the compressed transmission feature? See here.

yulin-li commented 2 years ago

Hi @danigutierrezayuso, I tried to repro this issue but failed. I installed the Xcode 14 beta with the iOS 16 simulator on my MacBook and ran the quickstart, and everything works well in the simulator.

Could you try the quickstart to see if you can repro?

danigutierrezayuso commented 2 years ago

Hi @yulin-li, this is only failing on actual devices. It always works fine on the simulator.

I've tried to set speechServiceConnectionSynthesisEnableCompressedAudioTransmission to false and now it's working perfectly after that change. Thank you!
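
For reference, a minimal sketch of that workaround in Swift. It assumes the `setPropertyTo(_:by:)` setter on `SPXSpeechConfiguration` and that the module is imported as `MicrosoftCognitiveServicesSpeech`; the helper name `makeUncompressedSpeechConfig` and its parameters are hypothetical and not part of the SDK.

```swift
import MicrosoftCognitiveServicesSpeech

// Hypothetical helper: builds a speech configuration with compressed
// audio transmission turned off. `token` and `region` come from the caller.
func makeUncompressedSpeechConfig(token: String, region: String) throws -> SPXSpeechConfiguration {
    let speechConfig = try SPXSpeechConfiguration(authorizationToken: token, region: region)
    // Property values are passed as strings; "false" disables the compressed
    // (MP3) transmission that the SDK would otherwise decode on the device.
    speechConfig.setPropertyTo(
        "false",
        by: .speechServiceConnectionSynthesisEnableCompressedAudioTransmission
    )
    return speechConfig
}
```

The property has to be set before the `SPXSpeechSynthesizer` is created from the configuration, since the synthesizer reads its settings from the configuration it is given.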

yulin-li commented 2 years ago

We have fixed this internally and the fix will be released with 1.24.

I'd like to keep this issue open until 1.24 is released.

pankopon commented 2 years ago

To be closed when the Speech SDK 1.24.0 release is available (latest estimate by the end of September this year).

tomthecarrot commented 2 years ago

We are also encountering this issue on iOS 16 devices - it results in a consistently reproducible crash. iOS 15 devices work fine. Additionally, using an 8 kHz output format (SpeechSynthesisOutputFormat.Raw8Khz16BitMonoPcm) does work.
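
A rough sketch of selecting that output format from Swift, assuming `setSpeechSynthesisOutputFormat(_:)` on `SPXSpeechConfiguration` and that the enum case is imported into Swift as `.raw8Khz16BitMonoPcm`; the helper name `use8KhzRawPcm` is hypothetical.

```swift
import MicrosoftCognitiveServicesSpeech

// Hypothetical helper: switches an existing configuration to 8 kHz raw PCM output.
func use8KhzRawPcm(on speechConfig: SPXSpeechConfiguration) {
    // Assumed Swift spelling of SpeechSynthesisOutputFormat.Raw8Khz16BitMonoPcm.
    speechConfig.setSpeechSynthesisOutputFormat(.raw8Khz16BitMonoPcm)
}
```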

Is there an updated ETA on this? Until it is fixed, we cannot ship our update due to crashing on iOS 16.

yulin-li commented 2 years ago

The ETA is mid-Oct.

yulin-li commented 1 year ago

Closing this issue as 1.24 is released.

guris12 commented 1 year ago

How can we set speechServiceConnectionSynthesisEnableCompressedAudioTransmission to false?