watson-developer-cloud / swift-sdk

The Watson Swift SDK enables developers to quickly add Watson Cognitive Computing services to their Swift applications.
https://watson-developer-cloud.github.io/swift-sdk/
Apache License 2.0

[text to speech] Random service failures #582

Closed triceam closed 7 years ago

triceam commented 7 years ago

I have an iOS app that uses text to speech to provide an audible response to user input. Sometimes the service works fine; other times it fails, with two different errors. Most of the time, both outcomes occur within the same session of the app. I am using the sample code here, with very few (if any) modifications: https://github.com/watson-developer-cloud/swift-sdk#text-to-speech

The errors I am seeing:

  1. Only the gb_Kate voice works. None of my tests with other voices produced any audio. I used textToSpeech.synthesize(text, voice: SynthesisVoice.gb_Kate.rawValue, audioFormat: AudioFormat.l16, failure: failure) { data in exactly as shown, and only swapped gb_Kate for us_Allison and us_Lisa, but neither appears to work.
  2. With the gb_Kate voice, sometimes the SDK/service works fine, other times the sound does not play and I see noData in the Xcode console.
  3. Sometimes the sound does not play and I see Error Domain=com.ibm.watson.developer-cloud.TextToSpeechV1 Code=401 "Not Authorized" in the Xcode console.

Items 2 and 3 above both happen within the same app session, with no code changes.

This is happening in response to a service query. I have tried calling text to speech and playing the synthesized audio on the main thread, but it doesn't seem to make a difference.

triceam commented 7 years ago

Update: I have tried with pretty much every sound format option, and the only one that seems to be partially working is wav, but that is still inconsistent.

schen22 commented 7 years ago

Resolved: the failures were caused by invalid credentials.

This issue points to a known problem where the 401 Not Authorized response isn't returned until the third or fourth request to the Text to Speech service, leading to unnecessary confusion on the users' part. We'll need to continue following up with the text to speech team to understand why this is occurring.
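
Until that is understood, one way to make a late 401 easier to spot is to have the failure callback print an explicit credentials hint instead of the raw error. The following is only a minimal sketch using Foundation: describeFailure is a hypothetical helper, not part of the SDK, and the simulated error simply mirrors the domain, code, and message quoted earlier in this thread.

```swift
import Foundation

/// Hypothetical helper (not part of the SDK): turns an error passed to a
/// `failure` callback into a short diagnostic string.
func describeFailure(_ error: Error) -> String {
    let nsError = error as NSError
    if nsError.code == 401 {
        // A 401 points at credentials rather than at the voice or audio format.
        return "Not authorized (401): check the service username and password."
    }
    return "Request failed (\(nsError.domain), code \(nsError.code)): \(nsError.localizedDescription)"
}

// Same shape as the `failure` closure passed to synthesize in the issue.
let failure = { (error: Error) in print(describeFailure(error)) }

// Simulated copy of the error reported above, used only for illustration.
let simulated = NSError(
    domain: "com.ibm.watson.developer-cloud.TextToSpeechV1",
    code: 401,
    userInfo: [NSLocalizedDescriptionKey: "Not Authorized"]
)
failure(simulated)
```

Passing this closure as the failure argument leaves the success path unchanged. It only changes what is logged, so a credentials problem is named as such even when it first appears several requests into a session.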