[Bug]: speakSsmlAsync produces 0 duration audio but result reason is SynthesizingAudioCompleted

What happened?

Hi, we have found scenarios where speakSsmlAsync returns a result with a reason of ResultReason.SynthesizingAudioCompleted (10) but an audioDuration of 0 and no errorDetails. Based on the documentation, we would have expected either a reason of ResultReason.NoMatch (0) or populated errorDetails if the text could not be processed. We would like guidance on which scenarios can produce an audioDuration of 0 together with ResultReason.SynthesizingAudioCompleted so that we can handle them in our code.

One scenario where this happens is when the text sent consists only of punctuation marks, which makes sense. However, we also found cases where actual text was sent and the same result came back, which is why we are asking when this type of response should be expected.
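
Until the expected behavior is confirmed, one defensive option on our side is to treat a completed result with no audio payload as a failure. The sketch below is only an illustration, not the SDK's recommended handling: `SynthesisResultLike` and `isEmptySynthesis` are hypothetical names, the interface mirrors only the fields we read from the result (reason, audioDuration, audioData, errorDetails), and the 44-byte threshold assumes the WAV (RIFF) output seen in our results, where an empty result is just the header.

```typescript
// Hypothetical local type: a minimal structural view of the fields we read
// from SpeechSynthesisResult, so the sketch runs without the SDK installed.
interface SynthesisResultLike {
  reason: number;
  audioDuration: number;
  audioData?: ArrayBuffer;
  errorDetails?: string;
}

// ResultReason.SynthesizingAudioCompleted as logged in our results (privReason: 10).
const SYNTHESIZING_AUDIO_COMPLETED = 10;

// Size of a canonical PCM WAV header. Assumption: a RIFF/WAV output format.
const WAV_HEADER_BYTES = 44;

// Returns true when the result claims completion but carries no audio samples.
function isEmptySynthesis(result: SynthesisResultLike): boolean {
  if (result.reason !== SYNTHESIZING_AUDIO_COMPLETED) {
    return false; // other reasons go through the normal error handling path
  }
  const byteLength = result.audioData ? result.audioData.byteLength : 0;
  return result.audioDuration === 0 || byteLength <= WAV_HEADER_BYTES;
}
```

In a speakSsmlAsync success callback, a check like this could route the empty case to the same path as a cancellation, even though errorDetails is undefined.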

We can reliably reproduce this when converting Japanese text. When we submitted the Japanese equivalent of the word “test” (テスト) 49 times, it produced an audio result, but when we submitted it 50 times, it produced an audioDuration of 0. Obviously, this is just a test data scenario. What we are looking for is guidance on the known scenarios that produce an audioDuration of 0, since we could not understand why 49 repeats of the same text would work but 50 would not.
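
For anyone trying to reproduce this, the input can be generated rather than pasted. This is a rough sketch under assumptions: `buildRepeatSsml` and `escapeXml` are hypothetical helpers, and the SSML wrapper shown here is a generic speak/voice envelope, not necessarily the exact SSML we sent.

```typescript
// Escape the characters that would break the SSML markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build an SSML document containing `word` repeated `count` times.
// Assumption: a plain speak/voice envelope; the voice and language defaults
// match the voice named in this report.
function buildRepeatSsml(
  word: string,
  count: number,
  voice: string = "ja-JP-NanamiNeural",
  lang: string = "ja-JP"
): string {
  const body = escapeXml(word.repeat(count));
  return (
    `<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${lang}">` +
    `<voice name="${voice}">${body}</voice></speak>`
  );
}

// The two inputs compared in this report.
const ssml49 = buildRepeatSsml("テスト", 49);
const ssml50 = buildRepeatSsml("テスト", 50);
```

Passing `ssml49` and `ssml50` to speakSsmlAsync in turn would exercise the two cases described above.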

In both cases we are using region uswest2 and the voice ja-JP-NanamiNeural:

For テストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテストテスト (repeats テスト 49 times) we get:

result: SpeechSynthesisResult {
        privResultId: '4BF7C5A4B23C4C6897989758D562E37C',
        privReason: 10,
        privErrorDetails: undefined,
        privProperties: undefined,
        privAudioData: ArrayBuffer {
          [Uint8Contents]: <52 49 46 46 f4 b8 08 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 01 00 80 3e 00 00 00 7d 00 00 02 00 10 00 64 61 74 61 d0 b8 08 00 fe ff fd ff fd ff fc ff fd ff fd ff fc ff fd ff fd ff fd ff fd ff fc ff fc ff fc ff fc ff fb ff fc ff fc ff fc ff fc ff fc ff fc ff fc ff fc ff fd ff fc ff fd ff fc ff ... 571544 more bytes>,
          byteLength: 571644
        },
        privAudioDuration: 178625000
      }

For the same text with テスト repeated 50 times we get:

result: SpeechSynthesisResult {
        privResultId: '4BC6BDA9AD0346FBA5FA5DF09E078335',
        privReason: 10,
        privErrorDetails: undefined,
        privProperties: undefined,
        privAudioData: ArrayBuffer {
          [Uint8Contents]: <52 49 46 46 24 00 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 01 00 80 3e 00 00 00 7d 00 00 02 00 10 00 64 61 74 61 00 00 00 00>,
          byteLength: 44
        },
        privAudioDuration: 0
      }
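
For reference, the two buffers above differ exactly as the durations suggest: the second one is a bare 44-byte WAV header whose data chunk size is 0. The sketch below decodes the header bytes copied from the two logs. `summarizeWavHeader` and `parseHex` are hypothetical helpers, the fixed offsets assume the canonical 44-byte PCM header layout seen here, and the duration is computed in 100-nanosecond units on the assumption that audioDuration uses that unit (the numbers from the first result agree with it).

```typescript
interface WavSummary {
  sampleRate: number;    // samples per second
  byteRate: number;      // bytes of audio per second
  dataBytes: number;     // size of the "data" chunk payload
  durationTicks: number; // duration in 100-nanosecond units
}

// Decode the fields of a canonical 44-byte PCM WAV header (little-endian).
function summarizeWavHeader(header: Uint8Array): WavSummary {
  if (header.byteLength < 44) {
    throw new Error("expected at least a 44-byte WAV header");
  }
  const view = new DataView(header.buffer, header.byteOffset, header.byteLength);
  const sampleRate = view.getUint32(24, true);
  const byteRate = view.getUint32(28, true);
  const dataBytes = view.getUint32(40, true);
  const durationTicks =
    byteRate === 0 ? 0 : Math.round((dataBytes * 1e7) / byteRate);
  return { sampleRate, byteRate, dataBytes, durationTicks };
}

// Turn a space-separated hex dump (as printed by Node) into bytes.
function parseHex(hex: string): Uint8Array {
  return Uint8Array.from(hex.trim().split(/\s+/).map((b) => parseInt(b, 16)));
}

// First 44 bytes of the 49-repeat result.
const okHeader = parseHex(
  "52 49 46 46 f4 b8 08 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 01 00 " +
  "80 3e 00 00 00 7d 00 00 02 00 10 00 64 61 74 61 d0 b8 08 00"
);
// Entire 44-byte buffer of the 50-repeat result.
const emptyHeader = parseHex(
  "52 49 46 46 24 00 00 00 57 41 56 45 66 6d 74 20 10 00 00 00 01 00 01 00 " +
  "80 3e 00 00 00 7d 00 00 02 00 10 00 64 61 74 61 00 00 00 00"
);

console.log(summarizeWavHeader(okHeader));    // dataBytes: 571600, durationTicks: 178625000
console.log(summarizeWavHeader(emptyHeader)); // dataBytes: 0, durationTicks: 0
```

So the service returned a structurally valid WAV file with no samples, rather than a truncated or corrupted one.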

Note that we are on version 1.36.0. Your releases page lists 1.35.0 as the latest, and the newest version I was able to choose when filing this bug report was 1.34, so that is what I selected. However, the latest version on npm is 1.36.0, published 20 days ago, which is the one we are using.

Version

1.34.0 (Latest)

What browser/platform are you seeing the problem on?

Node

Relevant log output

No response
