microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

[Bug]: Speech Recognition result cannot always be converted to Pronunciation Assessment Result #770

Open coreyward opened 7 months ago

coreyward commented 7 months ago

What happened?

I sent an audio recording that contained no detectable speech. I would expect some kind of error message to come back, but I did not expect the SDK itself to throw an error when used as shown in the examples:

Cannot read properties of undefined (reading '0')

This is coming from line 108 here: https://github.com/microsoft/cognitive-services-speech-sdk-js/blob/b062648064d9b353b08ec67af0e41f48a9bea4da/src/sdk/PronunciationAssessmentResult.ts#L103-L110

It seems like the types are inaccurate for j.NBest. When the Speech Recognizer is handed an audio file that does not contain any words, the result comes back without any errors, but it is sparse and is missing many fields, including NBest. For example:

SpeechRecognitionResult {
  privResultId: '75EDA7E5421A411F81E0D64B9504D75C',
  privReason: 0,
  privText: undefined,
  privDuration: 49200000,
  privOffset: 0,
  privLanguage: undefined,
  privLanguageDetectionConfidence: undefined,
  privErrorDetails: undefined,
  privJson: '{"Id":"0ed4402056c8472a99bc19fff317a024","RecognitionStatus":2,"Offset":0,"Duration":49200000,"Channel":0,"SNR":0}',
  privProperties: PropertyCollection {
    privKeys: [ 'SpeechServiceResponse_JsonResult' ],
    privValues: [
      '{"Id":"0ed4402056c8472a99bc19fff317a024","RecognitionStatus":"InitialSilenceTimeout","Offset":0,"Duration":49200000,"Channel":0,"SNR":0.0}'
    ]
  },
  privSpeakerId: undefined
}
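
Until the SDK handles this case, one possible workaround is to check the raw JSON for a non-empty NBest array before attempting the conversion. The following is a minimal sketch, not part of the SDK: `hasNBest` is a hypothetical helper name, and it assumes the JSON string has the shape shown in the dump above.

```typescript
// Hypothetical guard (not an SDK function): returns true only when the raw
// service JSON contains a non-empty NBest array.
function hasNBest(json: string | undefined): boolean {
  if (!json) {
    return false;
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    // Malformed JSON is treated the same as a missing NBest.
    return false;
  }
  if (typeof parsed !== "object" || parsed === null) {
    return false;
  }
  const nbest = (parsed as { NBest?: unknown }).NBest;
  return Array.isArray(nbest) && nbest.length > 0;
}
```

In application code, the string would presumably come from `result.properties.getProperty(sdk.PropertyId.SpeechServiceResponse_JsonResult)`, and `sdk.PronunciationAssessmentResult.fromResult(result)` would only be called when the guard returns true. For the InitialSilenceTimeout payload above the guard returns false, so the conversion is skipped instead of throwing.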

It would be really useful if this behavior (and, more broadly, all of the potential response formats from the API) were accurately documented somewhere. It's hard to build robust applications that fail gracefully when the behavior of dependencies is undocumented.

Using v1.33.1, but the "Version" dropdown in the issue form only lists up to 1.33.0.

Version

1.33.0 (Latest)

What browser/platform are you seeing the problem on?

No response

Relevant log output

No response

glharper commented 6 months ago

@coreyward Thanks for submitting this issue and for using the JS Speech SDK. In theory, an InitialSilenceTimeout result should be shunted down a different codepath before any attempt is made to create a PronunciationAssessmentResult, so this does feel like a bug.