microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

iOS Safari: STT Error. #320

Closed BryanDollery closed 3 years ago

BryanDollery commented 3 years ago

I've written a React SPA/PWA targeting web, Android, and iOS. My app works without a problem on the first two platforms, but it doesn't work on a fully updated iPhone 11. The error message is:

Unhandled Rejection (TypeError): null is not an object (evaluating 'this.privInitializeDeferral.reject')
(anonymous function)
src/common.browser/MicAudioSource.ts:141
  138 |             this.privInitializeDeferral.resolve();
  139 |         }, (error: MediaStreamError) => {
  140 |             const errorMsg = `Error occurred during microphone initialization: ${error}`;
> 141 |             this.privInitializeDeferral.reject(errorMsg);
      | ^
  142 |             this.onEvent(new AudioSourceErrorEvent(this.privId, errorMsg));
  143 |         });
  144 | }

This is a short commercial project for a major European MS customer (we're a partner), and delivery is due in a few days. I could take this through formal channels, but I thought it might be quicker to report the issue here. I'm running v1.15.0 of the SDK.

I have written this up in more detail on StackOverflow:

https://stackoverflow.com/questions/66028639/microsoft-cognitiveservices-speech-sdk-javascript-on-ios-safari-micaudiosource

TIA

glharper commented 3 years ago

@BryanDollery We don't officially support React integrations like this, but can you reproduce this using version 1.14.1? This may be a regression in 1.15.

BryanDollery commented 3 years ago

Hi @glharper, thanks for the response. This problem isn't specific to the nature of the app or to React. It's a JavaScript/Safari problem. The React part only determines how the JavaScript and UI are structured. The problem seems to be in the Cognitive Services SDK for JavaScript: I think it's not waiting for permission to use the microphone before trying to actually use it. One big issue here is that I can't debug a PWA on iOS Safari to get a full stack trace.

glharper commented 3 years ago

@BryanDollery. Understood, but there was a change to the error handling in that specific section of code in MicAudioSource.ts for 1.15. If you could try to repro with 1.14.1, that would at least tell me whether that change is to blame (and possibly give you a workaround until we release a fix).

BryanDollery commented 3 years ago

Ok, tried it with 1.14.1, and the error has changed but it's still basically the same underlying error:

Unhandled Rejection (TypeError): null is not an object (evaluating 'tmp.reject')
(anonymous function)
src/common.browser/MicAudioSource.ts:147
  144 | // without a lot of code replication.
  145 | // TODO: fix promise implementation, allow for a graceful reject chaining.
  146 | this.privInitializeDeferral = null;
> 147 | tmp.reject(errorMsg); // this will bubble up through the whole chain of promises,
      | ^
  148 | // with each new level adding extra "Unhandled callback error" prefix to the error message.
  149 | // The following line is not guaranteed to be executed.
  150 | this.onEvent(new AudioSourceErrorEvent(this.privId, errorMsg));

So the error still relates to grabbing control of the microphone. At least, that's what I'm guessing but it's hard to be sure because of the problems with the error handling code.

The error that's being reported seems to be related to the error handling code itself (I'm guessing that you assume Safari will propagate the correct errors, but it isn't actually doing so), but it's the underlying problem that's the real issue. I think that there is a timing issue on OS-X. I have tried grabbing the mic myself, earlier in my app, so that the app already has the necessary permissions before the SDK goes for it, but interestingly that doesn't entirely solve the problem. I don't understand why, but I get a second request for permissions when I start the transcription process.

The behaviour is inconsistent though -- on my last try on an iPad, my pre-register of the mic worked, then I got a 2nd approval request from your sdk, then it worked, but when I opened the app the second time I didn't get asked for permissions at all but recording caused this error.

glharper commented 3 years ago

@BryanDollery Thank you for trying that with the older version and reporting back. I tested on the latest iOS (iPhone XR) Safari using startContinuousRecognition() and got a permission popup, then normal recognition results back.

Does the site you've developed use https? The Speech SDK works with https on Safari, but I've noticed that insecure http on Safari doesn't even present a permission notification for the microphone.
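The https requirement can be checked up front before creating a recognizer. The following is a minimal sketch, not part of the Speech SDK: `microphoneBlocker` and its `env` parameter are hypothetical names, and it relies only on the standard `isSecureContext` and `navigator.mediaDevices.getUserMedia` browser APIs.

```javascript
// Returns a short reason string if microphone capture cannot work in this
// environment, or null if nothing obvious is blocking it.
// `env` defaults to the global object; it is a parameter only so the check
// can be exercised outside a browser (hypothetical helper, not an SDK API).
function microphoneBlocker(env = globalThis) {
  // Browsers only expose microphone capture in secure contexts
  // (https, or localhost during development).
  if (env.isSecureContext !== true) {
    return "page is not a secure context (serve it over https or from localhost)";
  }
  const devices = env.navigator && env.navigator.mediaDevices;
  if (!devices || typeof devices.getUserMedia !== "function") {
    return "navigator.mediaDevices.getUserMedia is not available";
  }
  return null;
}
```

In a page, calling `microphoneBlocker()` before starting recognition gives a readable message instead of a silent failure when the site is served over plain http.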

As far as that null error in the Speech SDK code goes, I believe what's happening is turnOff() is getting called while turnOn() is still being processed, which fits the Safari behavior you're describing. I can't fix Safari force turning the mic input off and on, but I can update the code to bubble an error correctly when it does.
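The race described above can be illustrated with a small sketch. This is not the SDK's actual code: `MicSketch`, `createDeferred`, and `getStream` are hypothetical stand-ins, and the only assumption taken from the stack traces is that a deferral field is set to null while a microphone request is still pending. Capturing the deferral in a local variable lets the pending callback reject it even after `turnOff()` has cleared the field.

```javascript
// Hypothetical stand-in for a deferred promise holder; not the SDK's class.
function createDeferred() {
  let resolve;
  let reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

class MicSketch {
  // `getStream` is any function returning a Promise, standing in for
  // navigator.mediaDevices.getUserMedia in this illustration.
  constructor(getStream) {
    this.getStream = getStream;
    this.initDeferral = null;
  }

  turnOn() {
    if (this.initDeferral) {
      return this.initDeferral.promise;
    }
    // Keep a local reference: turnOff() may null the field before
    // getStream() settles, so the callbacks must not read this.initDeferral.
    const deferral = createDeferred();
    this.initDeferral = deferral;
    this.getStream().then(
      () => deferral.resolve(),
      (error) =>
        deferral.reject(`Error occurred during microphone initialization: ${error}`)
    );
    return deferral.promise;
  }

  turnOff() {
    this.initDeferral = null;
  }
}
```

With this shape, calling `turnOff()` while `turnOn()` is still pending surfaces the microphone error to the caller as a rejected promise rather than as a TypeError on a null field.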

BryanDollery commented 3 years ago

Yeah, this is all https. It's a PWA, which only really works with https, and you can't get access to the mic at all with an unencrypted channel for this type of app. It's interesting that you're getting different results to me, but I'm using startContinuousRecognitionAsync(). I wonder if the async part would cause this sort of behaviour? I'll try with your suggested method and see what happens.

Also, yes, an error would be a lot better than an exception, thanks. Perhaps you could call the registered error handler? 😆

BryanDollery commented 3 years ago

Could this issue be related to: https://github.com/microsoft/cognitive-services-speech-sdk-js/issues/238 ?

glharper commented 3 years ago

> Yeah, this is all https. It's a PWA, which only really works with https, and you can't get access to the mic at all with an unencrypted channel for this type of app. It's interesting that you're getting different results to me, but I'm using startContinuousRecognitionAsync(). I wonder if the async part would cause this sort of behaviour? I'll try with your suggested method and see what happens.

I miswrote above; my test code is also using startContinuousRecognitionAsync(). In the current state of the SDK API, you don't actually need to await startContinuousRecognitionAsync, as it's not async and doesn't return a promise. This will probably change whenever my team releases v2.0, which won't be for at least eight months.

The code my test page uses:

    window.console.log("start recognizing");
    reco.startContinuousRecognitionAsync(() => {}, (err) => {
        window.console.log(err);
        // [...]
    });

Here is the test page I use in full, with 1.15 libs included: all.zip

> Could this issue be related to: #238 ?

It could. Does your code also use other libraries that use the Audio Context API?

BryanDollery commented 3 years ago

Ah, I see. Thanks for the clarification. Yeah, I realised that the function didn't return a promise, so I used this structure instead:

    recognizer.recognized = recognized;
    recognizer.recognizing = recognizing;
    recognizer.canceled = handleError;

    try {
      recognizer.startContinuousRecognitionAsync();
    } catch (err) {
      console.log(`error found with startContinuousRecognitionAsync ${JSON.stringify(err, null, 4)}`);
    }

Where recognized and recognizing are functions that deal with the partial and full transcriptions.
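Since the method takes success and error callbacks, as in the test page snippet earlier in this thread, a try/catch around the call would probably not see a failure that is reported later through the error callback. A minimal sketch of wrapping the callback form in a promise follows; `startContinuousRecognition` is a hypothetical helper name, and the only thing assumed about `recognizer` is the `startContinuousRecognitionAsync(onSuccess, onError)` shape shown above.

```javascript
// Wraps the callback-style start call in a Promise so failures can be
// handled with .catch() or await. Hypothetical helper, not an SDK function.
function startContinuousRecognition(recognizer) {
  return new Promise((resolve, reject) => {
    recognizer.startContinuousRecognitionAsync(
      () => resolve(),
      // The error callback may receive a plain string, so normalise to Error.
      (err) => reject(new Error(String(err)))
    );
  });
}
```

A call such as `startContinuousRecognition(recognizer).catch(handleError)` would then route start-up failures to the same handler used for the `canceled` event. Whether the null error from this issue reaches the error callback depends on the SDK version.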

I don't use any other audio-context code at all, but as an attempt to work around this issue I tried

const um = navigator.mediaDevices.getUserMedia({ audio: true, video: false });

Which results in two separate requests for permissions, so I dropped it.
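Note that `getUserMedia` returns a promise, so `um` above is a pending promise rather than a stream. A minimal sketch of a permission probe that awaits the result and then releases the microphone is shown below. `probeMicrophone` is a hypothetical helper, and nothing in this thread establishes that it avoids the second permission prompt on iOS Safari.

```javascript
// Asks for microphone access once, then stops the tracks so the probe does
// not keep the device open. Resolves to true if access was granted and
// false if the request was rejected. Hypothetical helper, not an SDK API.
// `mediaDevices` is expected to be navigator.mediaDevices in a browser.
async function probeMicrophone(mediaDevices) {
  try {
    const stream = await mediaDevices.getUserMedia({ audio: true, video: false });
    stream.getTracks().forEach((track) => track.stop());
    return true;
  } catch (error) {
    return false;
  }
}
```

In a page this would be called as `probeMicrophone(navigator.mediaDevices)` before creating the recognizer.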

BryanDollery commented 3 years ago

I've just noticed the 1.15.1 drop with the fix to the turnOn/turnOff error reporting. I'll give it a go and report back.

glharper commented 3 years ago

@BryanDollery 1.15.1 only has a fix for a different regression from 1.14. The turnOn/turnOff fix will be in 1.16, expected towards the end of next month.

glharper commented 3 years ago

@BryanDollery v1.16 has just been released (available here) with this turnOn/turnOff error reporting fix. Thanks again for writing up this issue!

ArunClay commented 2 years ago

Just serve the files via https, it worked for me.