microsoft / cognitive-services-speech-sdk-js

Microsoft Azure Cognitive Services Speech SDK for JavaScript

[Bug]: Websocket 404 in Firefox #814

Closed claptimes5 closed 2 months ago

claptimes5 commented 2 months ago

What happened?

I've been using the real-time speech to text API for over a year, and recently (in the last few days) I've seen issues using the API in Firefox and Safari. Chrome/Edge work fine.

I was able to reproduce this issue using the sample code. It works fine with Chrome, but on Firefox, I see a 404 when trying to establish the connection.

I've tested disabling Enhanced Tracking Protection, but have the same issue.

image

I'm using Firefox 124.0.2 on Windows 11.

Version

1.36.0 (Latest)

What browser/platform are you seeing the problem on?

Firefox, Safari

Relevant log output

No response

glharper commented 2 months ago

@claptimes5 Thank you for using the JS Speech SDK, and writing this issue up. I just tested the sample on FF/win11 (westeurope) and Safari/macOS17.4.1 (westus2) with no issues. If you could add the following logging API call to your code and send a screenshot of the console output, that could help with understanding why this is happening for you:

sdk.Diagnostics.SetLoggingLevel(sdk.LogLevel.Debug);
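
As a rough sketch of where that call could sit, the helper below turns on debug logging and also prints the cancellation fields. `attachDiagnostics` and `log` are hypothetical names for illustration; `sdk` is assumed to be the Speech SDK module (or browser global) and `recognizer` an already-constructed `SpeechRecognizer`.

```javascript
// Sketch only: `attachDiagnostics` and `log` are hypothetical names.
// `sdk` is assumed to be the Speech SDK module and `recognizer` an
// existing SpeechRecognizer instance.
function attachDiagnostics(sdk, recognizer, log = console.log) {
  // The call requested above: ask the SDK for debug-level logging.
  sdk.Diagnostics.SetLoggingLevel(sdk.LogLevel.Debug);

  // Also surface cancellation details, which carry the error code and message.
  recognizer.canceled = (sender, event) => {
    log(`CANCELED: Reason=${event.reason}`);
    log(`CANCELED: ErrorCode=${event.errorCode}`);
    log(`CANCELED: ErrorDetails=${event.errorDetails}`);
  };
  return recognizer;
}
```

Calling `attachDiagnostics(sdk, recognizer)` before starting recognition should be enough for the debug events and any cancellation details to show up in the browser console.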

claptimes5 commented 2 months ago

Thanks for the fast response! Here is my output:

image

image

glharper commented 2 months ago

@claptimes5 Status code 1006 is almost always an auth/keys issue. Could you create a new subscription key and see if that works?
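
For reference, 1006 is the WebSocket "abnormal closure" code from RFC 6455: it is reported locally when a connection ends without a Close frame, so on its own it does not say why the connection failed. A minimal sketch for pulling the code out of an `ErrorDetails` string of the shape the SDK reports in this thread; `parseStatusCode` and `isAbnormalClosure` are hypothetical helper names, not SDK functions.

```javascript
// Hypothetical helper: extracts the numeric status code from an
// ErrorDetails string such as the one reported in this thread.
function parseStatusCode(errorDetails) {
  const match = /StatusCode:\s*(\d+)/.exec(errorDetails);
  return match ? Number(match[1]) : null;
}

// 1006 means the socket closed without a Close frame (RFC 6455).
// It signals a failed or dropped connection, not a specific cause.
function isAbnormalClosure(errorDetails) {
  return parseStatusCode(errorDetails) === 1006;
}

const details =
  "Unable to contact server. StatusCode: 1006, undefined Reason:  undefined";
console.log(parseStatusCode(details)); // 1006
console.log(isAbnormalClosure(details)); // true
```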

claptimes5 commented 2 months ago

Thanks. I tested using eastus and get the same error.

claptimes5 commented 2 months ago

And to make sure, I used the new key/region in Chrome and have no problem.

mattfran commented 2 months ago

This issue also just started happening for me a day or two ago with Firefox (124.2.0) on Android (Pixel 6 Pro).

I tested it with SDK versions 1.32.0 and 1.36.0 and it happens with both.

I made no changes to my code prior to the issue and it works in Chrome on Android (I haven't tested in Safari).

Request

firefox-azure-error

Debug console logs

2024-04-15T04:01:22.006Z | RecognitionTriggeredEvent | privName: RecognitionTriggeredEvent | privEventId: C1C112F903A84376852C951C56BDE197 | privEventTime: 2024-04-15T04:01:22.006Z | privEventType: 1 | privMetadata: {} | privRequestId: 8676DCBF742E4AFEA154267E9732F0E8 | privSessionId: <NULL> | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E | privAudioNodeId: 88BF53AB042F4A24891EEB598A0124E6 instrument.ts:132
2024-04-15T04:01:22.008Z | ConnectingToServiceEvent | privName: ConnectingToServiceEvent | privEventId: F3637FCA44E84CC6A54C6A6937EB4F12 | privEventTime: 2024-04-15T04:01:22.008Z | privEventType: 1 | privMetadata: {} | privRequestId: 8676DCBF742E4AFEA154267E9732F0E8 | privSessionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privAuthFetchEventid: FA97A5CB25EF49879EA392895F99B7C0 instrument.ts:132
2024-04-15T04:01:22.009Z | AudioStreamNodeAttachingEvent | privName: AudioStreamNodeAttachingEvent | privEventId: BA3AAF30F0274CA4B9BC282C3574EEB8 | privEventTime: 2024-04-15T04:01:22.009Z | privEventType: 1 | privMetadata: {} | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E | privAudioNodeId: 88BF53AB042F4A24891EEB598A0124E6 instrument.ts:132
2024-04-15T04:01:22.018Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: 535453118F024ABFB4A480D7D1F1E6FB | privEventTime: 2024-04-15T04:01:22.018Z | privEventType: 1 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privUri: wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED | privHeaders: <NULL> instrument.ts:132
GET wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer [TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED
[HTTP/2 404  2ms]

The connection was refused when attempting to contact wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED. WebsocketMessageAdapter.ts:118:43
2024-04-15T04:01:22.293Z | ConnectionErrorEvent | privName: ConnectionErrorEvent | privEventId: 0994ED10C69E4BC2BA8F1260F0FB8645 | privEventTime: 2024-04-15T04:01:22.293Z | privEventType: 0 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privMessage: <NULL> | privType: error instrument.ts:132
2024-04-15T04:01:22.303Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: DE304FBB22BA4D208AD73F6B276F1FE8 | privEventTime: 2024-04-15T04:01:22.303Z | privEventType: 1 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privUri: wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED | privHeaders: <NULL> instrument.ts:132
The connection was refused when attempting to contact wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED. WebsocketMessageAdapter.ts:118:43
2024-04-15T04:01:22.671Z | ConnectionErrorEvent | privName: ConnectionErrorEvent | privEventId: ABA40D932A9B4700B277CAC7F32A59F3 | privEventTime: 2024-04-15T04:01:22.671Z | privEventType: 0 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privMessage: <NULL> | privType: error instrument.ts:132
2024-04-15T04:01:22.680Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: 894F879552BE47C0A45761B48B444EF8 | privEventTime: 2024-04-15T04:01:22.680Z | privEventType: 1 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privUri: wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED | privHeaders: <NULL> instrument.ts:132
2024-04-15T04:01:22.742Z | AudioSourceInitializingEvent | privName: AudioSourceInitializingEvent | privEventId: 67CDCC363F0A4FA68F0A5AC8AF5DF0EB | privEventTime: 2024-04-15T04:01:22.742Z | privEventType: 1 | privMetadata: {} | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E instrument.ts:132
2024-04-15T04:01:22.756Z | AudioSourceReadyEvent | privName: AudioSourceReadyEvent | privEventId: 39B5795A7BCF4590B6FDD4EA673731EB | privEventTime: 2024-04-15T04:01:22.756Z | privEventType: 1 | privMetadata: {} | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E instrument.ts:132
2024-04-15T04:01:22.760Z | AudioStreamNodeAttachedEvent | privName: AudioStreamNodeAttachedEvent | privEventId: 9B750BDDB5164E97B4A568C37F3FB522 | privEventTime: 2024-04-15T04:01:22.760Z | privEventType: 1 | privMetadata: {} | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E | privAudioNodeId: 88BF53AB042F4A24891EEB598A0124E6 instrument.ts:132
2024-04-15T04:01:22.764Z | ListeningStartedEvent | privName: ListeningStartedEvent | privEventId: 00461948DCAA407C921CA6510E2B26AF | privEventTime: 2024-04-15T04:01:22.764Z | privEventType: 1 | privMetadata: {} | privRequestId: 8676DCBF742E4AFEA154267E9732F0E8 | privSessionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E | privAudioNodeId: 88BF53AB042F4A24891EEB598A0124E6 instrument.ts:132
The connection was refused when attempting to contact wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED. WebsocketMessageAdapter.ts:118:43
2024-04-15T04:01:23.211Z | ConnectionErrorEvent | privName: ConnectionErrorEvent | privEventId: 6EB0281F22D344F38C2CC25FA4CC445A | privEventTime: 2024-04-15T04:01:23.211Z | privEventType: 0 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privMessage: <NULL> | privType: error instrument.ts:132
2024-04-15T04:01:23.215Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: 5670F27ED4604598BF185BBF45A79CC5 | privEventTime: 2024-04-15T04:01:23.215Z | privEventType: 1 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privUri: wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED | privHeaders: <NULL> instrument.ts:132
The connection was refused when attempting to contact wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED. WebsocketMessageAdapter.ts:118:43
2024-04-15T04:01:23.992Z | ConnectionErrorEvent | privName: ConnectionErrorEvent | privEventId: 32C2990EB1954FEF92571E0765211FDD | privEventTime: 2024-04-15T04:01:23.992Z | privEventType: 0 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privMessage: <NULL> | privType: error instrument.ts:132
2024-04-15T04:01:24.000Z | ConnectionStartEvent | privName: ConnectionStartEvent | privEventId: 5B77513EE2B545959993C466C41D5BFA | privEventTime: 2024-04-15T04:01:24.000Z | privEventType: 1 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privUri: wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED | privHeaders: <NULL> instrument.ts:132
The connection was refused when attempting to contact wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-US&format=simple&profanity=raw&Authorization=Bearer%20[TOKEN REMOVED]&X-ConnectionId=D036C7D0C875473BA8E0FB9CA8D9CCED. WebsocketMessageAdapter.ts:118:43
2024-04-15T04:01:25.121Z | ConnectionErrorEvent | privName: ConnectionErrorEvent | privEventId: C6ECE4003B464B4D836B30CCF0B7D14E | privEventTime: 2024-04-15T04:01:25.121Z | privEventType: 0 | privMetadata: {} | privConnectionId: D036C7D0C875473BA8E0FB9CA8D9CCED | privMessage: <NULL> | privType: error instrument.ts:132
2024-04-15T04:01:25.123Z | AudioStreamNodeDetachedEvent | privName: AudioStreamNodeDetachedEvent | privEventId: 42D565A34B8F4E01A78DC0C7AD8D59A3 | privEventTime: 2024-04-15T04:01:25.123Z | privEventType: 1 | privMetadata: {} | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E | privAudioNodeId: 88BF53AB042F4A24891EEB598A0124E6 instrument.ts:132
2024-04-15T04:01:25.124Z | AudioSourceOffEvent | privName: AudioSourceOffEvent | privEventId: BB702A1831174A8A89652A33ADB5F3DE | privEventTime: 2024-04-15T04:01:25.124Z | privEventType: 1 | privMetadata: {} | privAudioSourceId: 850C5B4A81DE4A7DAC7111F962911E3E instrument.ts:132
CANCELED: Reason=0 instrument.ts:132
CANCELED: ErrorCode=4 instrument.ts:132
CANCELED: ErrorDetails=Unable to contact server. StatusCode: 1006, undefined Reason:  undefined instrument.ts:132

magic-maker commented 2 months ago

I have encountered the same issue. It arises because the latest version of Firefox (124) enables HTTP/2 by default for WebSocket connections. The Azure Cognitive Services API servers are unable to process such a request properly and respond with a 404, so the connection is never upgraded to a WebSocket connection.

To test this, ensure that the parameter 'network.http.http2.websockets' is set to 'true' in about:config. This is the new default setting which is causing the issue. It's important to note that previously installed versions of Firefox retain this setting as 'false', even after updates.
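
As a possible client-side workaround while testing (an assumption based on the description above, not an official fix), the same preference can be pinned to `false` in a `user.js` file in the Firefox profile directory:

```javascript
// user.js in the Firefox profile directory.
// Assumed workaround: keeps WebSocket handshakes off HTTP/2, which is the
// value the comment above says older installs retain.
user_pref("network.http.http2.websockets", false);
```

This only affects the local browser, so it helps confirm the diagnosis but does not help end users of a site.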

For your information, this issue does not occur on Chrome on the same machine; it is specific to Firefox and is not related to subscription, authentication, or similar factors.
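
To check the handshake independently of the SDK, a small probe like the one below can be run from the browser console. It is a sketch: `probeWebSocket` is a hypothetical helper, and the `WebSocketImpl` parameter exists only so the function can be exercised outside a browser. Note that the WebSocket API does not expose the HTTP status of a failed handshake, which is why the SDK reports 1006 while the 404 is visible only in the network panel.

```javascript
// Hypothetical helper: opens a WebSocket and reports how the attempt ended.
// `WebSocketImpl` is injectable so the sketch can run outside a browser.
function probeWebSocket(url, WebSocketImpl = globalThis.WebSocket) {
  return new Promise((resolve) => {
    const socket = new WebSocketImpl(url);
    let opened = false;

    socket.onopen = () => {
      opened = true;
      socket.close(); // handshake succeeded; nothing more to test
    };

    // Fires for both clean closes and failed handshakes.
    socket.onclose = (event) => {
      resolve({ opened, code: event.code, reason: event.reason });
    };
  });
}
```

For example, `probeWebSocket(url).then(console.log)` with the `wss://` URL from the debug log should report `opened: false` and `code: 1006` when the upgrade fails.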

claptimes5 commented 2 months ago

Thank you @magic-maker that solves my issue with Firefox. Perhaps I'm having a different problem with Safari.

I'm guessing Azure will have to make a change to support this new behavior?

glharper commented 2 months ago

> Thank you @magic-maker that solves my issue with Firefox. Perhaps I'm having a different problem with Safari.
>
> I'm guessing Azure will have to make a change to support this new behavior?

Yes, I will let the service team know, and, if corrective actions aren't taken, open an IcM.

@magic-maker, thank you for relaying that information!

glharper commented 2 months ago

Update: The service team is currently testing a fix. Approval and deployment are TBD, but probably by the end of April at the latest.

shareefalis commented 2 months ago

@glharper Any update on the service fix? We are impacted by this bug.

glharper commented 2 months ago

> @glharper Any update on the service fix? We are impacted by this bug.

This appears to be deployed now. I just tested on FF with HTTP/2 WebSockets on, and speech recognition works.

shareefalis commented 1 month ago

There is a regression described in #822

kunom commented 1 month ago

@glharper Where do you see this deployed? https://www.npmjs.com/package/microsoft-cognitiveservices-speech-sdk?activeTab=versions, which is referenced in the SDK documentation, is still at 1.36.0, which was released 3 months ago.

glharper commented 1 month ago

> @glharper Where do you see this deployed? npmjs.com/package/microsoft-cognitiveservices-speech-sdk?activeTab=versions, which is referenced in the SDK documentation, is still at 1.36.0, which was released 3 months ago.

@kunom This was a service-side fix; no code was changed in the JS Speech SDK.