aws / amazon-chime-sdk-js

A JavaScript client library for integrating multi-party communications powered by the Amazon Chime service.
Apache License 2.0

Unable to catch failed websocket connections #2720

Open kelvin2200 opened 1 year ago

kelvin2200 commented 1 year ago

What happened and what did you expect to happen?

Some clients sit behind custom firewall and/or antivirus configurations. In those cases, when the Chime endpoints need to be whitelisted, the first thing that fails (among others) is the connection to the Chime signaling service. While there are hacks and workarounds to catch that error in the browser, it would be nice for the SDK to expose a listener we can use to detect that specific scenario and show the user a custom message.

Maybe such a listener can already be added and we just haven't found out how. Please correct me if I am wrong.

Have you reviewed our existing documentation?

Reproduction steps

Add these 2 entries to /etc/hosts (Linux):

127.0.0.1 signal.m2.ec1.app.chime.aws
127.0.0.1 data.svc.ue1.ingest.chime.aws

Then start a meeting.

Amazon Chime SDK for JavaScript version

3.15.0

What browsers are you seeing the problem on?

all

Browser version

all

Meeting and Attendee ID Information.

No response

Browser console logs

[WARN] - Chime: stopped pinging (WebSocketFailed)
[WARN] - Chime: will retry due to status code TaskFailed and error: serial group task AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690 was canceled due to subtask AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690/Timeout15000ms error: serial group task AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690/Timeout15000ms/Peer was canceled due to subtask AudioVideoStart/8bcbfa41-73d5-468e-9658-a3ae158d7979/d9cbd654-f27e-b7d2-ae94-5670c2a2c690/Timeout15000ms/Peer/SubscribeAndReceiveSubscribeAckTask (once) error: serial group task Signaling was canceled due to subtask Signaling/Timeout15000ms (once) error: WebSocket connection failed

WebSocket connection to 'wss://signal.m2.ec1.app.chime.aws/control/8bcbfa41-73d5-468e-9658-a3ae158d7979?X-Chime-Control-Protocol-Version=3&X-Amzn-Chime-Send-Close-On-Error=1&X-Amzn-Version=3.14.1&X-Amzn-User-Agent=chrome-114' failed:
create @ DefaultWebSocketAdapter.ts:15
serviceConnectionRequestQueue @ DefaultSignalingClient.ts:374
21:35:45.053 ConsoleLogger.ts:79 2023-07-27T18:35:45.053Z [ERROR] Chime - failed to connect

michhyun1 commented 1 year ago

Right now I think the preferred way to listen to failure events on the meeting is via the AudioVideoObserver interface: https://aws.github.io/amazon-chime-sdk-js/interfaces/audiovideoobserver.html

We also have the metricsDidReceive observer callback, which you should check out: https://github.com/aws/amazon-chime-sdk-js/blob/main/guides/17_Migration_to_3_0.md#:~:text=const%20observer%20%3D%20%7B%0A%20%20oldSendBandwidthKbs,%7D%0A%20%20%7D%2C%0A%7D%3B

kelvin2200 commented 1 year ago

@michhyun1 OK, we know about the AudioVideoObserver and the client metrics, but:

  1. a user may not have a video device at all
  2. the observers behave the same way when the connection is merely poor and there is packet loss

What we need is something that says specifically that the client cannot connect to the WebSocket (WS).
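
One partial workaround for point 2, sketched under stated assumptions: track whether the session ever reached the started state, so a stop that arrives before any start can be told apart from a drop on an established (possibly lossy) connection. The method names audioVideoDidStart and audioVideoDidStop mirror the AudioVideoObserver callbacks; ConnectPhaseTracker, StopKind and SessionStatusLike are hypothetical names made up for this sketch, and it does not by itself prove that the WebSocket was the part that failed.

```typescript
// Minimal local stand-in for the SDK's MeetingSessionStatus (assumption:
// only statusCode() is needed here).
interface SessionStatusLike {
  statusCode(): number;
}

type StopKind = 'never-connected' | 'dropped-after-connect';

// Hypothetical helper, not part of the SDK. Its method names match the
// AudioVideoObserver callbacks so it can be registered as an observer.
class ConnectPhaseTracker {
  private everStarted = false;
  private readonly onStop: (kind: StopKind, statusCode: number) => void;

  constructor(onStop: (kind: StopKind, statusCode: number) => void) {
    this.onStop = onStop;
  }

  // Called once the session is up; from here on, a stop is a "drop".
  audioVideoDidStart(): void {
    this.everStarted = true;
  }

  // Called when the session stops for any reason, including exhausted retries.
  audioVideoDidStop(sessionStatus: SessionStatusLike): void {
    const kind: StopKind = this.everStarted ? 'dropped-after-connect' : 'never-connected';
    // Reset so that a later start() on the same session is judged afresh.
    this.everStarted = false;
    this.onStop(kind, sessionStatus.statusCode());
  }
}
```

Assuming a meeting session is at hand, the tracker would be registered with meetingSession.audioVideo.addObserver(tracker). A 'never-connected' result only narrows things down to "the start sequence failed"; it is a hint for showing a firewall message, not a diagnosis.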

michhyun1 commented 1 year ago

I think the closest thing we have to that is https://aws.amazon.com/blogs/business-productivity/monitoring-and-troubleshooting-with-amazon-chime-sdk-meeting-events/

We throw a TaskFailed meeting event when we are unable to connect to the WS.

However, TaskFailed can mean multiple things, not just a WS connection failure. I'm not 100% sure whether a meeting event carries metadata that shows if the failure was caused by the OpenSignalingTask failing as opposed to some other task failing.
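
As a rough illustration of narrowing TaskFailed down, the sketch below checks a failed meeting event for the WebSocket wording seen in the console logs earlier in this thread. It assumes the event attributes expose meetingStatus and meetingErrorMessage and that the error message contains the same text as the log; neither is confirmed here. EventAttributesLike, WEBSOCKET_FAILURE_HINTS and looksLikeSignalingFailure are hypothetical names, and matching on message text is fragile across SDK versions.

```typescript
// Local stand-in for the subset of meeting event attributes used here
// (assumption: the SDK passes fields with these names to eventDidReceive).
interface EventAttributesLike {
  meetingStatus?: string;
  meetingErrorMessage?: string;
}

// Substrings taken from the console logs in the original report.
const WEBSOCKET_FAILURE_HINTS: string[] = ['WebSocket connection failed', 'WebSocketFailed'];

// Hypothetical helper: returns true only for a failed meeting event whose
// status is TaskFailed and whose message mentions a WebSocket failure.
function looksLikeSignalingFailure(eventName: string, attributes: EventAttributesLike): boolean {
  const isFailureEvent = eventName === 'meetingStartFailed' || eventName === 'meetingFailed';
  if (!isFailureEvent || attributes.meetingStatus !== 'TaskFailed') {
    return false;
  }
  const message = attributes.meetingErrorMessage ?? '';
  return WEBSOCKET_FAILURE_HINTS.some((hint) => message.includes(hint));
}
```

It would be called from an eventDidReceive(name, attributes) observer callback. If meetingErrorMessage turns out to be empty or worded differently in a given SDK version, the function simply returns false, so the caller should fall back to a generic error message in that case.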

lvillacin commented 3 months ago

Hi! As I understand from the above conversation, there isn't a sure-fire way to identify whether the WS is blocked.

@kelvin2200 would you be able to provide some help/guidance on how to whitelist the AWS Chime SDK in a network?

I found this article, but I'm a little unsure about the steps we have to take: https://docs.aws.amazon.com/chime-sdk/latest/dg/network-config.html#media-signaling

We don't handle the network configuration ourselves and, frankly, I am a bit unfamiliar with it. I was wondering what we can provide to the network team, so a little push in the right direction would be extremely helpful.

Also, @michhyun1, how would we catch TaskFailed using the React Components library?

Thank you!