aws / amazon-chime-sdk-js

A JavaScript client library for integrating multi-party communications powered by the Amazon Chime service.
Apache License 2.0

Chime's reconnect behaviour causing AudioJoinedFromAnotherDevice? #2784

Open brondibur opened 8 months ago

brondibur commented 8 months ago

What happened and what did you expect to happen?

Got AudioJoinedFromAnotherDevice errors. It seems to occur when we have SignalChannelClosedUnexpectedly errors due to network issues and Chime tries to reconnect.

Have you reviewed our existing documentation?

Reproduction steps

Not sure how to reproduce, but we're seeing SignalChannelClosedUnexpectedly errors due to network issues. Chime tries to reconnect but seems to sometimes cause AudioJoinedFromAnotherDevice errors. In the attached logs, it occurs at 14:08. I can share more examples as well.

Amazon Chime SDK for JavaScript version

3.18.0

What browsers are you seeing the problem on?

Google Chrome

Browser version

117.0.0

Meeting and Attendee ID Information.

Meeting ID: 862fffe2-c523-4c0b-8925-081764852713 Attendee IDs:

Browser console logs

ChimeMeetingLogs2.csv

brondibur commented 8 months ago

What happened and what did you expect to happen?

Got AudioJoinedFromAnotherDevice errors on Electron as well. Also, the meeting didn't end after this error, and we continued to get the error multiple times.

Have you reviewed our existing documentation?

Reproduction steps

The meeting started at 6:28 AM UTC, and the AudioJoinedFromAnotherDevice error was first received at 6:39 AM UTC. The device was locked during this time, so there are lots of SignalChannelClosedUnexpectedly errors, but it isn't clear why the attendees rejoined. Could Chime have somehow made the attendees rejoin due to network errors, given that the device wasn't open at this time?

Amazon Chime SDK for JavaScript version

3.18.0

What browsers are you seeing the problem on?

Electron

Browser version

26.2.1

Meeting and Attendee ID Information.

Meeting ID: 64dbad5e-39ac-4915-b393-d391610c2713 Attendee IDs:

Browser console logs

ChimeMeetingLogs.csv

brondibur commented 8 months ago

@hensmi-amazon @pracheth @dinmin-amzn Can someone please look into this?

Yadukrish commented 7 months ago

Any update on this issue? We are experiencing the same issue.

PaulGobin commented 7 months ago

Hey team, our customers are experiencing this issue quite often and support tickets keep coming in. Any ETA or an emergency patch for this would be much appreciated.

Yadukrish commented 7 months ago

We get this issue after being reconnected to the meeting (after a timeout of more than 15 seconds, I think).

The issue only occurs for SDK versions > 16.0.0.

hensmi-amazon commented 7 months ago

Sorry for the delay. SignalChannelClosedUnexpectedly was a new event added in 17.x to make the client reconnect faster in case of irrecoverable network disconnection. AudioJoinedFromAnotherDevice can occur if the connection occurs too quickly before the other, but I'm not sure how that would reach the client since its connection was closed already. I think there may be some edge cases during long network disconnections to iron out.

For now I will just revert that change so I can get it out in a release we have planned this week.

Since I wish to eventually reroll that commit, can you clarify the experience? Were clients disconnected longer than they expected, or were they just receiving that event unexpectedly? Did this occur while content was being shared? While video was being shared? I can see some of this in the provided logs, but any more patterns would be helpful.
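For anyone gathering the patterns asked about above, here is a minimal sketch of one way to correlate the two events in exported logs. It assumes the log has already been parsed into entries of the hypothetical shape `{ timestampMs, message }` (the actual CSV columns are not shown in this thread), and `correlateRejoinErrors` and `windowMs` are illustrative names, not part of the SDK.

```javascript
// Event names as they appear in this thread.
const CLOSED = 'SignalChannelClosedUnexpectedly';
const JOINED_ELSEWHERE = 'AudioJoinedFromAnotherDevice';

/**
 * For each AudioJoinedFromAnotherDevice entry, report how long ago the most
 * recent SignalChannelClosedUnexpectedly entry was seen.
 *
 * entries:  array of { timestampMs: number, message: string } (assumed shape)
 * windowMs: how recent the close must be to count as "preceded by a close"
 */
function correlateRejoinErrors(entries, windowMs = 60000) {
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...entries].sort((a, b) => a.timestampMs - b.timestampMs);
  const results = [];
  let lastClosedMs = null;

  for (const entry of sorted) {
    if (entry.message.includes(CLOSED)) {
      lastClosedMs = entry.timestampMs;
    } else if (entry.message.includes(JOINED_ELSEWHERE)) {
      const gapMs = lastClosedMs === null ? null : entry.timestampMs - lastClosedMs;
      results.push({
        timestampMs: entry.timestampMs,
        gapMs, // null when no close event was seen earlier in the log
        precededByClose: gapMs !== null && gapMs <= windowMs,
      });
    }
  }
  return results;
}
```

The `gapMs` values can then be compared across meetings to check whether the error clusters around a particular disconnection length.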

brondibur commented 7 months ago

@hensmi-amazon Our clients share audio, video, and content. The issue occurs regularly, and can be caused by network errors due to actual connectivity issues or by users locking their device mid-meeting. Basically anything that causes WebSocketClosed and triggers Chime's retries. This isn't reproducible at will, but I was able to reproduce it sometimes by briefly turning off Wi-Fi during a meeting. I can share more logs if needed.
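As a rough illustration of how an application might count how often each of these stop reasons shows up per meeting, here is a small sketch. `classifyStop` and `createStopTracker` are hypothetical helpers, not SDK functions; they work on the status name as a plain string so the snippet runs without the SDK.

```javascript
// Map the status names discussed in this thread to coarse categories.
// Any other name falls through to 'other'.
function classifyStop(statusName) {
  switch (statusName) {
    case 'AudioJoinedFromAnotherDevice':
      return 'joined-elsewhere';
    case 'SignalChannelClosedUnexpectedly':
      return 'signal-channel-closed';
    default:
      return 'other';
  }
}

// Hypothetical per-meeting tracker that counts each category.
function createStopTracker() {
  const counts = new Map();
  return {
    // Record one occurrence and return the category it was filed under.
    record(statusName) {
      const kind = classifyStop(statusName);
      counts.set(kind, (counts.get(kind) || 0) + 1);
      return kind;
    },
    // Number of occurrences recorded for a category (0 if none).
    count(kind) {
      return counts.get(kind) || 0;
    },
  };
}
```

In an app using the SDK, one possible wiring (an assumption, not something confirmed in this thread) is to call `tracker.record(MeetingSessionStatusCode[sessionStatus.statusCode()])` from an `audioVideoDidStop(sessionStatus)` observer callback, then attach the counts to bug reports.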

Yadukrish commented 7 months ago

Usually this happens when clients are disconnected for more than 15-20 seconds. It occurs even without content sharing and video. A single audio attendee is enough to reproduce the issue.

shohei-nozaki commented 7 months ago

There is a simple method to reproduce this issue. You can use the Network Link Conditioner, an extension component in Xcode. Start a call and then change the setting to Very Bad Network. Within 15-30 seconds, switch to Wi-Fi and wait for a few lines. By doing this, the issue can be reproduced.
