GetStream / stream-chat-react

React Chat SDK ➜ Stream Chat 💬
https://getstream.io/chat/sdk/react/
Other
698 stars 272 forks source link

bug: whole web page occasionally crashes if the user is added to a new channel and immediately receives a message from that channel. #2474

Open linhewinner opened 4 weeks ago

linhewinner commented 4 weeks ago

Describe the bug

The whole web page occasionally crashes if the user is added to a new channel and immediately receives a message from that channel.

To Reproduce

Steps to reproduce the behavior:

  1. Have ChannelList rendered on the web page
  2. Add the user to a chat channel; and immediately send a message into that channel.
  3. About 10% of the time, the user would see the whole web page crash.

Please read my "Additional Context" to understand what's going on and why this is only a 10% repro.

Expected behavior The web page should not crash.

Package version

Additional context

I know that if I reported a bug that only reproduces 10% of the time, my bug report would probably get buried. I took it upon myself to debug the issue and found the root cause. I will explain my findings and suggest a pretty obvious fix. I hope this can speed up Stream prioritizing this bug.

The stack trace of the crash:

Error: Channel messaging:clzvqpw4k005801s6f8y68lm0 hasn't been initialized yet. Make sure to call .watch() and wait for it to resolve
    at Channel._checkInitialized (browser.es.js:3895:15)
    at Channel.muteStatus (browser.es.js:2674:12)
    at useIsChannelMuted (useIsChannelMuted.js:12:68)
    at ChannelPreview (ChannelPreview.js:55:90)
    at renderWithHooks (react-dom.development.js:2719:157)
    at mountIndeterminateComponent (react-dom.development.js:3293:1445)
    at beginWork (react-dom.development.js:3632:93)
    at HTMLUnknownElement.callCallback (react-dom.development.js:730:119)
    at HTMLUnknownElement.sentryWrapped (helpers.js:90:17)
    at Object.invokeGuardedCallbackDev (react-dom.development.js:750:45)
    at invokeGuardedCallback (react-dom.development.js:771:126)
    at beginWork$1 (react-dom.development.js:4698:1)
    at performUnitOfWork (react-dom.development.js:4514:150)
    at workLoopSync (react-dom.development.js:4500:30)
    at renderRootSync (react-dom.development.js:4496:159)
    at recoverFromConcurrentError (react-dom.development.js:4375:170)
    at performConcurrentWorkOnRoot (react-dom.development.js:4339:126)
    at workLoop (scheduler.development.js:227:38)
    at flushWork (scheduler.development.js:205:18)
    at MessagePort.performWorkUntilDeadline (scheduler.development.js:442:25)

How the code is supposed to work:

  1. ChannelList listens to "notification.added_to_channel" event via useNotificationAddedToChannelListener, and also to "message.new" event via useMessageNewListener.
  2. "notification.added_to_channel" event fires.getChannel will call .watch() on the channel.
  3. The .watch() call eventually resolves, and then the watched channel is added to the ChannelList to render (https://github.com/GetStream/stream-chat-react/blob/afcd40fbd95cef7ec1237babb40a3460926618a8/src/components/ChannelList/hooks/useNotificationAddedToChannelListener.ts#L29)
  4. "message.new" event fires. the channel is added directly to the ChannelList to render. There is no call to .watch (https://github.com/GetStream/stream-chat-react/blob/afcd40fbd95cef7ec1237babb40a3460926618a8/src/components/ChannelList/hooks/useMessageNewListener.ts#L34-L35)

90% of the time the events happen in the order of 1, 2, 3, 4. There is no crash because in step 4, the channel is already watched (aka initialized). But 10% of the time, the events happen in the order of 1, 2, 4, 3. In 4, an uninitialized channel is added to ChannelList, resulting in the stack trace of crash above.

Now, why would 4 happen before 3? Isn't "message.new" supposed to fire AFTER a channel is watched? The race condition happens because .watch() is asynchronous. Somewhere in the middle of that call, the web socket is established and the "message.new" event comes down BEFORE the whole .watch() is resolved.

Obviously this is a race condition that doesn't happen all the time. It really depends on how soon the message is sent after the user is added to the channel. It also depends on how much time elapses between the establishment of web socket and finishing the rest of .watch() call. In our product, every user goes through a flow in onboarding where they add themselves into a chat channel and we immediately send a message when you are added. About 10% of the times, a user's first impression of our product is a crashed web page ....

A simple fix would be to call getChannel in useMessageNewListener

MartinCupela commented 3 weeks ago

Hey @linhewinner , it would not work if getChannel was called in useMessageNewListener as message.new is not delivered for a channel that is not watched. The WS is established from the beginning of the session and it is true, that WS can be sometimes faster the HTTP request completion.

In your stack trace I see the invocation of useIsChannelMuted hook. This hook requests a channel's mute status as a reaction to notification.channel_mutes_updated event. Do you update mutes on each member addition?

linhewinner commented 3 weeks ago

Hi @MartinCupela

Thank you for the reply.

re: The WS is established from the beginning of the session and it is true, that WS can be sometimes faster the HTTP request completion.

Thanks for the confirmation. Sounds like we are on the same page that this is a plausible race condition.

re: Do you update mutes on each member addition?

No.

re: useIsChannelMuted ... requests a channel's mute status as a reaction to notification.channel_mutes_updated event

This statement is not accurate according to my reading/debugging of the code. This hook is used here in ChannelPreview unconditionally (as hooks must be called unconditionally): https://github.com/GetStream/stream-chat-react/blob/afcd40fbd95cef7ec1237babb40a3460926618a8/src/components/ChannelPreview/ChannelPreview.tsx#L86 Every time a new channel is added to ChannelList, the rendering code would call useIsChannelMuted unconditionally.

re: it would not work if getChannel was called in useMessageNewListener as message.new is not delivered for a channel that is not watched

Not sure if I follow your logic. See the following attached screenshot. As a matter of fact, I use getChannel inside the handler, and it works like a charm. As a workaround, in our codebase I override the onMessageNewHandler callback. I call getChannel, which perfectly avoids the race condition (I used console.log to verify inside this handler that race condition induced crash would have happened if I hadn't called getChannel)

Screenshot 2024-08-21 at 10 29 03 AM