anycable / anycable-go

AnyCable real-time server
https://anycable.io
MIT License
363 stars 65 forks source link

TurboDrive/TurboStream Subscribe/Unsubscribe Race Condition #148

Closed tleish closed 2 years ago

tleish commented 2 years ago

SUMMARY

I've noticed a Subscribe/Unsubscribe race condition with navigating pages using TurboDrive. When leaving a page and going to a new one, an unsubscribe/subscribe event is sent. However, these are sent to the server at the same timestamp in the browser logs.

DETAILS

anycable-go logs sometimes show the following Unsubscribed/Subscribed events:

OK SCENARIO

BROWSER

image

ANYCABLE-GO

20:13:23.860Z context=node sid=... Incoming message: &{unsubscribe {"channel":"Turbo::StreamsChannel"......"} <nil>}
20:13:23.860Z context=node sid=... Unsubscribed from channel: {"channel":"Turbo::StreamsChannel"......"}
20:13:23.860Z channel={"channel":"Turbo::StreamsChannel"......"} context=hub sid=... Unsubscribed

20:13:23.861Z context=node sid=... Incoming message: &{subscribe {"channel":"Turbo::StreamsChannel"......"} <nil>}
20:13:23.861Z context=turbo identifier={"channel":"Turbo::StreamsChannel"......"} verified stream: my_stream
20:13:23.861Z context=node sid=... Subscribed to channel: {"channel":"Turbo::StreamsChannel"......"}
20:13:23.861Z channel={"channel":"Turbo::StreamsChannel"......"} context=hub sid=... stream=my_stream Subscribed

In the above example, the channel is correctly unsubscribed, and then subscribed (last line).

FAIL SCENARIO

Sometimes all the events show in a different order with "Unsubscribe" as last.

BROWSER image

ANYCABLE-GO

20:17:09.463Z context=node sid=... Incoming message: &{unsubscribe {"channel":"Turbo::StreamsChannel"......"} <nil>}
20:17:09.464Z context=node sid=... Unsubscribed from channel: {"channel":"Turbo::StreamsChannel"......"}

20:17:09.464Z context=node sid=... Incoming message: &{subscribe {"channel":"Turbo::StreamsChannel"......"} <nil>}
20:17:09.464Z context=turbo identifier={"channel":"Turbo::StreamsChannel"......"} verified stream: my_stream
20:17:09.464Z context=node sid=... Subscribed to channel: {"channel":"Turbo::StreamsChannel"......"}
20:17:09.465Z channel={"channel":"Turbo::StreamsChannel"......"} context=hub sid=... stream=my_stream Subscribed

20:17:09.466Z channel={"channel":"Turbo::StreamsChannel"......"} context=hub sid=... Unsubscribed

In the first set of logs the Unsubscribed event logged as the 3rd item, but in this last case it comes last. When I see the Unsubscribed last in the log after navigating to a page using TurboDrive, the browser dev tools show the WS still connected, but anycable-go logs indicate "No sessions":

20:00:01.206Z context=pubsub Incoming pubsub message from Redis: {"stream":"my_stream","data":"\"\\u003cturbo-stream action=\\\"append\\\"   ..."}
20:00:01.206Z context=node Incoming pubsub message: &{my_stream "\u003cturbo-stream action=\"append\"   ..."}
20:00:01.206Z context=hub stream=my_stream Broadcast message: "\u003cturbo-stream action=\"append\"   ..."
20:00:01.206Z context=hub stream=my_stream No sessions  

SIMILAR ACTION CABLE CLIENT ISSUE

It might be the same race condition reported/resolved here: ActionCable client ensures subscribe command is confirmed ?

SETUP

palkan commented 2 years ago

Thanks for the report! Looks at the logs, I could say that this is the anycable-go related issue (that's why I transferred it).

Could you please tell me which anycable-go version do you use?

When leaving a page and going to a new one, an unsubscribe/subscribe event is sent

Interesting; shouldn't Turbo preserve streams during navigation, in case we subscribe to the same one? That would make sense. (Though the question for the different project 🙂).

tleish commented 2 years ago

I wondered if this might the server or client.

We are using anycable-go 1.2.0

Shouldn't Turbo preserve streams during navigation, in case we subscribe to the same one? That would make sense.

That’s what I would’ve thought, but if we are using the anycable js client, wouldn’t it manage the stream instead of turbo?

Our application page load times is fairly fast, so perhaps this is what revealed the race condition (a short time between disconnect and connect)?

palkan commented 2 years ago

Our application page load times is fairly fast, so perhaps this is what revealed the race condition

Yeah, it seems.

if we are using the anycable js client, wouldn’t it manage the stream instead of turbo?

It would; but our stream source element is a copy of the Turbo one. And it calls unsubscribe as soon as DOM element is detached. Actually, we can fix it at the client-side (by using channels cache and postponing unsubscribe 'till the next turbo load event or something like that). But still the problem in the server is likely exist (I need to confirm this).

tleish commented 2 years ago

That makes sense why id disconnects/reconnects each time. Not sure what additional overhead this causes.

tleish commented 2 years ago

We reverted our production environment back to ActionCable. Any updates or something I can do to help?

jwako commented 2 years ago

I don't know if it's related, but I've got the same situation. In my case, I don't use TurboDrive.

I could reproduce the same situation with the code below.

class BadgeNotificationChannel < ApplicationCable::Channel
  def subscribed; end

  def unsubscribed; end

  def follow(data)
    stop_all_streams
    stream_from room_key(data)
  end

  def unfollow
    stop_all_streams
  end

  private

  def room_key(data)
    "badges:#{data['channel_id']}"
  end
end

With this code, anycable-go 1.0 is no problem. But when I upgrade to 1.1, it happens and 1.2 also.

When I just delete stop_all_streams in follow method, the problem no longer occurs with anycable-go 1.1 and 1.2. I hope this helps.

palkan commented 2 years ago

Hey @jwako!

What is your client code? Do you perform follow multiple times? And what is "the problem"?

palkan commented 2 years ago

We reverted our production environment back to ActionCable. Any updates or something I can do to help?

Did you switch to Action Cable client or server?

Action Cable server has the same problem with handling incoming messages asynchronously (and concurrently) (as its mentioned in the issue you linked in the description).

I was thinking about handling this server-side, but it couldn't be solved completely without some client-side improvements.

tleish commented 2 years ago

Did you switch to Action Cable client or server?

We switched both server/client back to ActionCable. We didn't think it safe to use AnyCable server with an ActionCable client.

In addition, we have concerns with the Hotwire code inside the AnyCable client getting out of sync. I watched the video and understand the limitations of a plugin approach, but I would prefer to use the main Hotwire libraries with an AnyCable plugin.

I was thinking about handling this server-side, but it couldn't be solved completely without some client-side improvements.

I believe this is why the ActionCable solution was client side.

jwako commented 2 years ago

What is your client code?

I use @rails/actioncable 6.1.4-1 and my client code is below:

      this.badgeChannel = this.consumer.subscriptions.create(this.channelName, {
        connected: () => {
          this.badgeChannel?.perform('follow', {
            channel_id: `${account_id}_${user_id}`
          });
        }
...
      });

Do you perform follow multiple times?

No.

And what is "the problem"?

The problem is the order of Unsubscribed/Subscribed events differs.

When follow calls in my code BadgeNotificationChannel, stop_all_streams should calls Unsubscribe, and stream_from should calls Subscribe. But actually when follow calls, somehow Subscribe calls, and then Unsubscribe calls at last. So I cannot send any badge notification because it is unsubscribed.

anycable-go 1.0 is OK (the order of events is Unsubscribe -> Subscribe). But when I upgraded to 1.1, the order of events is Subscribe -> Unsubscribe even though it is the same code. I could reproduce the different orders 100% with anycable-go 1.1 and 1.2.

palkan commented 2 years ago

We didn't think it safe to use AnyCable server with an ActionCable client.

It's absolutely safe; both clients use the same protocol.

I believe this is why the ActionCable solution was client side.

Yeah, I agree. I'm working on a fix for anycable-client right now.

palkan commented 2 years ago

@jwako Thanks!

That's the problem I was thinking about in the first place (and why I moved the issue here), and it's a server-side race condition. The fix is also on my plate for this week.

tleish commented 2 years ago

It's absolutely safe; both clients use the same protocol.

Can we still use the JWT identification hot-stream without an RPC server approach using the ActionCable client?

tleish commented 2 years ago

Figured out the problem and ActionCable client appears to be working with AnyCable JWT identification.

palkan commented 2 years ago

Can we still use the JWT identification hot-stream without an RPC server approach using the ActionCable client?

Yes, you can. The benefit of using anycable-client is that it allows you to automatically refresh tokens on expiration (see this screencast episode).

palkan commented 2 years ago

The server-side race condition has been fixed and released in 1.2.1. /cc @jwako

Closing this one.

Client-side issue has been moved here: https://github.com/anycable/anycable-client/issues/15 /cc @tleish