redis / lettuce

Advanced Java Redis client for thread-safe sync, async, and reactive usage. Supports Cluster, Sentinel, Pipelining, and codecs.
https://lettuce.io
MIT License
5.39k stars 968 forks source link

`resubscribe` logic looks ignoring sharded Pub/Sub (`shardChannels` Set of `PubSubEndpoint`) #2940

Open saiya opened 2 months ago

saiya commented 2 months ago

Bug Report

Thank you for adding support for sharded PubSub in the 6.4.0. Related to that implementation, I would like to confirm about resubscribe (subscription recovery after reconnection) behavior. It looks like resubscribe handling only non-sharded PubSub subscriptions.

Current Behavior

Sharded PubSub subscriptions looks not recovered after disconnection + re-connection, despite Lettuce automatically recovers non-sharded PubSub subscriptions (the recovery behavior is very helpful for applications, and also difficult / error-prone to implement by ourselves).

In the https://github.com/redis/lettuce/commit/1457486165edb8e7184c0214dae9973c7f6ff567#diff-73f0c5ac0f7d6d94f8d47e1a75564aa344cbbbd4c91ce786711b5c3673044956, PubSubEndpoint.shardChannels (private) have been added but it looks that variable is private inside PubSubEndpoint and no any code of PubSubEndpoint reading that variable's data. Also implementations of StatefulRedisPubSubConnectionImpl.resubscribe and StatefulRedisClusterPubSubConnectionImpl.resubscribe does not accessing shardChannels.

Input Code

Just make sharded Pub/Sub connection, and wait until some disconnection + re-connection occurs on that environment...

Input Code ```java StatefulRedisPubSubConnection connection = client.connectPubSub() connection.addListener(new RedisPubSubAdapter() { ... }) RedisPubSubCommands sync = connection.sync(); sync.ssubscribe("channel"); // Sharded subscription ```

Expected behavior/code

As same as non-sharded Pub/Sub, automatically recover sharded Pub/Sub subscriptions when possible to connect to that shard/node. ( May be a bit different logic from non-sharded Pub/Sub needed. Because the SSUBSCRIBE requires living connection to the particular shard's node(s) )

Environment

Possible Solution

(Not tested) Watch the connection's state somehow and manually call ssubscribe again by application's logic.

Additional context

Because currently I don't have full access to a target environment / Redis cluster, this issue is based on best guess by reading code. I hope this code reading makes some sense...

saiya commented 2 months ago

(Just related topic) Also Javadoc of RedisClusterClient#connectPubSub and StatefulRedisClusterPubSubConnection are bit conflicting each other. I assume the StatefulRedisClusterPubSubConnection one is correct:

RedisClusterClient#connectPubSub:

 * <li>Pub/sub commands are sent to the node with the least number of clients</li>

StatefulRedisClusterPubSubConnection:

The connection provides transparent command routing based on the first command key.

↑ By reading code, this one (routing based on first command key) is the truth and that is why ssubscribe (but also subscribe, despite it can be sent to any shard/node) implementation always sends command to the key slot's shard/node

tishun commented 2 months ago

Hey @saiya , thanks for the report, I think you are correct and sharded subscriptions are not resubscribed right now. I will put this in our backlog so we can address it as soon as time permits.

tishun commented 2 months ago

By reading code, this one (routing based on first command key) is the truth and that is why ssubscribe (but also subscribe, despite it can be sent to any shard/node) implementation always sends command to the key slot's shard/node

Both, to the best of my own understanding:

Does that make sense?

saiya commented 2 months ago

Thank you for the agile response! I will await for the fix.

saiya commented 2 months ago

Both, to the best of my own understanding:

  • channel names are keys and the command will be send to a node handling the slot that handles this key
  • in case more than one shard is handling this slot (replica nodes) we subscribe to the one with the least number of clients Does that make sense?

I see; yes indeed that make sense.

Now I understand the RedisClusterClient#connectPubSub document's intention; Another list item saying <li>Keyless commands are send to the default connection</li>, therefore that list item talking about not-keyless command implictly.