jiajunsu opened 1 month ago
Here is the modification to reproduce this issue.
https://github.com/redis/lettuce/commit/01f96a677b9036bfd04afb8c392c7f32cc1f49c4
Hey @jiajunsu ,
Thanks for the verbose analysis. Before we jump into the way the Lettuce driver works, let's first analyse the scenario and environment you are working with, and why sentinel returns an empty list of nodes in the first place.
How many sentinel processes are you using to test this scenario? Can you describe your complete environment?
@tishun
With 2 sentinels, 1 upstream and 1 replica node running.
The empty list of nodes is not replied by sentinel. I've tested that possibility and found that an empty list could not be returned by sentinel (even if sentinel did respond with an empty list, the lettuce code could not reach the WARN log `Topology refresh returned no nodes`).
```java
public Mono<List<RedisNodeDescription>> getNodes(RedisURI seed) {

    CompletableFuture<List<RedisNodeDescription>> future = topologyProvider.getNodesAsync();

    Mono<List<RedisNodeDescription>> initialNodes = Mono.fromFuture(future).doOnNext(nodes -> {
        applyAuthenticationCredentials(nodes, seed);
    });

    return initialNodes.map(this::getConnections) // if initialNodes is empty, map `getConnections` will be skipped
            .flatMap(asyncConnections -> asyncConnections.asMono(seed.getTimeout(), eventExecutors))
            ...
}
```
And the empty list was made from `connections.requestPing()`. The detail is:
- sentinel sends `+sdown` to the lettuce client.
- lettuce tries to get the nodes list from sentinel and succeeds.
- lettuce tries to `PING` the nodes, and at the same time the redis nodes are unreachable from the network.
- at last we get an empty list in `SentinelConnector#getTopologyRefreshRunnable`.
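To illustrate the failure mode, here is a minimal, self-contained sketch (simplified model, not the actual lettuce internals; the names `reachableNodes` and `pingFutures` are hypothetical): when none of the `PING` futures has completed at the moment the responses are collected, the filtered node list comes out empty, and that empty list is what later replaces `knownNodes`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Simplified model of the PING-based filtering, using a map of
// node -> PING future (hypothetical names, not the real lettuce classes).
public class PingFilterSketch {

    static List<String> reachableNodes(Map<String, CompletableFuture<String>> pingFutures) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, CompletableFuture<String>> entry : pingFutures.entrySet()) {
            CompletableFuture<String> future = entry.getValue();
            // Same shape as Requests#onEmit: nodes whose PING has not completed are skipped.
            if (future == null || !future.isDone()) {
                continue;
            }
            result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        // Both nodes became unreachable right after the sentinel pushed +sdown,
        // so neither PING completes in time.
        Map<String, CompletableFuture<String>> pings = Map.of(
                "redis1:6482", new CompletableFuture<>(),
                "redis2:6483", new CompletableFuture<>());

        System.out.println(reachableNodes(pings)); // prints [] -> knownNodes would be cleared
    }
}
```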
This could be reproduced by running `MyTest` in my forked repo and commit. Just set up the test environment with `make prepare` and `make start`; the test case uses the sentinels of the lettuce unit-test environment running on ports `26379` and `26380`, while the upstream/replica ports are `6482` and `6483`.
Oh, okay, let me address these one at a time:
> With 2 sentinels, 1 upstream and 1 replica node running.
This is not a recommended setup for Redis Sentinel, as you need at least 3 sentinel nodes to achieve master promotion and avoid split-brain scenarios. See Example 1 and the explanation of why this is a broken setup.
> The empty list of nodes is not replied by sentinel. I've tested that possibility and found that an empty list could not be returned by sentinel (even if sentinel did respond with an empty list, the lettuce code could not reach the WARN log `Topology refresh returned no nodes`).
My bad, I confused topology refresh with master/replica refresh. There are two types of topology refresh (check the `SentinelConnector#connectAsync()` method).
> And the empty list was made from `connections.requestPing()`. The detail is:
> - sentinel sends `+sdown` to the lettuce client.
> - lettuce tries to get the nodes list from sentinel and succeeds.
> - lettuce tries to `PING` the nodes, and at the same time the redis nodes are unreachable from the network.
> - at last we get an empty list in `SentinelConnector#getTopologyRefreshRunnable`.
I think you are simulating an invalid state of the system, which the driver could not (reasonably) handle. The fix in #3019 is also not valid, because we will end up with a list of nodes that we know we can't connect to. At the moment the refresh is happening, there is basically no valid node to connect to, as the masters returned from the `NODES` command are not returning a reply to our `PING`. The only thing the driver could do is wait until the next refresh attempt. What else could it do? If it attempts to send commands to the last known nodes, they are sure to fail, as they are not reachable.
At this point I think this is an invalid scenario to consider.
> This is not a recommended setup for Redis Sentinel, as you need at least 3 sentinel nodes to achieve master promotion and avoid split-brain scenarios. See Example 1 and the explanation of why this is a broken setup.
Sorry for the misleading response. In prod, we do have 3 sentinel nodes. The 2 running sentinels are only in my test env for reproducing this issue.
> At the moment the refresh is happening, there is basically no valid node to connect to, as the masters returned from the `NODES` command are not returning a reply to our `PING`. The only thing the driver could do is wait until the next refresh attempt. What else could it do? If it attempts to send commands to the last known nodes, they are sure to fail, as they are not reachable.
IMHO, a better way could be:
- If some nodes (not all nodes) do not reply to `PING`, we could remove them from `knownNodes`. Possibly those nodes are faulty.
- If all nodes fail to reply to `PING`, we could just do nothing rather than clear the `knownNodes` list. Possibly it's not the redis nodes' problem; maybe the lettuce process node's network is broken. After we clear `knownNodes`, if the sentinel doesn't send any psub message, the lettuce client can never recover.

I just compared the lettuce features between redis-cluster and redis-sentinel; maybe we could support the `enablePeriodicRefresh` feature in sentinel mode, just like cluster mode does. That could handle this scenario in a better way.
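Regarding the first two bullets above, a minimal sketch of the proposed update policy (a hypothetical helper, not lettuce's actual `MasterReplicaConnectionProvider` API): drop only the nodes whose `PING` failed, and keep the previous list untouched when every `PING` failed, since a total failure more likely indicates a client-side network problem.

```java
import java.util.List;

// Hypothetical policy: compute the next knownNodes list from the previous one
// and the set of nodes that answered PING during this refresh.
final class KnownNodesPolicy {

    static <T> List<T> nextKnownNodes(List<T> previousKnownNodes, List<T> nodesThatAnsweredPing) {
        if (nodesThatAnsweredPing.isEmpty()) {
            // All PINGs failed: keep the last known topology instead of clearing it,
            // so the client can recover once the network comes back.
            return previousKnownNodes;
        }
        // Some nodes answered: keep only those, dropping the (possibly faulty) rest.
        return List.copyOf(nodesThatAnsweredPing);
    }

    public static void main(String[] args) {
        List<String> previous = List.of("redis1:6482", "redis2:6483");

        System.out.println(nextKnownNodes(previous, List.of("redis1:6482"))); // [redis1:6482]
        System.out.println(nextKnownNodes(previous, List.of()));              // keeps previous list
    }
}
```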
> Sorry for the misleading response. In prod, we do have 3 sentinel nodes. The 2 running sentinels are only in my test env for reproducing this issue.
I understand. I do - however - feel this would not happen if you use 3 sentinel nodes. Have you tried reproducing the problem this way?
> IMHO, a better way could be:
> - If some nodes (not all nodes) do not reply to `PING`, we could remove them from `knownNodes`. Possibly those nodes are faulty.
> - If all nodes fail to reply to `PING`, we could just do nothing rather than clear the `knownNodes` list. Possibly it's not the redis nodes' problem; maybe the lettuce process node's network is broken. After we clear `knownNodes`, if the sentinel doesn't send any psub message, the lettuce client can never recover.
The question is WHY there is a situation where no nodes are returned, and I believe this is an invalid state of the system. I disagree we should leave the existing `knownNodes` as they are. If we do that, it would mean we assume the `knownNodes` would eventually come up - and we have no way of knowing that.

> Here is the modification to reproduce this issue.

This test scenario simply simulates what would happen if a `PING` command fails. I am not sure why the Sentinel would return a node that is unreachable. Could you verify that somehow? Could it be that the master is never promoted?
> I just compared the lettuce features between redis-cluster and redis-sentinel; maybe we could support the `enablePeriodicRefresh` feature in sentinel mode, just like cluster mode does. That could handle this scenario in a better way.
Assuming we add this periodic refresh - which node should we ask for what type of information? Currently we depend on the Sentinel to push data about all master nodes, and on all master nodes to give us master/replica status. If we are to believe your research, the Sentinel pushes a topology refresh, giving us a master node, and when we attempt to connect to it we get a list of nodes that we can't `PING`.
> I do - however - feel this would not happen if you use 3 sentinel nodes. Have you tried reproducing the problem this way?
The original case happened in our production environment, with 3 sentinel nodes. I tried to reproduce it by dropping the replica and 2 sentinel nodes, but it didn't happen in about 100 attempts. That means the trigger condition is too extreme to reproduce in a real environment, so I had to inject exceptions into the lettuce code and test it in the unittest env.
> The question is WHY there is a situation where no nodes are returned, and I believe this is an invalid state of the system. I disagree we should leave the existing knownNodes as they are. If we do that, it would mean we assume the knownNodes would eventually come up - and we have no way of knowing that.
Because I failed to reproduce it in a real env, and without debug logs, it's hard to get the detailed reason why the `PING` command fails. After reading and testing the lettuce code, this seems to be the only possible cause that could produce the WARN log `Topology refresh returned no nodes`. Considering the network latency between the lettuce client and redis sentinel/master, a likely scenario is that the network between lettuce and sentinel/master went down just after the psub message was sent from sentinel. But it's just a guess.
I also checked the redis sentinel logs, and they record that the psub messages were sent at that time.
I agree that leaving `knownNodes` with unreachable data is not a good idea. Clearing `knownNodes` and waiting for the next psub message to refresh the topology data is not a good choice either. So maybe supporting `periodicRefresh` is a better choice.
> Assuming we add this periodic refresh - which node should we ask for what type of information?
We could do it the same way as the callback for the psub message, `SentinelConnector#getTopologyRefreshRunnable`, getting master and slave nodes from sentinel. Rather than only receiving psub messages from sentinel passively, we could refresh `knownNodes` periodically. That means, if that extreme scenario happened, the lettuce client could get back to work right after the network recovered, without any SRE intervention (like restarting the client process, which is what we have to do in this case).
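A rough sketch of that idea, assuming we simply re-run the same refresh logic on a timer (the `refreshTopology` runnable stands in for whatever `SentinelConnector#getTopologyRefreshRunnable` does; the class and interval option are hypothetical, not an existing lettuce setting):

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical periodic topology refresh for sentinel mode, modeled loosely
// on what cluster mode's periodic refresh achieves.
final class PeriodicSentinelRefresh implements AutoCloseable {

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final ScheduledFuture<?> task;

    PeriodicSentinelRefresh(Runnable refreshTopology, Duration interval) {
        // Re-run the same refresh that the psub callback triggers, so a missed
        // +sdown/+switch-master message no longer leaves knownNodes empty forever.
        this.task = scheduler.scheduleAtFixedRate(refreshTopology,
                interval.toMillis(), interval.toMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        task.cancel(false);
        scheduler.shutdown();
    }
}
```

With something like `new PeriodicSentinelRefresh(refreshRunnable, Duration.ofSeconds(30))`, the client would re-query the sentinel even when no Pub/Sub message arrives, which is the recovery path missing in the scenario above.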
This is a fair point. We could improve the resiliency of the driver if we schedule some action in this particular scenario.
I think we could model it in a similar manner to the `ClusterTopologyRefreshScheduler`.
This is no trivial task, however; I will have to think about its priority given our backlog and the fact that it is a bit of an exotic issue. Meanwhile, contributions are welcome in that direction.
Bug Report
Current Behavior
In redis sentinel mode, lettuce may refresh the topology nodes to an empty list if the connection to redis sentinel is closed just after lettuce received a TopologyRefreshMessage. That will leave lettuce unable to recover until the next TopologyRefreshMessage is received.
We have two redis nodes, and redis is running in sentinel mode. Assume the redis nodes' names are redis1 and redis2, and redis1 is the master at the beginning. We inject errors as below:
At step 3, redis sentinel sends `+sdown sentinel ...` to the lettuce client, which triggers lettuce to execute the method `SentinelConnector::getTopologyRefreshRunnable`. At the same time, the connection between the lettuce client and redis sentinel is closed, and then the error occurs. lettuce logs below
Since then, the app could not handle any read or write operations because `knownNodes` is empty.
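For context, a typical sentinel-mode Master/Replica connection that relies on this refresh path looks roughly like the following sketch (host names, ports and master id are placeholders matching the scenario above, not our actual configuration):

```java
import io.lettuce.core.ReadFrom;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.codec.StringCodec;
import io.lettuce.core.masterreplica.MasterReplica;
import io.lettuce.core.masterreplica.StatefulRedisMasterReplicaConnection;

public class SentinelClientExample {

    public static void main(String[] args) {
        RedisClient client = RedisClient.create();

        // Discover redis1/redis2 through the sentinels; the topology is then kept up to
        // date by the sentinel Pub/Sub messages handled in SentinelConnector.
        RedisURI sentinelUri = RedisURI.Builder
                .sentinel("sentinel-host", 26379, "mymaster")
                .withSentinel("sentinel-host", 26380)
                .build();

        StatefulRedisMasterReplicaConnection<String, String> connection =
                MasterReplica.connect(client, StringCodec.UTF8, sentinelUri);
        connection.setReadFrom(ReadFrom.REPLICA_PREFERRED);

        connection.sync().set("key", "value");

        connection.close();
        client.shutdown();
    }
}
```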
By reviewing the git log, we've found this problem has existed since commit "Do not fail Master/Slave topology refresh if a node is not available", and it still exists on branch master.
When `Requests#getRequest` returns `null` or `future.isDone()` returns false, the node is skipped; if that happens for all nodes, it clears `MasterReplicaConnectionProvider#knownNodes`, and the lettuce client cannot recover until it receives the next TopologyRefreshMessage or the lettuce client process is restarted.

src/main/java/io/lettuce/core/masterreplica/Requests.java
```java
protected void onEmit(Emission<List<RedisNodeDescription>> emission) {

    List<RedisNodeDescription> result = new ArrayList<>();
    Map<RedisNodeDescription, Long> latencies = new HashMap<>();

    for (RedisNodeDescription node : nodes) {

        TimedAsyncCommand<String, String, String> future = getRequest(node.getUri());

        if (future == null || !future.isDone()) {
            // if this expression is true for all nodes, it'll clear MasterReplicaConnectionProvider#knownNodes
            continue;
        }

        RedisNodeDescription redisNodeDescription = findNodeByUri(nodes, node.getUri());

        latencies.put(redisNodeDescription, future.duration());
        result.add(redisNodeDescription);
    }

    SortAction sortAction = SortAction.getSortAction();
    sortAction.sort(result, new LatencyComparator(latencies));

    emission.success(result);
}
```
Environment
Possible Solution
We've checked the lettuce code; the code in `SentinelConnector::getTopologyRefreshRunnable` may be improved to avoid setting an empty list to knownNodes.

src/main/java/io/lettuce/core/masterreplica/SentinelConnector.java
```java
private Runnable getTopologyRefreshRunnable(MasterReplicaTopologyRefresh refresh, MasterReplicaConnectionProvider