Closed haohuif closed 1 year ago
I am facing the same issue; I believe this is a bug. The issue is simply resolved by usinguserReplicas=never
, so this is definitely happening when using connections to replica nodes. What I have observed is the following:
1) Connection to replica node is used. This connection uses readonly
mode (https://github.com/vert-x3/vertx-redis-client/issues/160); if it didn't use readonly
we would simply get the MOVED
exception.
2) Client gets response successfully, but after that new connections to all cluster nodes are set. This happens with each and every request (I can see this in the logging by using quarkus.log.category."io.vertx.redis".level=DEBUG
).
3) On the server side the connections in readonly
mode are never closed; they keep piling up. We can track this by using the CLIENT LIST
command. There is a point in which the server won't accept more connections; in my case my application just halted (as discussed here: https://github.com/smallrye/smallrye-reactive-messaging/issues/2102).
So the problems That I see are:
readonly
mode, so there is something buggy with the client, maybe a swollen exception somewhere that triggers a recreation of the entire connection pool again and again. @haohuif mentioned that "a new connection will be built", but these new connections built are not coming from the same pool since we would quickly run out of available connections and get the pool exception ("not available connections" or something like that). So my hypothesis is that a new pool is created over and over.Regardless of if and when these issues would be fixed, I think it is a necessity to document this quickly in the site: https://vertx.io/docs/vertx-redis-client/java/. There is a section called Implementing Reconnect on Error, but it doesn't say anything about the fact that you will always get problems with the client in cluster mode. The impression is that (again, for cluster mode only) you simply set up your client by passing all node hosts and you are good to go. This is not the case; the reality is that the client in cluster mode is buggy, or at least it requires additional custom code for it to work properly:
userReplicas=never
but all query requests will go to master nodes, not ideal for query intensive applications.userReplicas=always|share
YOU WILL GET the dreaded MOVED
exception when using multiple shards (many people are facing this issue).It seems for cluster mode we should always use userReplicas=never
, as other modes are unreliable. If we have a query intensive application and want to use replicas then we need to prepare a custom solution.
By the way, these issues have been observed with quarkus 2.16.4
. These are the vertx versions I see:
antony@pop-os:~/Desktop/rest-client$ mvn dependency:tree | grep vertx
[INFO] | | \- io.vertx:vertx-web-client:jar:4.3.7:compile
[INFO] | \- io.quarkus.resteasy.reactive:resteasy-reactive-vertx:jar:2.16.4.Final:compile
[INFO] | +- io.quarkus:quarkus-vertx-http:jar:2.16.4.Final:compile
[INFO] | | +- io.smallrye.common:smallrye-common-vertx-context:jar:1.13.2:compile
[INFO] | | +- io.quarkus:quarkus-vertx-http-dev-console-runtime-spi:jar:2.16.4.Final:compile
[INFO] | | +- io.smallrye.reactive:smallrye-mutiny-vertx-web:jar:2.30.1:compile
[INFO] | | | +- io.smallrye.reactive:smallrye-mutiny-vertx-web-common:jar:2.30.1:compile
[INFO] | | | +- io.smallrye.reactive:smallrye-mutiny-vertx-auth-common:jar:2.30.1:compile
[INFO] | | | +- io.smallrye.reactive:smallrye-mutiny-vertx-bridge-common:jar:2.30.1:compile
[INFO] | | | \- io.smallrye.reactive:smallrye-mutiny-vertx-uri-template:jar:2.30.1:compile
[INFO] | | | \- io.vertx:vertx-uri-template:jar:4.3.7:compile
[INFO] | | +- io.vertx:vertx-web:jar:4.3.7:compile
[INFO] | | | +- io.vertx:vertx-web-common:jar:4.3.7:compile
[INFO] | | | +- io.vertx:vertx-auth-common:jar:4.3.7:compile
[INFO] | | | \- io.vertx:vertx-bridge-common:jar:4.3.7:compile
[INFO] | +- io.quarkus:quarkus-vertx:jar:2.16.4.Final:compile
[INFO] | | +- io.quarkus:quarkus-vertx-latebound-mdc-provider:jar:2.16.4.Final:compile
[INFO] | | +- io.smallrye.reactive:smallrye-mutiny-vertx-core:jar:2.30.1:compile
[INFO] | | | +- io.smallrye.reactive:smallrye-mutiny-vertx-runtime:jar:2.30.1:compile
[INFO] | | | \- io.smallrye.reactive:vertx-mutiny-generator:jar:2.30.1:compile
[INFO] | | | \- io.vertx:vertx-codegen:jar:4.3.7:compile
[INFO] | | \- io.smallrye:smallrye-fault-tolerance-vertx:jar:5.6.0:compile
[INFO] | +- io.vertx:vertx-kafka-client:jar:4.3.7:compile
[INFO] | | \- io.vertx:vertx-core:jar:4.3.7:compile
[INFO] | \- io.smallrye.reactive:smallrye-mutiny-vertx-redis-client:jar:2.30.1:compile
[INFO] | \- io.vertx:vertx-redis-client:jar:4.3.7:compile
This is fixed on the master
branch by #374. Since this issue basically means that connection pooling doesn't work at all, I believe we need to backport the fix to 4.x. I opened #406 to backport the fix to 4.x.
@Ladicek should this issue be fixed?
Indeed. This is fixed in 4.4.6 (#411), in the 4.x branch (#406) and in master (#374). Closing.
Questions
Reids runs in cluster mode. When userReplicas=always is specified, a large number of connections will be created to the server
After analyzing the source code, I think it is because RequestImpl does not implement the hashCode and equals methods. In the getConnection method of RedisConnectionManager, the request object will be specified to form the ConnectionKey together with the url string of the endpoint. Therefore, each time the connection is obtained through RedisClusterClient, a new connection will be built because the key does not exist
Version
4.2.7