vert-x3 / vertx-redis-client

Redis client for Vert.x
http://vertx.io
Apache License 2.0
131 stars 119 forks source link

Create too many connections to server #365

Closed haohuif closed 1 year ago

haohuif commented 1 year ago

Questions

Reids runs in cluster mode. When userReplicas=always is specified, a large number of connections will be created to the server

After analyzing the source code, I think it is because RequestImpl does not implement the hashCode and equals methods. In the getConnection method of RedisConnectionManager, the request object will be specified to form the ConnectionKey together with the url string of the endpoint. Therefore, each time the connection is obtained through RedisClusterClient, a new connection will be built because the key does not exist

Version

4.2.7

PyAntony commented 1 year ago

I am facing the same issue; I believe this is a bug. The issue is simply resolved by usinguserReplicas=never, so this is definitely happening when using connections to replica nodes. What I have observed is the following:

1) Connection to replica node is used. This connection uses readonly mode (https://github.com/vert-x3/vertx-redis-client/issues/160); if it didn't use readonly we would simply get the MOVED exception. 2) Client gets response successfully, but after that new connections to all cluster nodes are set. This happens with each and every request (I can see this in the logging by using quarkus.log.category."io.vertx.redis".level=DEBUG). 3) On the server side the connections in readonly mode are never closed; they keep piling up. We can track this by using the CLIENT LIST command. There is a point in which the server won't accept more connections; in my case my application just halted (as discussed here: https://github.com/smallrye/smallrye-reactive-messaging/issues/2102).

So the problems That I see are:

Regardless of if and when these issues would be fixed, I think it is a necessity to document this quickly in the site: https://vertx.io/docs/vertx-redis-client/java/. There is a section called Implementing Reconnect on Error, but it doesn't say anything about the fact that you will always get problems with the client in cluster mode. The impression is that (again, for cluster mode only) you simply set up your client by passing all node hosts and you are good to go. This is not the case; the reality is that the client in cluster mode is buggy, or at least it requires additional custom code for it to work properly:

It seems for cluster mode we should always use userReplicas=never, as other modes are unreliable. If we have a query intensive application and want to use replicas then we need to prepare a custom solution.

By the way, these issues have been observed with quarkus 2.16.4. These are the vertx versions I see:

antony@pop-os:~/Desktop/rest-client$ mvn dependency:tree | grep vertx
[INFO] |     |     \- io.vertx:vertx-web-client:jar:4.3.7:compile
[INFO] |     \- io.quarkus.resteasy.reactive:resteasy-reactive-vertx:jar:2.16.4.Final:compile
[INFO] |  +- io.quarkus:quarkus-vertx-http:jar:2.16.4.Final:compile
[INFO] |  |  +- io.smallrye.common:smallrye-common-vertx-context:jar:1.13.2:compile
[INFO] |  |  +- io.quarkus:quarkus-vertx-http-dev-console-runtime-spi:jar:2.16.4.Final:compile
[INFO] |  |  +- io.smallrye.reactive:smallrye-mutiny-vertx-web:jar:2.30.1:compile
[INFO] |  |  |  +- io.smallrye.reactive:smallrye-mutiny-vertx-web-common:jar:2.30.1:compile
[INFO] |  |  |  +- io.smallrye.reactive:smallrye-mutiny-vertx-auth-common:jar:2.30.1:compile
[INFO] |  |  |  +- io.smallrye.reactive:smallrye-mutiny-vertx-bridge-common:jar:2.30.1:compile
[INFO] |  |  |  \- io.smallrye.reactive:smallrye-mutiny-vertx-uri-template:jar:2.30.1:compile
[INFO] |  |  |     \- io.vertx:vertx-uri-template:jar:4.3.7:compile
[INFO] |  |  +- io.vertx:vertx-web:jar:4.3.7:compile
[INFO] |  |  |  +- io.vertx:vertx-web-common:jar:4.3.7:compile
[INFO] |  |  |  +- io.vertx:vertx-auth-common:jar:4.3.7:compile
[INFO] |  |  |  \- io.vertx:vertx-bridge-common:jar:4.3.7:compile
[INFO] |  +- io.quarkus:quarkus-vertx:jar:2.16.4.Final:compile
[INFO] |  |  +- io.quarkus:quarkus-vertx-latebound-mdc-provider:jar:2.16.4.Final:compile
[INFO] |  |  +- io.smallrye.reactive:smallrye-mutiny-vertx-core:jar:2.30.1:compile
[INFO] |  |  |  +- io.smallrye.reactive:smallrye-mutiny-vertx-runtime:jar:2.30.1:compile
[INFO] |  |  |  \- io.smallrye.reactive:vertx-mutiny-generator:jar:2.30.1:compile
[INFO] |  |  |     \- io.vertx:vertx-codegen:jar:4.3.7:compile
[INFO] |  |  \- io.smallrye:smallrye-fault-tolerance-vertx:jar:5.6.0:compile
[INFO] |  +- io.vertx:vertx-kafka-client:jar:4.3.7:compile
[INFO] |  |  \- io.vertx:vertx-core:jar:4.3.7:compile
[INFO] |  \- io.smallrye.reactive:smallrye-mutiny-vertx-redis-client:jar:2.30.1:compile
[INFO] |     \- io.vertx:vertx-redis-client:jar:4.3.7:compile
Ladicek commented 1 year ago

This is fixed on the master branch by #374. Since this issue basically means that connection pooling doesn't work at all, I believe we need to backport the fix to 4.x. I opened #406 to backport the fix to 4.x.

cescoffier commented 1 year ago

@Ladicek should this issue be fixed?

Ladicek commented 1 year ago

Indeed. This is fixed in 4.4.6 (#411), in the 4.x branch (#406) and in master (#374). Closing.