killme2008 / xmemcached

High performance, easy to use multithreaded memcached client in java.
http://fnil.net/xmemcached
Apache License 2.0

TimeoutException causes leakage of connections in several servers #139

Open fooling opened 1 year ago

fooling commented 1 year ago

When the initial connection attempt times out, the failed address is added to a waiting queue:

        this.connector.addToWatingQueue(
            new ReconnectRequest(inetSocketAddressWrapper, 0, getHealSessionInterval()));
        log.error("Connect to " + SystemUtils.getRawAddress(inetSocketAddress) + ":"
            + inetSocketAddress.getPort() + " fail", throwable);

Stack trace of the exception:

java.util.concurrent.TimeoutException: null
        at com.google.code.yanf4j.core.impl.FutureImpl.get(FutureImpl.java:143) ~[xmemcached-2.4.7.jar:?]
        at net.rubyeye.xmemcached.XMemcachedClient.connect(XMemcachedClient.java:565) [xmemcached-2.4.7.jar:?]
        at net.rubyeye.xmemcached.XMemcachedClient.<init>(XMemcachedClient.java:840) [xmemcached-2.4.7.jar:?]
        at net.rubyeye.xmemcached.XMemcachedClientBuilder.build(XMemcachedClientBuilder.java:362) [xmemcached-2.4.7.jar:?]

The following code in MemcachedConnector.java then causes an infinite loop:

            try {
              log.info("Trying to connect to " + address.getAddress().getHostAddress() + ":"
                  + address.getPort() + " for " + request.getTries() + " times");
              if (!future.get(MemcachedClient.DEFAULT_CONNECT_TIMEOUT, TimeUnit.MILLISECONDS)) {
                connected = false;
              } else {
                connected = true;
              }
            } catch (TimeoutException e) {
              future.cancel(true);
            } catch (ExecutionException e) {
              future.cancel(true);
            } finally {
              if (!connected) {
                this.rescheduleConnectRequest(request);
              } else {
                continue;
              }
            }
          }

When future.get(MemcachedClient.DEFAULT_CONNECT_TIMEOUT, TimeUnit.MILLISECONDS) times out after 60 seconds, a TimeoutException is thrown and future.cancel(true) is called.

However, netstat shows that the underlying connection is actually established, and the cancellation does not close it. Because the request is rescheduled after every timeout, the number of connections keeps growing: there were 2000+ ESTABLISHED connections to a single destination, even though the connection pool size is the default (1).
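
To illustrate the suspected mechanism, here is a minimal sketch that uses only JDK classes, not xmemcached or yanf4j. It assumes that cancelling the future only changes the future's state and leaves the socket open, which is what the netstat output suggests but which I have not verified in the library. The class and method names (CancelLeakSketch, awaitConnect, channelOpenAfterTimeout) are hypothetical, and a CompletableFuture that never completes stands in for the library's connect future.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class; it does not use any xmemcached or yanf4j API.
public class CancelLeakSketch {

    // Waits for the connect future. On timeout or failure it cancels the
    // future and, if closeOnFailure is set, also closes the channel.
    // Returns whether the channel is still open afterwards.
    static boolean awaitConnect(CompletableFuture<Boolean> future, SocketChannel channel,
            long timeoutMillis, boolean closeOnFailure)
            throws IOException, InterruptedException {
        boolean connected = false;
        try {
            connected = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException | ExecutionException e) {
            // Cancelling marks the future as cancelled; it knows nothing about the socket.
            future.cancel(true);
        } finally {
            if (!connected && closeOnFailure) {
                channel.close(); // release the socket before any retry
            }
        }
        return channel.isOpen();
    }

    // Opens a real loopback connection, then waits on a future that never
    // completes, which simulates a connect notification that never arrives.
    static boolean channelOpenAfterTimeout(boolean closeOnFailure) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            SocketChannel channel = SocketChannel.open(server.getLocalAddress());
            CompletableFuture<Boolean> neverCompletes = new CompletableFuture<>();
            try {
                return awaitConnect(neverCompletes, channel, 50, closeOnFailure);
            } finally {
                channel.close(); // demo cleanup; a no-op if already closed
            }
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cancel only    -> channel open: " + channelOpenAfterTimeout(false));
        System.out.println("cancel + close -> channel open: " + channelOpenAfterTimeout(true));
    }
}
```

In this sketch the channel stays open when only cancel is called and is released when it is closed explicitly. If the same holds inside the connector, each rescheduled attempt would leave one more established socket behind, so closing the pending session on timeout before rescheduling might be one way to stop the growth.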

Network latency is actually within 10 ms.

Maybe something is blocked in the Reactor?