aerospike / aerospike-client-java

Aerospike Java Client Library

Connection pool question #127

Closed. Aloren closed this issue 5 years ago.

Aloren commented 5 years ago

We are using the blocking Aerospike client API, version 4.3.2. When the service gets a spike of requests, the client creates many more connections than the service needs under normal load. Once the spike is over, we expect the connection pool to shrink, but that is not happening.

According to com.aerospike.client.cluster.Node, connections are taken from the head of the queue and, after use, put back at the tail. Each time a connection is used, its lastUsed time is updated. Whether a connection is still valid is determined by this code:

    public boolean isValid() {
        return (System.nanoTime() - lastUsed) <= maxSocketIdle;
    }

If that returns false, the connection is closed and removed from the pool. By default maxSocketIdle is 55 seconds, which means that for even one connection to expire, the application would have to take longer than 55 seconds to cycle through the whole queue; otherwise every connection keeps getting its lastUsed refreshed. Under steady traffic that rarely happens, so we keep a large number of connections to Aerospike until we restart the service. What would you recommend in this situation? Would decreasing maxSocketIdle to ~5 seconds be a good solution?
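For reference, lowering maxSocketIdle would just be a matter of setting it on ClientPolicy when the client is created; a minimal sketch (the host and port are placeholders):

    import com.aerospike.client.AerospikeClient;
    import com.aerospike.client.Host;
    import com.aerospike.client.policy.ClientPolicy;

    public class LowIdleClient {
        public static void main(String[] args) {
            ClientPolicy policy = new ClientPolicy();
            // Default is 55 seconds; lower it so idle pooled connections expire sooner.
            policy.maxSocketIdle = 5;

            // Placeholder host/port for this sketch.
            AerospikeClient client = new AerospikeClient(policy, new Host("127.0.0.1", 3000));
            try {
                // ... issue reads/writes as usual ...
            } finally {
                client.close();
            }
        }
    }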

I've created a simple test case that reproduces this situation:

    @Test
    public void numbOfConnections() throws Exception {
        ExecutorService service = Executors.newFixedThreadPool(300);

        List<Future<Void>> futures = service.invokeAll(Collections.nCopies(300, (Callable<Void>) () -> {
            for (int i = 0; i < 10; i++) {
                client.get(null, new Key(namespace, "set", "key"));
//              if (i == 9) {
//                  System.out.println(client.getClusterStats());
//              }
            }
            return null;
        }));

        futures.forEach(this::get);
        System.out.println("----- CASE 1 DONE ------");
        System.out.println(new Date() + " " + client.getClusterStats());

        futures = service.invokeAll(Collections.nCopies(2, (Callable<Void>) () -> {
            for (int i = 0; i < 7_000; i++) {
                client.get(null, new Key(namespace, "set", "key"));
                Thread.sleep(10);
            }
            return null;
        }));
        futures.forEach(this::get);
        System.out.println("----- CASE 2 DONE ------");
        System.out.println(new Date() + " " + client.getClusterStats());
    }

    private void get(Future<Void> f) {
        try {
            f.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

As you can see from the output, even after 7000 iterations * 10 ms = 70 seconds the pool did not shrink:

----- CASE 1 DONE ------
Wed Feb 13 20:50:05 EET 2019 nodes (inUse,inPool):
BB9030011AC4202 127.0.0.1 32857 sync(0,280) async(0,0)
threadsInUse: 0
recoverQueueSize: 0

----- CASE 2 DONE ------
Wed Feb 13 20:51:41 EET 2019 nodes (inUse,inPool):
BB9030011AC4202 127.0.0.1 32857 sync(0,280) async(0,0)
threadsInUse: 0
recoverQueueSize: 0

BrianNichols commented 5 years ago

The current solution is to reduce maxSocketIdle. We will investigate using a LIFO stack instead of the current FIFO queue for the connection pools. A LIFO stack trims unused connections more aggressively, because idle connections sink to the bottom of the stack and stop being refreshed by normal traffic, so they eventually exceed maxSocketIdle and can be closed.
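To see why that helps, here is a toy simulation (not the actual client pool code, just the lastUsed bookkeeping) using the test's numbers: 280 pooled connections, one get every 10 ms for ~70 seconds, and a 55-second idle limit. Under FIFO every connection keeps getting refreshed; under LIFO all but the actively used connection age out:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Toy model of the pool: each entry is a connection's lastUsed timestamp in ms.
    public class PoolTrimDemo {
        static final long MAX_IDLE_MS = 55_000;

        // Simulate `gets` requests, one every 10 ms, against a pool of `size` connections.
        // Each request takes a connection from the head, uses it, and returns it.
        // Returns how many connections would still pass the idle check afterwards.
        static long simulate(int size, int gets, boolean lifo) {
            Deque<Long> lastUsed = new ArrayDeque<>();
            for (int i = 0; i < size; i++) {
                lastUsed.addLast(0L);
            }
            long now = 0;
            for (int i = 0; i < gets; i++) {
                now += 10;
                lastUsed.pollFirst();           // take a connection
                if (lifo) {
                    lastUsed.addFirst(now);     // LIFO: return it on top; the bottom stays idle
                } else {
                    lastUsed.addLast(now);      // FIFO: return it at the tail; everything rotates
                }
            }
            final long t = now;
            return lastUsed.stream().filter(used -> t - used <= MAX_IDLE_MS).count();
        }

        public static void main(String[] args) {
            System.out.println("FIFO connections still valid: " + simulate(280, 7000, false)); // 280
            System.out.println("LIFO connections still valid: " + simulate(280, 7000, true));  // 1
        }
    }

Aging out alone is not enough, of course; something still has to close the expired connections, which is what the tend-loop change described in the next comment does.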

BrianNichols commented 5 years ago

Java client 4.3.1 has been released; it uses a LIFO stack for all connection pools. Expired connections are closed and removed once every 30 tend iterations (~30 seconds). Your test code now results in only 2 sync connections left in the pool when case 2 completes.