redis / jedis

Redis Java client
https://redis.github.io/jedis/
MIT License
11.9k stars 3.87k forks source link

Eviction gives "Could not return the broken resource to the pool" even if all resources are fine #2710

Closed steinrr closed 1 month ago

steinrr commented 3 years ago

Expected behavior

I have a Spring Boot application where I loop until a key is found in Redis (Azure Redis). Everything works fine, and there are no issues. I run a load test with 1250 requests to the REST interface and all of them succeeds OK. There are no exceptions thrown from either getting resource or getting key - everything looks fine.

But when the all requests has finished, and there are currently no requests, I find these in my logs:

2021-12-01 10:03:19,764 WARN redis.clients.jedis.JedisFactory [commons-pool-evictor] Error while close
redis.clients.jedis.exceptions.JedisException: Could not return the broken resource to the pool
    at redis.clients.jedis.util.Pool.returnBrokenResourceObject(Pool.java:126)
    at redis.clients.jedis.util.Pool.returnBrokenResource(Pool.java:103)
    at redis.clients.jedis.JedisPool.returnResource(JedisPool.java:382)
    at redis.clients.jedis.JedisPool.returnResource(JedisPool.java:15)
    at redis.clients.jedis.Jedis.close(Jedis.java:4067)
    at redis.clients.jedis.JedisFactory.destroyObject(JedisFactory.java:166)
    at org.apache.commons.pool2.PooledObjectFactory.destroyObject(PooledObjectFactory.java:127)
    at org.apache.commons.pool2.impl.GenericObjectPool.destroy(GenericObjectPool.java:611)
    at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:729)
    at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:160)
    at org.apache.commons.pool2.impl.EvictionTimer$WeakRunner.run(EvictionTimer.java:113)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalStateException: Invalidated object not currently part of this pool
    at org.apache.commons.pool2.impl.GenericObjectPool.invalidateObject(GenericObjectPool.java:936)
    at org.apache.commons.pool2.impl.GenericObjectPool.invalidateObject(GenericObjectPool.java:914)
    at redis.clients.jedis.util.Pool.returnBrokenResourceObject(Pool.java:124)
    ... 16 common frames omitted

I don't understand that I can have a broken resource - everything worked fine and all my Jedis was able to get the keys.

@Configuration class:

    @Bean
    JedisPool jedisPool() {
        JedisPoolConfig poolConfig = buildPoolConfig();
        return new JedisPool(poolConfig, REDISCACHEHOSTNAME, 6379, 10, REDISCACHEKEY);

    }

    private JedisPoolConfig buildPoolConfig() {
        final JedisPoolConfig poolConfig = new JedisPoolConfig();
        poolConfig.setMaxTotal(128);
        poolConfig.setMaxIdle(128);
        poolConfig.setMinIdle(16);
        poolConfig.setTestOnBorrow(true);
        poolConfig.setTestOnReturn(true);
        poolConfig.setTestWhileIdle(true);
        poolConfig.setMinEvictableIdleTimeMillis(Duration.ofSeconds(60).toMillis());
        poolConfig.setTimeBetweenEvictionRunsMillis(Duration.ofSeconds(30).toMillis());
        poolConfig.setNumTestsPerEvictionRun(3);
        poolConfig.setBlockWhenExhausted(true);
        return poolConfig;
    }

NB! I see that the setMinEvictableIdleTimeMillis and setTimeBetweenEvictionRunsMillis are deprecated - can this be the issue?

@Restcontroller class:

        Jedis jedis = null;

        try  {
            jedis = localJedisPool.getResource();
        } catch (Exception e) {
            logger.error("Could not get REDIS instance for session id: " + sessionid);
        }

        boolean foundKey = false;
        int i = 0;

        String response = null;

        do {
            response = jedis.get(sessionid);
            if (response == null) {
                i++;
                logger.info("Failed REDIS key for session id: " + sessionid + ". Retry #" + i);
                try {
                    Thread.sleep(25);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            } else {
                foundKey = true;
                logger.info("Found REDIS key for session id: " + sessionid + " AFTER retry #" + i);
            }

        } 
        while (foundKey == false && i<20);

        localJedisPool.returnResource(jedis);

NB! The code is "quick and dirty" and I could probably use pub/sub with keyspace notification, but this is a proof-of-concept app and "does the job". It checks for the existence of a key. If not found, it will pause 25 ms and try once more.

Actual behavior

See above stacktrace.

Steps to reproduce:

See above code

Redis / Jedis Configuration

See code.

Jedis version:

3.7.0

Redis version:

Azure Redis cache

Java version:

11

steinrr commented 3 years ago

I did some more testing. I removed those two deprecated config entries, and changed from .returnResource to .close. It still works fine, but now I am seeing these in the logs.

The config now looks like this:

    private JedisPoolConfig buildPoolConfig() {
        final JedisPoolConfig poolConfig = new JedisPoolConfig();
        poolConfig.setMaxTotal(400);
        poolConfig.setMaxIdle(400);
        poolConfig.setMinIdle(200);
        poolConfig.setBlockWhenExhausted(true);
        poolConfig.setTestOnBorrow(true);
        poolConfig.setTestOnReturn(true);
        poolConfig.setTestWhileIdle(true);
        poolConfig.setNumTestsPerEvictionRun(3);
        return poolConfig;
    }

These are now seen in the logs:

2021-12-01 10:52:03,237 WARN redis.clients.jedis.JedisFactory [commons-pool-evictor] Error while QUIT
redis.clients.jedis.exceptions.JedisConnectionException: Unexpected end of stream.
        at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:202)
        at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
        at redis.clients.jedis.Protocol.process(Protocol.java:162)
        at redis.clients.jedis.Protocol.read(Protocol.java:227)
        at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:352)
        at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:270)
        at redis.clients.jedis.BinaryJedis.quit(BinaryJedis.java:471)
        at redis.clients.jedis.JedisFactory.destroyObject(JedisFactory.java:160)
        at org.apache.commons.pool2.PooledObjectFactory.destroyObject(PooledObjectFactory.java:127)
        at org.apache.commons.pool2.impl.GenericObjectPool.destroy(GenericObjectPool.java:611)
        at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:729)
        at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:160)
        at org.apache.commons.pool2.impl.EvictionTimer$WeakRunner.run(EvictionTimer.java:113)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)

I am also seeing these:

2021-12-01 12:24:02,017 WARN redis.clients.jedis.JedisFactory [commons-pool-evictor] Error while QUIT
redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketException: Broken pipe (Write failed)
    at redis.clients.jedis.Connection.flush(Connection.java:342)
    at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:269)
    at redis.clients.jedis.BinaryJedis.quit(BinaryJedis.java:471)
    at redis.clients.jedis.JedisFactory.destroyObject(JedisFactory.java:160)
    at org.apache.commons.pool2.PooledObjectFactory.destroyObject(PooledObjectFactory.java:127)
    at org.apache.commons.pool2.impl.GenericObjectPool.destroy(GenericObjectPool.java:611)
    at org.apache.commons.pool2.impl.GenericObjectPool.evict(GenericObjectPool.java:729)
    at org.apache.commons.pool2.impl.BaseGenericObjectPool$Evictor.run(BaseGenericObjectPool.java:160)
    at org.apache.commons.pool2.impl.EvictionTimer$WeakRunner.run(EvictionTimer.java:113)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.net.SocketException: Broken pipe (Write failed)
    at java.base/java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.base/java.net.SocketOutputStream.socketWrite(Unknown Source)
    at java.base/java.net.SocketOutputStream.write(Unknown Source)
    at redis.clients.jedis.util.RedisOutputStream.flushBuffer(RedisOutputStream.java:52)
    at redis.clients.jedis.util.RedisOutputStream.flush(RedisOutputStream.java:133)
    at redis.clients.jedis.Connection.flush(Connection.java:339)
    ... 14 common frames omitted
yangbodong22011 commented 2 years ago

@steinrr try this


try (Jedis jedis = localJedisPool.getResource())  {
        boolean foundKey = false;
        int i = 0;

        String response = null;

        do {
            response = jedis.get(sessionid);
            if (response == null) {
                i++;
                logger.info("Failed REDIS key for session id: " + sessionid + ". Retry #" + i);
                try {
                    Thread.sleep(25);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
            } else {
                foundKey = true;
                logger.info("Found REDIS key for session id: " + sessionid + " AFTER retry #" + i);
            }

        }
        while (foundKey == false && i<20);

    } catch (Exception e) {
        logger.error("Could not get REDIS instance for session id: " + sessionid);
    }

you don need to call localJedisPool.returnResource(jedis);, you need call jedis.close after query.

steinrr commented 2 years ago

@yangbodong22011

I already did rewrite to using close instead of return. BUT the issue is not this part of the code - it works as expected and does as I wish - but if you see the exception log I posted - the error happens in the eviction process.

yangbodong22011 commented 2 years ago

I already did rewrite to using close instead of return. BUT the issue is not this part of the code - it works as expected and does as I wish - but if you see the exception log I posted - the error happens in the eviction process.

The potential problem here is that you will return the Jedis of broken to the normal connection pool through localJedisPool.returnResource(jedis), and it should call localJedisPool.returnBrokenResource(jedis). Of course, if you call Jedis.close(), all this will be done automatically.

but if you see the exception log I posted

If the error occurs after you have used jedis.close(), then it is possible: when evict is running, before calling the quit command, the connection has been disconnected from redis, so you will also encounter Broken Pipeerror.

If this error may recur stably, maybe you can use wireshark or tcpdump to capture the packets and see what happened.

steinrr commented 2 years ago

Yes, the error happens after I have used jedis.close() (many minutes after). The app is run in an Azure app service, so it is not easy to debug the TCP traffic here unfortunately. Do you have any experiences with Azure app service/Azure Redis and how quickly it will disconnect clients if not used?

yangbodong22011 commented 2 years ago

Do you have any experiences with Azure app service/Azure Redis and how quickly it will disconnect clients if not used?

Unfortunately, I didn't.

zeekling commented 2 years ago

what value of timeout configration of redis-server ?

iamgd67 commented 2 years ago

Yes, the error happens after I have used jedis.close() (many minutes after). The app is run in an Azure app service, so it is not easy to debug the TCP traffic here unfortunately. Do you have any experiences with Azure app service/Azure Redis and how quickly it will disconnect clients if not used?

@yangbodong22011 is right, use close instead of returnResource will solve the eviction problem, because close will also set dataSource property of Jedis instance to null in addition to call returnResource. without set dataSource property, eviction will call close again after remove the instance from pool, witch cause the object not currently part of this pool exception.

v-barros commented 1 year ago

I solved this by calling pool.getResource() again before pool.destroy() or pool.close(). In my case, this is triggered only when the application stops. I don't know if this is your case, but it worked for me. I'm using jedis 3.6.3

sazzad16 commented 1 year ago

@steinrr solved the initial issue.

I think his second comment is happening because some Redis offerings based on plain open source Redis server do not support QUIT command. Such cases should be fixed by #3353 (#3377 for Jedis 3.x which is included in Jedis 3.10.0).

sazzad16 commented 1 month ago

Closing for now. Can be reopened if same issue happens again.