CodisLabs / jodis

A java client for codis based on Jedis and Curator
MIT License
216 stars 97 forks source link

后台任务跑一段时间报错redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out #31

Closed beyondqinghua closed 7 years ago

beyondqinghua commented 7 years ago

测试跑100W连续多线程塞数据都正常,但是停下放一段时间后,其他小批量的后台任务反而时不时报错: 2016-12-15 20:11:33.439 [Thread-990017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 990000-测试codis使用时间:8 2016-12-15 20:11:33.784 [Thread-991017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 991000-测试codis使用时间:8 2016-12-15 20:11:34.076 [Thread-992017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 992000-测试codis使用时间:3 2016-12-15 20:11:34.401 [Thread-993017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 993000-测试codis使用时间:7 2016-12-15 20:11:34.724 [Thread-994017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 994000-测试codis使用时间:9 2016-12-15 20:11:34.972 [Thread-995017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 995000-测试codis使用时间:13 2016-12-15 20:11:35.363 [Thread-996017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 996000-测试codis使用时间:5 2016-12-15 20:11:35.657 [Thread-997017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 997000-测试codis使用时间:8 2016-12-15 20:11:35.943 [Thread-998017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 998000-测试codis使用时间:2 2016-12-15 20:11:36.210 [Thread-999017] ERROR c.k.f.s.w.redis.codis.test.task.TestCodisRunnable - 999000-测试codis使用时间:1 2016-12-15 20:52:52.489 [localhost-startStop-1-SendThread(10.201.5.16:2180)] INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 10004ms for sessionid 0x158fd66fd860020, closing socket connection and attempting reconnect 2016-12-15 20:52:52.489 [localhost-startStop-1-SendThread(10.201.5.16:2180)] INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 13510ms for sessionid 0x158fd66fd86001f, closing socket connection and attempting reconnect 2016-12-15 20:52:53.988 [localhost-startStop-1-EventThread] INFO o.a.curator.framework.state.ConnectionStateManager - State change: SUSPENDED 2016-12-15 20:52:53.990 [localhost-startStop-1-EventThread] INFO o.a.curator.framework.state.ConnectionStateManager - State change: SUSPENDED 2016-12-15 20:52:54.407 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.201.5.17/10.201.5.17:2180. Will not attempt to authenticate using SASL (unknown error) 2016-12-15 20:52:54.515 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.201.5.17/10.201.5.17:2180, initiating session 2016-12-15 20:52:54.923 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.201.5.17/10.201.5.17:2180. Will not attempt to authenticate using SASL (unknown error) 2016-12-15 20:52:54.924 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.201.5.17/10.201.5.17:2180, initiating session 2016-12-15 20:52:55.625 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x158fd66fd860020 has expired, closing socket connection 2016-12-15 20:52:55.625 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x158fd66fd86001f has expired, closing socket connection 2016-12-15 20:52:55.742 [localhost-startStop-1-EventThread] WARN org.apache.curator.ConnectionState - Session expired event received 2016-12-15 20:52:55.823 [Curator-PathChildrenCache-0] INFO io.codis.jodis.RoundRobinJedisPool - Receive child event: type=CONNECTION_SUSPENDED, no data 2016-12-15 20:52:55.823 [Curator-PathChildrenCache-1] INFO io.codis.jodis.RoundRobinJedisPool - Receive child event: type=CONNECTION_SUSPENDED, no data 2016-12-15 20:52:55.824 [localhost-startStop-1-EventThread] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=10.201.5.16:2180,10.201.5.15:2180,10.201.5.17:2180 sessionTimeout=20000 watcher=org.apache.curator.ConnectionState@7ebe36cc 2016-12-15 20:52:55.825 [localhost-startStop-1-EventThread] WARN org.apache.curator.ConnectionState - Session expired event received 2016-12-15 20:52:55.825 [localhost-startStop-1-EventThread] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=10.201.5.16:2180,10.201.5.15:2180,10.201.5.17:2180 sessionTimeout=15000 watcher=org.apache.curator.ConnectionState@500764ed 2016-12-15 20:52:55.826 [localhost-startStop-1-EventThread] INFO o.a.curator.framework.state.ConnectionStateManager - State change: LOST 2016-12-15 20:52:55.826 [Curator-PathChildrenCache-0] INFO io.codis.jodis.RoundRobinJedisPool - Receive child event: type=CONNECTION_LOST, no data 2016-12-15 20:52:55.826 [localhost-startStop-1-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down 2016-12-15 20:52:55.827 [localhost-startStop-1-SendThread(10.201.5.15:2180)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.201.5.15/10.201.5.15:2180. Will not attempt to authenticate using SASL (unknown error) 2016-12-15 20:52:55.827 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 10.201.5.17/10.201.5.17:2180. Will not attempt to authenticate using SASL (unknown error) 2016-12-15 20:52:55.828 [localhost-startStop-1-EventThread] INFO o.a.curator.framework.state.ConnectionStateManager - State change: LOST 2016-12-15 20:52:55.828 [localhost-startStop-1-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down 2016-12-15 20:52:55.829 [localhost-startStop-1-SendThread(10.201.5.15:2180)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.201.5.15/10.201.5.15:2180, initiating session 2016-12-15 20:52:55.829 [Curator-PathChildrenCache-1] INFO io.codis.jodis.RoundRobinJedisPool - Receive child event: type=CONNECTION_LOST, no data 2016-12-15 20:52:55.829 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to 10.201.5.17/10.201.5.17:2180, initiating session 2016-12-15 20:52:55.832 [localhost-startStop-1-SendThread(10.201.5.17:2180)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 10.201.5.17/10.201.5.17:2180, sessionid = 0x258faae399a0029, negotiated timeout = 15000 2016-12-15 20:52:55.832 [localhost-startStop-1-EventThread] INFO o.a.curator.framework.state.ConnectionStateManager - State change: RECONNECTED 2016-12-15 20:52:55.833 [Curator-PathChildrenCache-1] INFO io.codis.jodis.RoundRobinJedisPool - Receive child event: type=CONNECTION_RECONNECTED, no data 2016-12-15 20:52:55.834 [localhost-startStop-1-SendThread(10.201.5.15:2180)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server 10.201.5.15/10.201.5.15:2180, sessionid = 0x58faae39ef0026, negotiated timeout = 20000 2016-12-15 20:52:55.834 [localhost-startStop-1-EventThread] INFO o.a.curator.framework.state.ConnectionStateManager - State change: RECONNECTED 2016-12-15 20:52:55.835 [Curator-PathChildrenCache-0] INFO io.codis.jodis.RoundRobinJedisPool - Receive child event: type=CONNECTION_RECONNECTED, no data 2016-12-15 20:53:38.374 [Timer-6] ERROR c.k.finance.redis.service.RedisCodisServiceImpl - leftPopList 出错[key=MQ:600000-system_rank_msg-500000-test] redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:201) ~[jedis-2.8.0.jar:na] at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.process(Protocol.java:141) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.read(Protocol.java:205) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:297) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:216) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.BinaryJedis.lpop(BinaryJedis.java:1153) ~[jedis-2.8.0.jar:na] at com.kingdee.finance.redis.service.RedisCodisServiceImpl.leftPopList(RedisCodisServiceImpl.java:986) ~[finance-lib-redis-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.manage.MessageTransactionManage.consumeMessage(MessageTransactionManage.java:63) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.RedisMQQueueSubscriber.subscriber(RedisMQQueueSubscriber.java:56) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.QueueSubscriberTimerTask.run(QueueSubscriberTimerTask.java:22) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at java.util.TimerThread.mainLoop(Timer.java:555) [na:1.7.0_79] at java.util.TimerThread.run(Timer.java:505) [na:1.7.0_79] Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:108) ~[na:1.7.0_79] at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:195) ~[jedis-2.8.0.jar:na] ... 12 common frames omitted 2016-12-15 20:56:11.162 [Timer-6] ERROR c.k.finance.redis.service.RedisCodisServiceImpl - leftPopList 出错[key=MQ:600000-system_rank_msg-500000-test] redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:201) ~[jedis-2.8.0.jar:na] at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.process(Protocol.java:141) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.read(Protocol.java:205) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:297) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:216) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.BinaryJedis.lpop(BinaryJedis.java:1153) ~[jedis-2.8.0.jar:na] at com.kingdee.finance.redis.service.RedisCodisServiceImpl.leftPopList(RedisCodisServiceImpl.java:986) ~[finance-lib-redis-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.manage.MessageTransactionManage.consumeMessage(MessageTransactionManage.java:63) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.RedisMQQueueSubscriber.subscriber(RedisMQQueueSubscriber.java:56) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.QueueSubscriberTimerTask.run(QueueSubscriberTimerTask.java:22) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at java.util.TimerThread.mainLoop(Timer.java:555) [na:1.7.0_79] at java.util.TimerThread.run(Timer.java:505) [na:1.7.0_79] Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:108) ~[na:1.7.0_79] at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:195) ~[jedis-2.8.0.jar:na] ... 12 common frames omitted

代码如下: /**

参数: session_tms:20000 sleepms_retry:5000

pool_max_total:2000 pool_max_idle:100 pool_min_idle:50 pool_timeout_ms:6000

beyondqinghua commented 7 years ago
@Override
public String getString(String key) {
    if (StringUtils.isBlank(key)) {
        return null;
    }
    Jedis jedis = null;
    try {
        jedis = redisCodisFactory.getJedis();
        if (jedis == null) {
            return null;
        }
        return jedis.get(getKey(key));
    } catch (Throwable ex) {
        logger.error("getString error.key=" + key, ex);
    } finally {
        returnResource(jedis);
    }
    return null;
}

@Override
public void setString(String key, String value) {

    if (StringUtils.isBlank(key) || value == null) {
        return;
    }

    Jedis jedis = null;
    try {
        jedis = redisCodisFactory.getJedis();
        if (jedis == null) {
            return;
        }
        jedis.set(getKey(key), value);
    } catch (Throwable ex) {
        logger.error("setString error.key=" + key, ex);
    } finally {
        returnResource(jedis);
    }
}

private void returnResource(Jedis jedis) {
    try {
        if (jedis == null) {
            return;
        }
        jedis.close();
    } catch (Throwable e) {
        logger.error("returnResource error.", e);
    }
}

Jedis code: @Override public void close() { if (dataSource != null) { if (client.isBroken()) { this.dataSource.returnBrokenResource(this); } else { this.dataSource.returnResource(this); } } else { client.close(); } }

Apache9 commented 7 years ago
2016-12-15 20:52:52.489 [localhost-startStop-1-SendThread(10.201.5.16:2180)] INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 10004ms for sessionid 0x158fd66fd860020, closing socket connection and attempting reconnect
2016-12-15 20:52:52.489 [localhost-startStop-1-SendThread(10.201.5.16:2180)] INFO org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard from server in 13510ms for sessionid 0x158fd66fd86001f, closing socket connection and attempting reconnect

这个地方zookeeper连接都超时了,你检查一下机器的负载以及GC的日志?看是不是程序自己卡了一下。另外为什么会有两个zookeeper的session?你用了两个pool?

beyondqinghua commented 7 years ago

@Apache9 是的,跑了两个不通类型的codis集群,所有有构建了两个不通的pool 估计跟ZK关系不大,我昨天晚上跑了一晚上,中间也出了不少这样的异常,这是日志: 2016-12-16 02:10:15.343 [pool-27-thread-5] INFO c.k.f.s.s.r.log.jms.consumer.RestfulSrvLogConsumer - 接收到restful服务日志bizcode=501011,sign=null,clientid=null,serverno=null 2016-12-16 02:10:17.922 [pool-27-thread-15] INFO c.k.f.s.s.r.log.jms.consumer.RestfulSrvLogConsumer - 接收到restful服务日志bizcode=501005,sign=null,clientid=null,serverno=null 2016-12-16 02:10:19.925 [Timer-22] ERROR c.k.finance.redis.service.RedisCodisServiceImpl - leftPopList 出错[key=MQ:90000-account_srv_log-600000-test] redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:201) ~[jedis-2.8.0.jar:na] at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.process(Protocol.java:141) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.read(Protocol.java:205) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:297) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:216) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.BinaryJedis.lpop(BinaryJedis.java:1153) ~[jedis-2.8.0.jar:na] at com.kingdee.finance.redis.service.RedisCodisServiceImpl.leftPopList(RedisCodisServiceImpl.java:986) ~[finance-lib-redis-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.manage.MessageTransactionManage.consumeMessage(MessageTransactionManage.java:63) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.RedisMQQueueSubscriber.subscriber(RedisMQQueueSubscriber.java:56) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.QueueSubscriberTimerTask.run(QueueSubscriberTimerTask.java:22) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at java.util.TimerThread.mainLoop(Timer.java:555) [na:1.7.0_79] at java.util.TimerThread.run(Timer.java:505) [na:1.7.0_79] Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:108) ~[na:1.7.0_79] at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:195) ~[jedis-2.8.0.jar:na] ... 12 common frames omitted 2016-12-16 02:10:20.711 [pool-27-thread-15] ERROR c.k.finance.redis.service.RedisCodisServiceImpl - del error.key=MQAK:500000-restful_srv_log-600000-b4294feaec1b411e861d90cd7b4b0347-test redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:201) ~[jedis-2.8.0.jar:na] at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:40) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.process(Protocol.java:141) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Protocol.read(Protocol.java:205) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:297) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.Connection.getIntegerReply(Connection.java:222) ~[jedis-2.8.0.jar:na] at redis.clients.jedis.BinaryJedis.del(BinaryJedis.java:207) ~[jedis-2.8.0.jar:na] at com.kingdee.finance.redis.service.RedisCodisServiceImpl.remove(RedisCodisServiceImpl.java:279) ~[finance-lib-redis-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.manage.MessageTransactionManage.markConsumeResult(MessageTransactionManage.java:114) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.AbstractQueueMessageConsumer.setMessageReceiveSuccess(AbstractQueueMessageConsumer.java:37) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.AbstractMapQueueMessageConsumer.onMessage(AbstractMapQueueMessageConsumer.java:35) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at com.kingdee.finance.jms.redismq.queue.consumer.subscriber.QueueSubscriberRunnable.run(QueueSubscriberRunnable.java:25) [finance-lib-redisjms-1.0.4-SNAPSHOT.jar:1.0.4-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_79] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_79] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] Caused by: java.net.SocketTimeoutException: Read timed out at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:152) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:122) ~[na:1.7.0_79] at java.net.SocketInputStream.read(SocketInputStream.java:108) ~[na:1.7.0_79] at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:195) ~[jedis-2.8.0.jar:na] ... 14 common frames omitted 2016-12-16 02:10:21.400 [pool-27-thread-13] INFO c.k.f.s.s.r.log.jms.consumer.RestfulSrvLogConsumer - 接收到restful服务日志bizcode=501023,sign=null,clientid=null,serverno=null 2016-12-16 02:10:21.414 [pool-27-thread-14] INFO c.k.f.s.s.r.log.jms.consumer.RestfulSrvLogConsumer - 接收到restful服务日志bizcode=501029,sign=null,clientid=null,serverno=null 2016-12-16 02:10:21.416 [pool-27-thread-7] INFO c.k.f.s.s.r.log.jms.consumer.RestfulSrvLogConsumer - 接收到restful服务日志bizcode=501025,sign=null,clientid=null,serverno=null

beyondqinghua commented 7 years ago

@Apache9 我在想会不会是codis对客户端的socket长连接有超时控制,客户端的socket会话比codis的更长导致连接池的长连接已经被服务端强制关闭了导致?

Apache9 commented 7 years ago

codis应该会关长时间不动的连接,但是时间很长吧,30分钟? @spinlock

以及如果是codis proxy关了连接,你在往出写指令的时候应该就connection reset了,不会是read timeout吧,比较奇怪

beyondqinghua commented 7 years ago

@Apache9 @spinlock @yangzhe1991 我们用的是3.1版本,这个是codis 的参数配置:

##################################################
#                                                #
#                  Codis-Proxy                   #
#                                                #
##################################################

# Set Codis Product Name/Auth.
product_name = "JR_SDTEST_codis"
product_auth = "JR_SDTEST_codis_pwd"

# Set bind address for admin(rpc), tcp only.
admin_addr = "0.0.0.0:11080"

# Set bind address for proxy, proto_type can be "tcp", "tcp4", "tcp6", "unix" or "unixpacket".
proto_type = "tcp4"
proxy_addr = "0.0.0.0:19000"

# Set jodis address & session timeout, only accept "zookeeper" & "etcd".
jodis_name = "zookeeper"
jodis_addr = "172.22.23.238:2180,172.22.23.234:2180,172.22.23.210:2180"
jodis_timeout = "20s"
jodis_compatible = true

# Set datacenter of proxy.
proxy_datacenter = ""

# Set max number of alive sessions.
proxy_max_clients = 1000

# Set max offheap memory size. (0 to disable)
proxy_max_offheap_size = "1024mb"

# Set heap placeholder to reduce GC frequency.
proxy_heap_placeholder = "256mb"

# Proxy will ping backend redis in a predefined interval. (0 to disable)
backend_ping_period = "5s"

# Set backend recv buffer size & timeout.
backend_recv_bufsize = "128kb"
backend_recv_timeout = "30s"

# Set backend send buffer & timeout.
backend_send_bufsize = "128kb"
backend_send_timeout = "30s"

# Set backend pipeline buffer size.
backend_max_pipeline = 1024

# Set backend never read replica groups, default is false
backend_primary_only = false

# Set backend parallel connections per server
backend_primary_parallel = 1
backend_replica_parallel = 1

# Set backend tcp keepalive period. (0 to disable)
backend_keepalive_period = "75s"

# If there is no request from client for a long time, the connection will be closed. (0 to disable)
# Set session recv buffer size & timeout.
session_recv_bufsize = "128kb"
session_recv_timeout = "30m"

# Set session send buffer size & timeout.
session_send_bufsize = "64kb"
session_send_timeout = "30s"

# Make sure this is higher than the max number of requests for each pipeline request, or your client may be blocked.
# Set session pipeline buffer size.
session_max_pipeline = 512

# Set session tcp keepalive period. (0 to disable)
session_keepalive_period = "75s"

# Set metrics server (such as http://localhost:28000), proxy will report json formatted metrics to specified server in a predefined period.
metrics_report_server = ""
metrics_report_period = "1s"

# Set influxdb server (such as http://localhost:8086), proxy will report metrics to influxdb.
metrics_report_influxdb_server = ""
metrics_report_influxdb_period = "1s"
metrics_report_influxdb_username = ""
metrics_report_influxdb_password = ""
metrics_report_influxdb_database = ""
beyondqinghua commented 7 years ago

proxy 日志:

2016/12/16 09:36:36 backend.go:195: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, writer-[19] exit
[error]: backend conn failure, set tcp 10.201.5.16:20754: use of closed network connection
2016/12/16 09:36:36 backend.go:160: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, reader-[19] exit
[error]: backend conn failure, read tcp 10.201.5.16:20754->10.201.5.16:5000: i/o timeout
2016/12/16 09:36:36 backend.go:146: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, rounds-[20]
2016/12/16 09:42:52 backend.go:195: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, writer-[20] exit
[error]: backend conn failure, set tcp 10.201.5.16:31094: use of closed network connection
2016/12/16 09:42:55 backend.go:160: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, reader-[20] exit
[error]: backend conn failure, read tcp 10.201.5.16:31094->10.201.5.16:5000: i/o timeout
2016/12/16 09:42:56 backend.go:146: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, rounds-[21]
2016/12/16 09:50:52 backend.go:195: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, writer-[21] exit
[error]: backend conn failure, set tcp 10.201.5.16:55634: use of closed network connection
2016/12/16 09:50:52 backend.go:160: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, reader-[21] exit
[error]: backend conn failure, read tcp 10.201.5.16:55634->10.201.5.16:5000: i/o timeout
2016/12/16 09:50:52 backend.go:146: [WARN] backend conn [0xc4319b55e0] to 10.201.5.16:5000, rounds-[22]
spinlock commented 7 years ago

@Apache9 对的。 client --1--> proxy --2--> redis --3--> proxy --4--> client

这四个时间的超时,在 proxy 的配置是 30m, 30s, 30s, 30s,client 收到 timeout 而不是 EOF 应该是 jedis timeout 配置的过小导致的。

从 proxy 的 backend log 来看,出错都是 reader 的 i/o timeout 也就是都是过程 3 的 timeout,所以应该是那个时间点,redis 发生了什么导致 30s 内请求回不来。

从和 @beyondqinghua 沟通来看,jedis 配置的 6s timeout 明显是 < 30s 超时的。所以问题我估计在指令或者网络,或者那个时间点 redis 被做了什么操作卡住了。结果 jedis 先超时,之后 codis 等到 30s 然后也超时,这中间的所有指令也一样都会超时,直到重建连接成功。

beyondqinghua commented 7 years ago

@Apache9 @spinlock 这个是redis日志 17488:M 15 Dec 01:50:21.952 Starting automatic rewriting of AOF on 95771% growth 17488:M 15 Dec 01:50:21.955 Background append only file rewriting started by pid 14309 17488:M 15 Dec 01:50:21.988 AOF rewrite child asks to stop sending diffs. 14309:C 15 Dec 01:50:21.988 Parent agreed to stop sending diffs. Finalizing AOF... 14309:C 15 Dec 01:50:21.988 Concatenating 0.00 MB of AOF diff received from parent. 14309:C 15 Dec 01:50:21.988 SYNC append only file rewrite performed 14309:C 15 Dec 01:50:21.989 AOF rewrite: 0 MB of memory used by copy-on-write 17488:M 15 Dec 01:50:22.055 Background AOF rewrite terminated with success 17488:M 15 Dec 01:50:22.055 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 01:50:22.055 Background AOF rewrite finished successfully 17488:M 15 Dec 04:33:59.826 Starting automatic rewriting of AOF on 144338% growth 17488:M 15 Dec 04:33:59.859 Background append only file rewriting started by pid 17649 17488:M 15 Dec 04:33:59.915 AOF rewrite child asks to stop sending diffs. 17649:C 15 Dec 04:33:59.916 Parent agreed to stop sending diffs. Finalizing AOF... 17649:C 15 Dec 04:33:59.916 Concatenating 0.01 MB of AOF diff received from parent. 17649:C 15 Dec 04:33:59.918 SYNC append only file rewrite performed 17649:C 15 Dec 04:33:59.918 AOF rewrite: 1 MB of memory used by copy-on-write 17488:M 15 Dec 04:33:59.959 Background AOF rewrite terminated with success 17488:M 15 Dec 04:33:59.959 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 04:33:59.959 Background AOF rewrite finished successfully 17488:M 15 Dec 07:18:56.443 Starting automatic rewriting of AOF on 122272% growth 17488:M 15 Dec 07:18:56.446 Background append only file rewriting started by pid 26855 17488:M 15 Dec 07:18:56.487 AOF rewrite child asks to stop sending diffs. 26855:C 15 Dec 07:18:56.487 Parent agreed to stop sending diffs. Finalizing AOF... 26855:C 15 Dec 07:18:56.487 Concatenating 0.00 MB of AOF diff received from parent. 26855:C 15 Dec 07:18:56.489 SYNC append only file rewrite performed 26855:C 15 Dec 07:18:56.489 AOF rewrite: 0 MB of memory used by copy-on-write 17488:M 15 Dec 07:18:56.546 Background AOF rewrite terminated with success 17488:M 15 Dec 07:18:56.546 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 07:18:56.546 Background AOF rewrite finished successfully 17488:M 15 Dec 09:35:39.172 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 10:03:44.126 Starting automatic rewriting of AOF on 131409% growth 17488:M 15 Dec 10:03:44.129 Background append only file rewriting started by pid 14728 17488:M 15 Dec 10:03:44.169 AOF rewrite child asks to stop sending diffs. 14728:C 15 Dec 10:03:44.169 Parent agreed to stop sending diffs. Finalizing AOF... 14728:C 15 Dec 10:03:44.169 Concatenating 0.00 MB of AOF diff received from parent. 14728:C 15 Dec 10:03:44.171 SYNC append only file rewrite performed 14728:C 15 Dec 10:03:44.171 AOF rewrite: 0 MB of memory used by copy-on-write 17488:M 15 Dec 10:03:44.229 Background AOF rewrite terminated with success 17488:M 15 Dec 10:03:44.230 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 10:03:44.230 Background AOF rewrite finished successfully 17488:M 15 Dec 11:29:09.136 Starting automatic rewriting of AOF on 130311% growth 17488:M 15 Dec 11:29:09.139 Background append only file rewriting started by pid 4430 17488:M 15 Dec 11:29:09.176 AOF rewrite child asks to stop sending diffs. 4430:C 15 Dec 11:29:09.176 Parent agreed to stop sending diffs. Finalizing AOF... 4430:C 15 Dec 11:29:09.176 Concatenating 0.00 MB of AOF diff received from parent. 4430:C 15 Dec 11:29:09.178 SYNC append only file rewrite performed 4430:C 15 Dec 11:29:09.179 AOF rewrite: 1 MB of memory used by copy-on-write 17488:M 15 Dec 11:29:09.239 Background AOF rewrite terminated with success 17488:M 15 Dec 11:29:09.239 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 11:29:09.239 Background AOF rewrite finished successfully 17488:M 15 Dec 12:54:31.565 Starting automatic rewriting of AOF on 137443% growth 17488:M 15 Dec 12:54:31.568 Background append only file rewriting started by pid 9150 17488:M 15 Dec 12:54:31.607 AOF rewrite child asks to stop sending diffs. 9150:C 15 Dec 12:54:31.607 Parent agreed to stop sending diffs. Finalizing AOF... 9150:C 15 Dec 12:54:31.607 Concatenating 0.00 MB of AOF diff received from parent. 9150:C 15 Dec 12:54:31.608 SYNC append only file rewrite performed 9150:C 15 Dec 12:54:31.609 AOF rewrite: 0 MB of memory used by copy-on-write 17488:M 15 Dec 12:54:31.668 Background AOF rewrite terminated with success 17488:M 15 Dec 12:54:31.668 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 12:54:31.669 Background AOF rewrite finished successfully 17488:M 15 Dec 14:20:44.394 Starting automatic rewriting of AOF on 132372% growth 17488:M 15 Dec 14:20:44.395 Background append only file rewriting started by pid 30551 17488:M 15 Dec 14:20:44.419 AOF rewrite child asks to stop sending diffs. 30551:C 15 Dec 14:20:44.420 Parent agreed to stop sending diffs. Finalizing AOF... 30551:C 15 Dec 14:20:44.420 Concatenating 0.00 MB of AOF diff received from parent. 30551:C 15 Dec 14:20:44.420 SYNC append only file rewrite performed 30551:C 15 Dec 14:20:44.420 AOF rewrite: 0 MB of memory used by copy-on-write 17488:M 15 Dec 14:20:44.495 Background AOF rewrite terminated with success 17488:M 15 Dec 14:20:44.495 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 14:20:44.495 Background AOF rewrite finished successfully 17488:M 15 Dec 15:43:58.148 Starting automatic rewriting of AOF on 141842% growth 17488:M 15 Dec 15:43:58.150 Background append only file rewriting started by pid 8472 17488:M 15 Dec 15:43:58.203 AOF rewrite child asks to stop sending diffs. 8472:C 15 Dec 15:43:58.203 Parent agreed to stop sending diffs. Finalizing AOF... 8472:C 15 Dec 15:43:58.203 Concatenating 0.00 MB of AOF diff received from parent. 8472:C 15 Dec 15:43:58.205 SYNC append only file rewrite performed 8472:C 15 Dec 15:43:58.206 AOF rewrite: 0 MB of memory used by copy-on-write 17488:M 15 Dec 15:43:58.251 Background AOF rewrite terminated with success 17488:M 15 Dec 15:43:58.251 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 15:43:58.251 Background AOF rewrite finished successfully 17488:M 15 Dec 15:57:45.342 Starting automatic rewriting of AOF on 4618% growth 17488:M 15 Dec 15:57:45.345 Background append only file rewriting started by pid 31629 17488:M 15 Dec 15:57:46.850 AOF rewrite child asks to stop sending diffs. 31629:C 15 Dec 15:57:46.850 Parent agreed to stop sending diffs. Finalizing AOF... 31629:C 15 Dec 15:57:46.850 Concatenating 0.67 MB of AOF diff received from parent. 31629:C 15 Dec 15:57:46.855 SYNC append only file rewrite performed 31629:C 15 Dec 15:57:46.857 AOF rewrite: 6 MB of memory used by copy-on-write 17488:M 15 Dec 15:57:46.945 Background AOF rewrite terminated with success 17488:M 15 Dec 15:57:46.945 Residual parent diff successfully flushed to the rewritten AOF (0.03 MB) 17488:M 15 Dec 15:57:46.945 Background AOF rewrite finished successfully 17488:M 15 Dec 15:59:47.707 Starting automatic rewriting of AOF on 100% growth 17488:M 15 Dec 15:59:47.717 Background append only file rewriting started by pid 19485 17488:M 15 Dec 15:59:50.547 AOF rewrite child asks to stop sending diffs. 19485:C 15 Dec 15:59:50.547 Parent agreed to stop sending diffs. Finalizing AOF... 19485:C 15 Dec 15:59:50.547 Concatenating 0.80 MB of AOF diff received from parent. 19485:C 15 Dec 15:59:50.562 SYNC append only file rewrite performed 19485:C 15 Dec 15:59:50.570 AOF rewrite: 15 MB of memory used by copy-on-write 17488:M 15 Dec 15:59:50.637 Background AOF rewrite terminated with success 17488:M 15 Dec 15:59:50.637 Residual parent diff successfully flushed to the rewritten AOF (0.04 MB) 17488:M 15 Dec 15:59:50.638 Background AOF rewrite finished successfully 17488:M 15 Dec 16:06:07.198 Starting automatic rewriting of AOF on 100% growth 17488:M 15 Dec 16:06:07.212 Background append only file rewriting started by pid 8243 17488:M 15 Dec 16:06:10.049 AOF rewrite child asks to stop sending diffs. 8243:C 15 Dec 16:06:10.049 Parent agreed to stop sending diffs. Finalizing AOF... 8243:C 15 Dec 16:06:10.050 Concatenating 0.91 MB of AOF diff received from parent. 8243:C 15 Dec 16:06:10.058 SYNC append only file rewrite performed 8243:C 15 Dec 16:06:10.072 AOF rewrite: 26 MB of memory used by copy-on-write 17488:M 15 Dec 16:06:10.100 Background AOF rewrite terminated with success 17488:M 15 Dec 16:06:10.100 Residual parent diff successfully flushed to the rewritten AOF (0.01 MB) 17488:M 15 Dec 16:06:10.101 Background AOF rewrite finished successfully 17488:M 15 Dec 16:06:51.088 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 16:10:12.352 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 16:14:19.200 Starting automatic rewriting of AOF on 100% growth 17488:M 15 Dec 16:14:19.216 Background append only file rewriting started by pid 2775 17488:M 15 Dec 16:14:29.839 AOF rewrite child asks to stop sending diffs. 2775:C 15 Dec 16:14:29.839 Parent agreed to stop sending diffs. Finalizing AOF... 2775:C 15 Dec 16:14:29.839 Concatenating 4.22 MB of AOF diff received from parent. 2775:C 15 Dec 16:14:29.857 SYNC append only file rewrite performed 2775:C 15 Dec 16:14:29.868 AOF rewrite: 162 MB of memory used by copy-on-write 17488:M 15 Dec 16:14:29.962 Background AOF rewrite terminated with success 17488:M 15 Dec 16:14:29.962 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 16:14:29.962 Background AOF rewrite finished successfully 17488:M 15 Dec 16:41:10.468 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 16:41:14.660 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 18:40:46.426 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 18:41:05.368 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 18:41:40.239 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 18:41:57.367 Starting automatic rewriting of AOF on 100% growth 17488:M 15 Dec 18:41:57.385 Background append only file rewriting started by pid 8910 17488:M 15 Dec 18:42:03.925 AOF rewrite child asks to stop sending diffs. 8910:C 15 Dec 18:42:03.925 Parent agreed to stop sending diffs. Finalizing AOF... 8910:C 15 Dec 18:42:03.926 Concatenating 1.12 MB of AOF diff received from parent. 8910:C 15 Dec 18:42:03.941 SYNC append only file rewrite performed 8910:C 15 Dec 18:42:03.953 AOF rewrite: 128 MB of memory used by copy-on-write 17488:M 15 Dec 18:42:04.025 Background AOF rewrite terminated with success 17488:M 15 Dec 18:42:04.026 Residual parent diff successfully flushed to the rewritten AOF (0.00 MB) 17488:M 15 Dec 18:42:04.026 Background AOF rewrite finished successfully 17488:M 15 Dec 18:58:10.879 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 19:58:05.439 Starting automatic rewriting of AOF on 100% growth 17488:M 15 Dec 19:58:05.459 Background append only file rewriting started by pid 8021 17488:M 15 Dec 19:58:11.505 AOF rewrite child asks to stop sending diffs. 8021:C 15 Dec 19:58:11.506 Parent agreed to stop sending diffs. Finalizing AOF... 8021:C 15 Dec 19:58:11.506 Concatenating 3.55 MB of AOF diff received from parent. 8021:C 15 Dec 19:58:11.527 SYNC append only file rewrite performed 8021:C 15 Dec 19:58:11.551 AOF rewrite: 118 MB of memory used by copy-on-write 17488:M 15 Dec 19:58:11.699 Background AOF rewrite terminated with success 17488:M 15 Dec 19:58:11.699 Residual parent diff successfully flushed to the rewritten AOF (0.10 MB) 17488:M 15 Dec 19:58:11.699 Background AOF rewrite finished successfully 17488:M 15 Dec 20:13:33.023 Starting automatic rewriting of AOF on 100% growth 17488:M 15 Dec 20:13:33.061 Background append only file rewriting started by pid 6978 17488:M 15 Dec 20:13:38.850 AOF rewrite child asks to stop sending diffs. 6978:C 15 Dec 20:13:38.852 Parent agreed to stop sending diffs. Finalizing AOF... 6978:C 15 Dec 20:13:38.852 Concatenating 3.69 MB of AOF diff received from parent. 6978:C 15 Dec 20:13:38.870 SYNC append only file rewrite performed 6978:C 15 Dec 20:13:38.891 AOF rewrite: 123 MB of memory used by copy-on-write 17488:M 15 Dec 20:13:38.993 Background AOF rewrite terminated with success 17488:M 15 Dec 20:13:38.993 Residual parent diff successfully flushed to the rewritten AOF (0.08 MB) 17488:M 15 Dec 20:13:38.993 Background AOF rewrite finished successfully 17488:M 15 Dec 20:51:09.228 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 21:05:09.143 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 15 Dec 21:05:16.433 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 16 Dec 02:10:20.140 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 16 Dec 03:04:11.199 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 16 Dec 05:20:00.710 Starting automatic rewriting of AOF on 100% growth 17488:M 16 Dec 05:20:00.731 Background append only file rewriting started by pid 18940 17488:M 16 Dec 05:20:04.494 AOF rewrite child asks to stop sending diffs. 18940:C 16 Dec 05:20:04.494 Parent agreed to stop sending diffs. Finalizing AOF... 18940:C 16 Dec 05:20:04.494 Concatenating 1.00 MB of AOF diff received from parent. 18940:C 16 Dec 05:20:04.500 SYNC append only file rewrite performed 18940:C 16 Dec 05:20:04.513 AOF rewrite: 86 MB of memory used by copy-on-write 17488:M 16 Dec 05:20:04.582 Background AOF rewrite terminated with success 17488:M 16 Dec 05:20:04.582 Residual parent diff successfully flushed to the rewritten AOF (0.01 MB) 17488:M 16 Dec 05:20:04.582 Background AOF rewrite finished successfully 17488:M 16 Dec 10:27:05.554 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 16 Dec 10:59:08.079 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 16 Dec 11:24:07.127 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17488:M 16 Dec 11:24:14.035 Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.