PagerDuty / scheduler

A Scala library for scheduling arbitrary code to run at an arbitrary time.
BSD 3-Clause "New" or "Revised" License

PoolTimeoutException when renewing connection #43

Open kumar-asista opened 5 years ago

kumar-asista commented 5 years ago

The scheduler starts and runs properly, but whenever a Cassandra connection on port 9160 is renewed we run into a problem. Where is the problem likely to be: Cassandra, or the scheduler library?

Following Exception

java.util.concurrent.ExecutionException: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=127.0.0.1(127.0.0.1):9160, latency=60001(60001), attempts=2]Timed out waiting for connection
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=127.0.0.1(127.0.0.1):9160, latency=60001(60001), attempts=2]Timed out waiting for connection

thanks
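To show the shape of this failure: a pool has a fixed number of connections, and a borrower that cannot get one within the pool timeout fails, exactly like the "Timed out waiting for connection" above. This is a minimal self-contained sketch (a hypothetical `TinyPool`, not Astyanax's or the scheduler's actual pool code):

```scala
import java.util.concurrent.{Semaphore, TimeUnit}

// Hypothetical stand-in for a host connection pool: a fixed number of
// permits, and borrowers that give up after a timeout.
final class TinyPool(size: Int) {
  private val permits = new Semaphore(size)

  // Returns true if a "connection" was borrowed within timeoutMs.
  def tryBorrow(timeoutMs: Long): Boolean =
    permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)

  def release(): Unit = permits.release()
}

object PoolTimeoutDemo {
  def main(args: Array[String]): Unit = {
    val pool = new TinyPool(1)
    assert(pool.tryBorrow(50))   // first borrower succeeds immediately
    // Second borrower times out because the only connection was never
    // returned -- the same symptom as connections that stop being
    // usable after a renew.
    assert(!pool.tryBorrow(50))
    pool.release()
  }
}
```

If the renewed sockets are never handed back to the pool, every later borrow waits out the full pool timeout (the 60001 ms latency in the exception) and then fails.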

kumar-asista commented 5 years ago

I have seen one more case with the same PoolTimeoutException.

We started the scheduler application on a first node; after some time we tried to start the scheduler application on a second node. Both nodes started, but one of them got a PoolTimeoutException, and at that time some connections on the Cassandra server were in the TIME_WAIT state. What could the reason be? Any help appreciated, thanks.

kumar-asista commented 5 years ago

Hi, I found the following cases while debugging this:

  1. Trying to add a new node as a scheduler client app

We had one node running the scheduler app; we then planned to add an additional node and test the scheduler functions.

============== Log Started ====================================

2019-01-01 12:44:48 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-5, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 12:44:48 INFO  ConsumerCoordinator {org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinPrepare} - [Consumer clientId=consumer-5, groupId=scheduler] Revoking previously assigned partitions []
2019-01-01 12:44:48 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - Shutting down SchedulingSystem...
2019-01-01 12:44:48 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - Shutting down TaskExecutorService...
2019-01-01 12:44:49 ERROR ConnectionPoolMBeanManager {com.netflix.astyanax.connectionpool.impl.ConnectionPoolMBeanManager unregisterMonitor} - com.netflix.MonitoredResources:type=ASTYANAX,name=DevClusterConnectionPool,ServiceType=connectionpool
2019-01-01 12:44:49 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - TaskExecutorService was shut down.
2019-01-01 12:44:49 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - SchedulingSystem was shut down.
2019-01-01 12:44:49 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl reportIsolationDetectionWait} - Holding group=scheduler for topic=scheduler, partitions [] for 16500 milliseconds for proper isolation detection.
2019-01-01 12:44:51 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-5, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 12:44:54 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-5, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 12:44:57 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-5, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 12:45:00 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-5, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 12:45:03 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-5, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 12:45:05 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl reportPartitionsRevoked} - Rebalancing group=scheduler for topic=scheduler, partitions revoked [].
2019-01-01 12:45:05 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest} - [Consumer clientId=consumer-5, groupId=scheduler] (Re-)joining group
2019-01-01 12:45:05 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1 onSuccess} - [Consumer clientId=consumer-5, groupId=scheduler] Successfully joined group with generation 1757
2019-01-01 12:45:05 INFO  ConsumerCoordinator {org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinComplete} - [Consumer clientId=consumer-5, groupId=scheduler] Setting newly assigned partitions [scheduler-0]
2019-01-01 12:45:05 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl reportPartitionsAssigned} - Rebalancing group=scheduler for topic=scheduler, partitions assigned [0].
2019-01-01 12:45:05 INFO  Slf4jLogger {akka.event.slf4j.Slf4jLogger$$anonfun$receive$1 applyOrElse} - Slf4jLogger started

============== Log End ====================================

After this log I checked connections on the Cassandra side and found some connections in the TIME_WAIT state.

root@csd12:~# netstat -an | grep 9160
tcp        0      0 0.0.0.0:9160         0.0.0.0:*            LISTEN
tcp        0      0 20.300.1.5:9160      20.300.1.1:54554     TIME_WAIT
tcp        0      0 20.300.1.5:9160      20.300.1.1:54556     TIME_WAIT
tcp        0      0 20.300.1.5:9160      20.300.1.2:56248     ESTABLISHED

20.300.1.1 -> node 1 with scheduler application
20.300.1.2 -> node 2 with scheduler application

Only node 2's connection is in the ESTABLISHED state; node 1's connections are in TIME_WAIT, and node 1 keeps trying to get a new connection but always gets a PoolTimeoutException.
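When tracking this across repeated runs, a tiny helper like the following (purely illustrative, not part of the scheduler) can tally socket states from `netstat -an` output, making a TIME_WAIT buildup on port 9160 easy to spot:

```scala
// Illustrative helper: count TCP connection states in netstat-style
// output lines such as:
//   tcp 0 0 20.300.1.5:9160 20.300.1.1:54554 TIME_WAIT
object NetstatTally {
  def tally(lines: Seq[String]): Map[String, Int] =
    lines
      .map(_.trim.split("\\s+"))
      // netstat -an lines for TCP have the state in the 6th column
      .collect { case cols if cols.length >= 6 => cols(5) }
      .groupBy(identity)
      .map { case (state, xs) => state -> xs.size }

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      "tcp 0 0 20.300.1.5:9160 20.300.1.1:54554 TIME_WAIT",
      "tcp 0 0 20.300.1.5:9160 20.300.1.1:54556 TIME_WAIT",
      "tcp 0 0 20.300.1.5:9160 20.300.1.2:56248 ESTABLISHED"
    )
    // For the netstat output above this yields TIME_WAIT -> 2,
    // ESTABLISHED -> 1.
    println(NetstatTally.tally(sample))
  }
}
```

TIME_WAIT itself is normal TCP behavior: the side that closes a connection first holds the socket for a while before the port pair can be reused. A steadily growing TIME_WAIT count on 9160, though, suggests node 1 is repeatedly opening and closing connections rather than reusing pooled ones.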

@danwenzel, any help with this problem?

thanks

kumar-asista commented 5 years ago

@DWvanGeest @davidrusu, any help with the problem mentioned above? We are in staging, about to release this application; any help is most welcome. Thanks.

kumar-asista commented 5 years ago

Hi PagerDuty team, I have enclosed the full log.

@DWvanGeest @davidrusu - whenever rebalancing happens I get the following exceptions, and after that the scheduler application stops working. Please look at the stack trace, thanks.

2019-01-01 19:29:54 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl$$anonfun$staleTasksGaugeSampleConsumer$1 apply$mcVI$sp} - Scheduler stale tasks result: 0 stale tasks.

2019-01-01 19:30:54 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl$$anonfun$staleTasksGaugeSampleConsumer$1 apply$mcVI$sp} - Scheduler stale tasks result: 0 stale tasks.

2019-01-01 19:31:32 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-1, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 19:31:32 INFO  ConsumerCoordinator {org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinPrepare} - [Consumer clientId=consumer-1, groupId=scheduler] Revoking previously assigned partitions []
2019-01-01 19:31:32 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - Shutting down SchedulingSystem...
2019-01-01 19:31:32 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - Shutting down TaskExecutorService...
2019-01-01 19:31:33 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - TaskExecutorService was shut down.
2019-01-01 19:31:33 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - SchedulingSystem was shut down.
2019-01-01 19:31:33 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl reportIsolationDetectionWait} - Holding group=scheduler for topic=scheduler, partitions [] for 16500 milliseconds for proper isolation detection.
2019-01-01 19:31:35 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-1, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 19:31:38 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-1, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 19:31:41 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-1, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing

2019-01-01 19:31:44 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-1, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 19:31:47 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-1, groupId=scheduler] Attempt to heartbeat failed since group is rebalancing
2019-01-01 19:31:49 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl reportPartitionsRevoked} - Rebalancing group=scheduler for topic=scheduler, partitions revoked [].
2019-01-01 19:31:49 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest} - [Consumer clientId=consumer-1, groupId=scheduler] (Re-)joining group
2019-01-01 19:31:49 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1 onSuccess} - [Consumer clientId=consumer-1, groupId=scheduler] Successfully joined group with generation 1789
2019-01-01 19:31:49 INFO  ConsumerCoordinator {org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinComplete} - [Consumer clientId=consumer-1, groupId=scheduler] Setting newly assigned partitions [scheduler-0]
2019-01-01 19:31:49 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl reportPartitionsAssigned} - Rebalancing group=scheduler for topic=scheduler, partitions assigned [0].
2019-01-01 19:31:49 INFO  Slf4jLogger {akka.event.slf4j.Slf4jLogger$$anonfun$receive$1 applyOrElse} - Slf4jLogger started

2019-01-01 19:32:35 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatResponseHandler handle} - [Consumer clientId=consumer-2, groupId=time-engine] Attempt to heartbeat failed since group is rebalancing
2019-01-01 19:32:35 INFO  ConsumerCoordinator {org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinPrepare} - [Consumer clientId=consumer-2, groupId=time-engine] Revoking previously assigned partitions [time-process-0]
2019-01-01 19:32:35 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator sendJoinGroupRequest} - [Consumer clientId=consumer-2, groupId=time-engine] (Re-)joining group
2019-01-01 19:32:35 INFO  AbstractCoordinator {org.apache.kafka.clients.consumer.internals.AbstractCoordinator$1 onSuccess} - [Consumer clientId=consumer-2, groupId=time-engine] Successfully joined group with generation 141
2019-01-01 19:32:35 INFO  ConsumerCoordinator {org.apache.kafka.clients.consumer.internals.ConsumerCoordinator onJoinComplete} - [Consumer clientId=consumer-2, groupId=time-engine] Setting newly assigned partitions [time-process-0]
2019-01-01 19:32:35 INFO  TrackPartitionsCommitMode {cakesolutions.kafka.akka.TrackPartitionsCommitMode$$anonfun$onPartitionsAssigned$1$$anonfun$apply$1 apply$mcVJ$sp} - Seeking partition: [time-process-0] to offset [1977]
2019-01-01 19:33:50 WARN  Slf4jConnectionPoolMonitorImpl {com.netflix.astyanax.connectionpool.impl.Slf4jConnectionPoolMonitorImpl incOperationFailure} - PoolTimeoutException: [host=20.300.1.2(20.300.1.2):9160, latency=120013(120013), attempts=2]Timed out waiting for connection
2019-01-01 19:33:50 ERROR LocalActorRefProvider(akka://Scheduler) {akka.event.slf4j.Slf4jLogger$$anonfun$receive$1$$anonfun$applyOrElse$1 apply$mcV$sp} - guardian failed, shutting down system
java.util.concurrent.ExecutionException: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=20.300.1.2(20.300.1.2):9160, latency=120013(120013), attempts=2]Timed out waiting for connection
    at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:476) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:435) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79) ~[guava-19.0.jar:na]
    at com.pagerduty.eris.FutureConversions$$anon$1.run(FutureConversions.scala:49) ~[eris-core_2.11-2.0.4.jar:2.0.4]
    at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:456) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:817) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:753) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:634) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:110) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) ~[guava-19.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_161]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_161]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_161]
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=20.300.1.2(20.300.1.2):9160, latency=120013(120013), attempts=2]Timed out waiting for connection
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:342) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1.execute(ThriftColumnFamilyQueryImpl.java:186) ~[astyanax-thrift-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1$4.call(ThriftColumnFamilyQueryImpl.java:297) ~[astyanax-thrift-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1$4.call(ThriftColumnFamilyQueryImpl.java:294) ~[astyanax-thrift-3.6.0.jar:3.6.0]
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) ~[guava-19.0.jar:na]
    ... 5 common frames omitted
2019-01-01 19:33:50 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - Shutting down TaskExecutorService...
2019-01-01 19:33:50 INFO  SchedulerKafkaConsumer {com.pagerduty.kafkaconsumer.SimpleKafkaConsumer initializeConsumerAndEnterPollLoop} - Stopping Kafka consumer.
2019-01-01 19:33:50 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - Shutting down SchedulingSystem...
2019-01-01 19:33:50 ERROR ConnectionPoolMBeanManager {com.netflix.astyanax.connectionpool.impl.ConnectionPoolMBeanManager unregisterMonitor} - com.netflix.MonitoredResources:type=ASTYANAX,name=DevCluster-ConnectionPool,ServiceType=connectionpool
2019-01-01 19:33:50 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - TaskExecutorService was shut down.
2019-01-01 19:33:50 INFO  Scheduler$ {com.pagerduty.scheduler.Scheduler$LoggingImpl trackResourceShutdown} - SchedulingSystem was shut down.
2019-01-01 19:33:50 INFO  SchedulerKafkaConsumer {com.pagerduty.kafkaconsumer.SimpleKafkaConsumer com$pagerduty$kafkaconsumer$SimpleKafkaConsumer$$backoffOnUnhandledExceptionLoop} - Exceptions in the past 60s: 1, max: 5
2019-01-01 19:33:50 ERROR SchedulerKafkaConsumer {com.pagerduty.kafkaconsumer.SimpleKafkaConsumer logExceptionAndDelayRestart} - Unhandled exception, restarting kafka consumer in 129 seconds. Exception class: class akka.pattern.AskTimeoutException
akka.pattern.AskTimeoutException: Recipient[Actor[akka://Scheduler/user/queueSupervisor#119571018]] had already been terminated.
    at akka.pattern.AskableActorRef$.ask$extension(AskSupport.scala:132) ~[akka-actor_2.11-2.3.14.jar:na]
    at akka.pattern.AskableActorRef$.$qmark$extension(AskSupport.scala:144) ~[akka-actor_2.11-2.3.14.jar:na]
    at com.pagerduty.scheduler.akka.SchedulingSystem.persistAndSchedule(SchedulingSystem.scala:92) ~[scheduler_2.11-9.1.2.jar:9.1.2]
    at com.pagerduty.scheduler.SchedulerKafkaConsumer.sendDeserializedRecordsToSchedulingSystem(SchedulerKafkaConsumer.scala:142) ~[scheduler_2.11-9.1.2.jar:9.1.2]
    at com.pagerduty.scheduler.SchedulerKafkaConsumer.processRecords(SchedulerKafkaConsumer.scala:132) ~[scheduler_2.11-9.1.2.jar:9.1.2]
    at com.pagerduty.kafkaconsumer.SimpleKafkaConsumer.pollKafkaConsumer(SimpleKafkaConsumer.scala:273) ~[kafka-consumer_2.11-0.6.1.jar:0.6.1]
    at com.pagerduty.kafkaconsumer.SimpleKafkaConsumer.pollLoop(SimpleKafkaConsumer.scala:267) ~[kafka-consumer_2.11-0.6.1.jar:0.6.1]
    at com.pagerduty.kafkaconsumer.SimpleKafkaConsumer.initializeConsumerAndEnterPollLoop(SimpleKafkaConsumer.scala:252) ~[kafka-consumer_2.11-0.6.1.jar:0.6.1]
    at com.pagerduty.kafkaconsumer.SimpleKafkaConsumer.com$pagerduty$kafkaconsumer$SimpleKafkaConsumer$$backoffOnUnhandledExceptionLoop(SimpleKafkaConsumer.scala:219) ~[kafka-consumer_2.11-0.6.1.jar:0.6.1]
    at com.pagerduty.kafkaconsumer.SimpleKafkaConsumer$$anon$1.run(SimpleKafkaConsumer.scala:127) [kafka-consumer_2.11-0.6.1.jar:0.6.1]

2019-01-01 19:33:54 WARN  Slf4jConnectionPoolMonitorImpl {com.netflix.astyanax.connectionpool.impl.Slf4jConnectionPoolMonitorImpl incOperationFailure} - PoolTimeoutException: [host=20.300.1.1(20.300.1.1):9160, latency=120012(120012), attempts=2]Timed out waiting for connection
2019-01-01 19:33:54 WARN  Slf4jConnectionPoolMonitorImpl {com.netflix.astyanax.connectionpool.impl.Slf4jConnectionPoolMonitorImpl incOperationFailure} - PoolTimeoutException: [host=20.300.1.2(20.300.1.2):9160, latency=120012(120012), attempts=2]Timed out waiting for connection
2019-01-01 19:33:54 ERROR GaugeReporter {com.pagerduty.metrics.gauge.GaugeReporter$$anon$1 run} - Error sampling gauge: 
java.util.concurrent.ExecutionException: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=20.300.1.1(20.300.1.1):9160, latency=120012(120012), attempts=2]Timed out waiting for connection
    at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:476) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:435) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:79) ~[guava-19.0.jar:na]
    at com.pagerduty.eris.FutureConversions$$anon$1.run(FutureConversions.scala:49) ~[eris-core_2.11-2.0.4.jar:2.0.4]
    at com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:456) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:817) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:753) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:634) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:110) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) ~[guava-19.0.jar:na]
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) ~[guava-19.0.jar:na]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
Caused by: com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=20.300.1.1(20.300.1.1):9160, latency=120012(120012), attempts=2]Timed out waiting for connection
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:342) ~[astyanax-core-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1$3.execute(ThriftColumnFamilyQueryImpl.java:263) ~[astyanax-thrift-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1$3$2.call(ThriftColumnFamilyQueryImpl.java:285) ~[astyanax-thrift-3.6.0.jar:3.6.0]
    at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1$3$2.call(ThriftColumnFamilyQueryImpl.java:282) ~[astyanax-thrift-3.6.0.jar:3.6.0]
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) ~[guava-19.0.jar:na]
    ... 5 common frames omitted
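On the restart behavior in the log above: the consumer reports an exception budget ("Exceptions in the past 60s: 1, max: 5") and an increasing restart delay ("restarting kafka consumer in 129 seconds"). The exact schedule is internal to pagerduty/kafka-consumer; purely as an illustration (the base, cap, and doubling here are assumptions, not the library's actual values), a capped exponential backoff looks like:

```scala
// Illustrative capped exponential backoff: delay doubles per failed
// attempt up to a cap. NOT the actual pagerduty/kafka-consumer schedule.
object BackoffSketch {
  def delaySeconds(attempt: Int, baseSeconds: Int = 30, capSeconds: Int = 600): Int =
    // Shift on Long to avoid overflow; clamp the exponent as a guard.
    math.min(capSeconds.toLong, baseSeconds.toLong << math.min(attempt, 20)).toInt

  def main(args: Array[String]): Unit = {
    // Attempts 0..5 give delays 30, 60, 120, 240, 480, 600 seconds.
    println((0 to 5).map(delaySeconds(_)))
  }
}
```

The practical consequence for this issue is that once the queueSupervisor actor has been terminated by the earlier PoolTimeoutException, each consumer restart just hits the dead actor again (the AskTimeoutException above), so the backoff grows and the application never recovers without a full process restart.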