Graylog2 / graylog-plugin-integrations

A collection of open source Graylog integrations that will be released together.

Error while starting Kinesis Input #396

Open lingpri opened 4 years ago

lingpri commented 4 years ago

Description

Error while starting Kinesis Input in local dev environment.

Steps To Reproduce

  1. Set up a Kinesis/CloudWatch Input connected to the integration-flowlogs stream in a local environment, with the auth type set to key and secret.
  2. Leave all other configuration at its defaults.
  3. The Input stays in a running state for a while; once messages start coming in, the following error occurs:
2020-02-06 21:49:23,434 ERROR: software.amazon.kinesis.leases.ShardSyncTask - Caught exception while sync'ing Kinesis shards and leases
java.lang.RuntimeException: software.amazon.awssdk.core.exception.SdkClientException: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.
Consider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.
Increasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.
If the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests.
    at software.amazon.kinesis.retrieval.AWSExceptionManager.apply(AWSExceptionManager.java:65) ~[amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.leases.KinesisShardDetector.listShards(KinesisShardDetector.java:197) ~[amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.leases.KinesisShardDetector.listShards(KinesisShardDetector.java:157) ~[amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.leases.HierarchicalShardSyncer.getShardList(HierarchicalShardSyncer.java:257) ~[amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.leases.HierarchicalShardSyncer.checkAndCreateLeaseForNewShards(HierarchicalShardSyncer.java:81) ~[amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.leases.ShardSyncTask.call(ShardSyncTask.java:67) [amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.metrics.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:53) [amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.coordinator.Scheduler.initialize(Scheduler.java:271) [amazon-kinesis-client-2.2.6.jar:?]
    at software.amazon.kinesis.coordinator.Scheduler.run(Scheduler.java:235) [amazon-kinesis-client-2.2.6.jar:?]
    at org.graylog.integrations.aws.transports.KinesisConsumer.run(KinesisConsumer.java:134) [classes/:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_242]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.
Consider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.
Increasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.
If the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests.
    at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:97) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.core.internal.util.ThrowableUtils.asSdkException(ThrowableUtils.java:98) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.retryIfNeeded(AsyncRetryableStage.java:125) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryExecutor.lambda$execute$0(AsyncRetryableStage.java:107) ~[bundle-2.10.41.jar:?]
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_242]
    at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) ~[?:1.8.0_242]
    at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage$ResponseHandler.onError(MakeAsyncHttpRequestStage.java:249) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.handleFailure(NettyRequestExecutor.java:263) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.makeRequestListener(NettyRequestExecutor.java:140) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.access$200(DefaultPromise.java:35) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:501) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[bundle-2.10.41.jar:?]
    ... 1 more
Caused by: java.lang.Throwable: Acquire operation took longer than the configured maximum time. This indicates that a request cannot get a connection from the pool within the specified maximum time. This can be due to high request rate.
Consider taking any of the following actions to mitigate the issue: increase max connections, increase acquire timeout, or slowing the request rate.
Increasing the max connections can increase client throughput (unless the network interface is already fully utilized), but can eventually start to hit operation system limitations on the number of file descriptors used by the process. If you already are fully utilizing your network interface or cannot further increase your connection count, increasing the acquire timeout gives extra time for requests to acquire a connection before timing out. If the connections doesn't free up, the subsequent requests will still timeout.
If the above mechanisms are not able to fix the issue, try smoothing out your requests so that large traffic bursts cannot overload the client, being more efficient with the number of times you need to call AWS, or by increasing the number of hosts sending requests.
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.decorateException(NettyRequestExecutor.java:269) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.handleFailure(NettyRequestExecutor.java:262) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.http.nio.netty.internal.NettyRequestExecutor.makeRequestListener(NettyRequestExecutor.java:140) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise.access$200(DefaultPromise.java:35) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.DefaultPromise$1.run(DefaultPromise.java:501) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044) ~[bundle-2.10.41.jar:?]
    at software.amazon.awssdk.thirdparty.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[bundle-2.10.41.jar:?]
    ... 1 more
Caused by: java.util.concurrent.TimeoutException: Acquire operation took longer than 10000 milliseconds.
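
For context, the acquire-timeout message above comes from the connection pool of the AWS SDK v2 Netty async HTTP client. The two mitigations the SDK suggests ("increase max connections", "increase acquire timeout") correspond to builder settings on that client. The sketch below only illustrates where those knobs live in the SDK; it is not how the Graylog input currently builds its client, and the class name, region, and values are placeholder assumptions:

    import java.time.Duration;

    import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
    import software.amazon.awssdk.regions.Region;
    import software.amazon.awssdk.services.kinesis.KinesisAsyncClient;

    public class KinesisClientTuningSketch {
        public static void main(String[] args) {
            // Placeholder values; tune for your environment. These map to the two
            // mitigations named in the SDK error message.
            KinesisAsyncClient kinesis = KinesisAsyncClient.builder()
                    .region(Region.US_EAST_1) // placeholder region
                    .httpClientBuilder(NettyNioAsyncHttpClient.builder()
                            .maxConcurrency(200)                                   // "increase max connections"
                            .connectionAcquisitionTimeout(Duration.ofSeconds(60))) // "increase acquire timeout"
                    .build();

            kinesis.close();
        }
    }

Whether (and how) these limits should be exposed through the Graylog input configuration is a separate question; this only shows which SDK settings the error message is referring to.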

Environment

lingpri commented 4 years ago

@danotorrey @ceruleancee - Please advise; any thoughts on the issue above?

lingpri commented 4 years ago

Example of how the Kinesis input is configured:

[screenshot: kinesis_input]

danotorrey commented 4 years ago

Very interesting. I've never seen this error before; it looks like some kind of timeout. Is the error still reproducible, or could it have been a temporary issue?