lensesio / stream-reactor

A collection of open source Apache 2.0 Kafka Connectors maintained by Lenses.io.
https://lenses.io
Apache License 2.0

S3 Sink Connector fails with an UnknownHostException for the s3 bucket #1133

Open aubreyand opened 2 months ago

aubreyand commented 2 months ago

Issue Guidelines

Please review these questions before submitting any issue.

What version of the Stream Reactor are you reporting this issue for?

AWS S3 Sink 6.3.0

Are you running the correct version of Kafka/Confluent for the Stream reactor release?

Yes

Do you have a supported version of the data source/sink, i.e. Cassandra 3.0.9?

Yes

Have you read the docs?

Yes

What is the expected behaviour?

The connector sinks the configured topics to the specified S3 bucket.

What was observed?

The connector fails repeatedly with java.net.UnknownHostException: redacted.s3.amazonaws.com. The connector creates a consumer group, but never manages to set any offsets.

[Worker-04e21bde9f2091bc6] [2024-04-10 21:33:18,920] ERROR [kafka-production-lenses-s3-sink-2|task-2] Error retrieving listing (io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager:145)
[Worker-04e21bde9f2091bc6] software.amazon.awssdk.core.exception.SdkClientException: Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.awscore.interceptor.HelpfulUnknownHostExceptionInterceptor.modifyException(HelpfulUnknownHostExceptionInterceptor.java:59)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.interceptor.ExecutionInterceptorChain.modifyException(ExecutionInterceptorChain.java:202)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.ExceptionReportingUtils.runModifyException(ExceptionReportingUtils.java:54)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.ExceptionReportingUtils.reportFailureToInterceptors(ExceptionReportingUtils.java:38)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:39)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:171)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:82)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:179)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:76)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:56)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.services.s3.DefaultS3Client.listObjectsV2(DefaultS3Client.java:6428)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.services.s3.paginators.ListObjectsV2Iterable$ListObjectsV2ResponseFetcher.nextPage(ListObjectsV2Iterable.java:153)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.services.s3.paginators.ListObjectsV2Iterable$ListObjectsV2ResponseFetcher.nextPage(ListObjectsV2Iterable.java:144)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.pagination.sync.PaginatedResponsesIterator.next(PaginatedResponsesIterator.java:58)
[Worker-04e21bde9f2091bc6]  at scala.collection.convert.JavaCollectionWrappers$JIteratorWrapper.next(JavaCollectionWrappers.scala:42)
[Worker-04e21bde9f2091bc6]  at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:594)
[Worker-04e21bde9f2091bc6]  at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:608)
[Worker-04e21bde9f2091bc6]  at scala.collection.immutable.List.prependedAll(List.scala:152)
[Worker-04e21bde9f2091bc6]  at scala.collection.immutable.List$.from(List.scala:684)
[Worker-04e21bde9f2091bc6]  at scala.collection.immutable.List$.from(List.scala:681)
[Worker-04e21bde9f2091bc6]  at scala.collection.SeqFactory$Delegate.from(Factory.scala:306)
[Worker-04e21bde9f2091bc6]  at scala.collection.immutable.Seq$.from(Seq.scala:42)
[Worker-04e21bde9f2091bc6]  at scala.collection.IterableOnceOps.toSeq(IterableOnce.scala:1300)
[Worker-04e21bde9f2091bc6]  at scala.collection.IterableOnceOps.toSeq$(IterableOnce.scala:1300)
[Worker-04e21bde9f2091bc6]  at scala.collection.AbstractIterator.toSeq(Iterator.scala:1300)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.storage.AwsS3StorageInterface.$anonfun$listRecursive$1(AwsS3StorageInterface.scala:86)
[Worker-04e21bde9f2091bc6]  at scala.util.Try$.apply(Try.scala:210)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.storage.AwsS3StorageInterface.listRecursive(AwsS3StorageInterface.scala:77)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.seek.IndexManager.seek(IndexManager.scala:142)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.S3WriterManager.$anonfun$seekOffsetsForTopicPartition$2(S3WriterManager.scala:151)
[Worker-04e21bde9f2091bc6]  at scala.util.Either.flatMap(Either.scala:352)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.S3WriterManager.$anonfun$seekOffsetsForTopicPartition$1(S3WriterManager.scala:150)
[Worker-04e21bde9f2091bc6]  at scala.util.Either.flatMap(Either.scala:352)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.S3WriterManager.seekOffsetsForTopicPartition(S3WriterManager.scala:149)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.S3WriterManager.$anonfun$open$1(S3WriterManager.scala:132)
[Worker-04e21bde9f2091bc6]  at scala.collection.StrictOptimizedIterableOps.map(StrictOptimizedIterableOps.scala:100)
[Worker-04e21bde9f2091bc6]  at scala.collection.StrictOptimizedIterableOps.map$(StrictOptimizedIterableOps.scala:87)
[Worker-04e21bde9f2091bc6]  at scala.collection.immutable.HashSet.map(HashSet.scala:34)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.S3WriterManager.open(S3WriterManager.scala:132)
[Worker-04e21bde9f2091bc6]  at io.lenses.streamreactor.connect.aws.s3.sink.S3SinkTask.open(S3SinkTask.scala:222)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask.openPartitions(WorkerSinkTask.java:640)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask.access$1100(WorkerSinkTask.java:71)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask$HandleRebalance.onPartitionsAssigned(WorkerSinkTask.java:705)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.invokePartitionsAssigned(ConsumerCoordinator.java:293)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.onJoinComplete(ConsumerCoordinator.java:430)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.joinGroupIfNeeded(AbstractCoordinator.java:449)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:365)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:508)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.KafkaConsumer.updateAssignmentMetadataIfNeeded(KafkaConsumer.java:1257)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1226)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1206)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask.pollConsumer(WorkerSinkTask.java:457)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:324)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:232)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:201)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:189)
[Worker-04e21bde9f2091bc6]  at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:238)
[Worker-04e21bde9f2091bc6]  at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
[Worker-04e21bde9f2091bc6]  at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[Worker-04e21bde9f2091bc6]  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[Worker-04e21bde9f2091bc6]  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[Worker-04e21bde9f2091bc6]  at java.base/java.lang.Thread.run(Thread.java:829)
[Worker-04e21bde9f2091bc6] Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: redacted.s3.amazonaws.com
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:223)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:83)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
[Worker-04e21bde9f2091bc6]  ... 61 more
[Worker-04e21bde9f2091bc6] Caused by: java.net.UnknownHostException: redacted.s3.amazonaws.com
[Worker-04e21bde9f2091bc6]  at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:797)
[Worker-04e21bde9f2091bc6]  at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1533)
[Worker-04e21bde9f2091bc6]  at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1386)
[Worker-04e21bde9f2091bc6]  at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1307)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.conn.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:45)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.http.apache.internal.conn.ClientConnectionManagerFactory$DelegatingHttpClientConnectionManager.connect(ClientConnectionManagerFactory.java:86)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
[Worker-04e21bde9f2091bc6]  at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.http.apache.internal.impl.ApacheSdkHttpClient.execute(ApacheSdkHttpClient.java:72)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.http.apache.ApacheHttpClient.execute(ApacheHttpClient.java:254)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.http.apache.ApacheHttpClient.access$500(ApacheHttpClient.java:104)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:231)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.http.apache.ApacheHttpClient$1.call(ApacheHttpClient.java:228)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.util.MetricUtils.measureDurationUnsafe(MetricUtils.java:63)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.executeHttpRequest(MakeHttpRequestStage.java:77)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:56)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeHttpRequestStage.execute(MakeHttpRequestStage.java:39)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36)
[Worker-04e21bde9f2091bc6]  at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81)
[Worker-04e21bde9f2091bc6]  ... 73 more
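
The innermost cause in the trace above is a plain DNS lookup failure (`InetAddress.getAllByName`), so one way to narrow it down is to test name resolution from the same network as the Connect workers. The following is only a minimal sketch of such a check; the function name `resolve_host` is hypothetical, and the script assumes it can be run from a host or container that shares the workers' DNS settings.

```python
import socket
from typing import Callable, List


def resolve_host(host: str, resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted unique IP addresses for `host`, or [] if the lookup fails.

    `resolver` is injectable only so the function can be exercised without
    network access; by default the system resolver is used.
    """
    try:
        infos = resolver(host, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Python's counterpart to java.net.UnknownHostException.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    import sys

    # Pass the hostname from the "Caused by" line as a command-line argument.
    for name in sys.argv[1:]:
        addrs = resolve_host(name)
        print(f"{name}: {', '.join(addrs) if addrs else 'UNRESOLVED'}")
```

If the hostname resolves here but the connector still fails, the problem is more likely in the worker's own DNS or network configuration than in the bucket name.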

What is your Connect cluster configuration (connect-avro-distributed.properties)?

auto.create.topics.enable=true
default.replication.factor=3
min.insync.replicas=2
num.io.threads=8
num.network.threads=5
num.partitions=50
num.replica.fetchers=2
replica.lag.time.max.ms=30000
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
socket.send.buffer.bytes=102400
unclean.leader.election.enable=false
zookeeper.session.timeout.ms=18000
log.retention.bytes=-1
log.retention.hours=-1
log.retention.minutes=-1
log.retention.ms=-1
log.retention.bytes=-1
delete.topic.enable=false
log.cleanup.policy=delete

What is your connector properties configuration (my-connector.properties)?

connector.class=io.lenses.streamreactor.connect.aws.s3.sink.S3SinkConnector
connect.s3.kcql=INSERT INTO redacted:lenses-topics SELECT * FROM `*` STOREAS `AVRO` PROPERTIES('store.envelope'=true)
tasks.max=1
connect.s3.aws.region=us-east-1
store.envelope=true
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
topics.regex=[^_].*
header.converter=org.apache.kafka.connect.converters.ByteArrayConverter
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter

Please provide full log files (redact any sensitive information)

stheppi commented 2 months ago

Hi @aubreyand,

We're eager to help and understand the root cause, since we see many users experiencing the same issue.

Could you please clarify if you're using MSK Connect or vanilla Connect? Also, it would be beneficial to know if virtual hosting is enabled.

Additionally, we'd appreciate it if you could join our Slack channel here to discuss further. It would be helpful to see the form of the actual hostname rather than just "redacted" (with the real bucket name masked, of course).
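
For context on the virtual hosting question, the sketch below lists the hostname forms an S3 client may try to resolve for a given bucket and region. The helper name `s3_endpoint_hosts` and the bucket name `my-bucket` are hypothetical placeholders; the patterns are the standard public AWS endpoint formats and do not cover custom endpoints or VPC endpoints. The host in the reported trace, `redacted.s3.amazonaws.com`, matches the global virtual-hosted form.

```python
from typing import Dict


def s3_endpoint_hosts(bucket: str, region: str) -> Dict[str, str]:
    """Return common S3 endpoint hostnames for a bucket (illustrative only)."""
    return {
        # Virtual-hosted style, global endpoint (the form seen in the trace).
        "virtual_hosted_global": f"{bucket}.s3.amazonaws.com",
        # Virtual-hosted style, regional endpoint.
        "virtual_hosted_regional": f"{bucket}.s3.{region}.amazonaws.com",
        # Path-style: the bucket goes in the URL path, not in the hostname.
        "path_style_regional": f"s3.{region}.amazonaws.com",
    }


if __name__ == "__main__":
    for style, host in s3_endpoint_hosts("my-bucket", "us-east-1").items():
        print(f"{style}: {host}")
```

With virtual hosting, the bucket name becomes part of the DNS name, so an `UnknownHostException` can be specific to a single bucket even when the regional S3 endpoint itself resolves.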

JKCai commented 3 weeks ago

Hey @stheppi, I encountered this issue too, but my experience was a bit different.

I'm running with MSK Connect.

This happened when I ran one sink connector to back up objects to a bucket in xx-region01, and then ran another sink connector to back up objects in xx-region02.

After that, I created a third sink connector to back up objects to a bucket in xx-region01 again. The third sink connector failed with the message "Received an UnknownHostException when attempting to interact with a service." The log trace is the same as above. I made multiple attempts and they all failed.

How I resolved this:

I am not sure what could be wrong, as the config is the same. Even the "new plugin" was created using the same version of lenses.connect.