confluentinc / kafka-connect-elasticsearch

Kafka Connect Elasticsearch connector
Other
15 stars 436 forks source link

Connector fails with payloads >20 MB #729

Open philipp94831 opened 11 months ago

philipp94831 commented 11 months ago

When upgrading from v14.0.3 to v14.0.11, we are no longer able to load records larger than 20MB to Elasticsearch. This has been possible before and is needed for our data pipeline. The limit has presumably been introduced by Jackson as it has been already increased with #716. Is it possible to remove the limit and restore the previous behavior?

This has likely been introduced with v14.0.8

org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
    at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:618)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:336)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:237)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:206)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:204)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:259)
    at org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:181)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.kafka.connect.errors.ConnectException: Bulk request failed
    at io.confluent.connect.elasticsearch.ElasticsearchClient$1.afterBulk(ElasticsearchClient.java:443)
    at org.elasticsearch.action.bulk.BulkRequestHandler$1.onFailure(BulkRequestHandler.java:64)
    at org.elasticsearch.action.ActionListener$Delegating.onFailure(ActionListener.java:66)
    at org.elasticsearch.action.ActionListener$RunAfterActionListener.onFailure(ActionListener.java:350)
    at org.elasticsearch.action.ActionListener$Delegating.onFailure(ActionListener.java:66)
    at org.elasticsearch.action.bulk.Retry$RetryHandler.onFailure(Retry.java:123)
    at io.confluent.connect.elasticsearch.ElasticsearchClient.lambda$null$1(ElasticsearchClient.java:216)
    ... 5 more
Caused by: org.apache.kafka.connect.errors.ConnectException: Failed to execute bulk request due to 'com.fasterxml.jackson.core.exc.StreamConstraintsException: String length (20054016) exceeds the maximum length (20000000)' after 6 attempt(s)
    at io.confluent.connect.elasticsearch.RetryUtil.callWithRetries(RetryUtil.java:165)
    at io.confluent.connect.elasticsearch.RetryUtil.callWithRetries(RetryUtil.java:119)
    at io.confluent.connect.elasticsearch.ElasticsearchClient.callWithRetries(ElasticsearchClient.java:490)
    at io.confluent.connect.elasticsearch.ElasticsearchClient.lambda$null$1(ElasticsearchClient.java:210)
    ... 5 more
Caused by: com.fasterxml.jackson.core.exc.StreamConstraintsException: String length (20054016) exceeds the maximum length (20000000)
    at com.fasterxml.jackson.core.StreamReadConstraints.validateStringLength(StreamReadConstraints.java:324)
    at com.fasterxml.jackson.core.util.ReadConstrainedTextBuffer.validateStringLength(ReadConstrainedTextBuffer.java:27)
    at com.fasterxml.jackson.core.util.TextBuffer.finishCurrentSegment(TextBuffer.java:939)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2584)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2529)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getTextCharacters(UTF8StreamJsonParser.java:487)
    at com.fasterxml.jackson.core.JsonGenerator._copyCurrentStringValue(JsonGenerator.java:2777)
    at com.fasterxml.jackson.core.JsonGenerator._copyCurrentContents(JsonGenerator.java:2668)
    at com.fasterxml.jackson.core.JsonGenerator.copyCurrentStructure(JsonGenerator.java:2619)
    at org.elasticsearch.xcontent.json.JsonXContentGenerator.copyCurrentStructure(JsonXContentGenerator.java:396)
    at org.elasticsearch.xcontent.XContentBuilder.copyCurrentStructure(XContentBuilder.java:1090)
    at org.elasticsearch.action.update.UpdateRequest.toXContent(UpdateRequest.java:1001)
    at org.elasticsearch.common.xcontent.XContentHelper.toXContent(XContentHelper.java:462)
    at org.elasticsearch.common.xcontent.XContentHelper.toXContent(XContentHelper.java:447)
    at org.elasticsearch.client.RequestConverters.bulk(RequestConverters.java:244)
    at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:2167)
    at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:2137)
    at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:2105)
    at org.elasticsearch.client.RestHighLevelClient.bulk(RestHighLevelClient.java:620)
    at io.confluent.connect.elasticsearch.ElasticsearchClient.lambda$null$0(ElasticsearchClient.java:212)
    at io.confluent.connect.elasticsearch.RetryUtil.callWithRetries(RetryUtil.java:158)
    ... 8 more
    Suppressed: java.lang.IllegalStateException: Failed to close the XContentBuilder
        at org.elasticsearch.xcontent.XContentBuilder.close(XContentBuilder.java:1131)
        at org.elasticsearch.common.xcontent.XContentHelper.toXContent(XContentHelper.java:457)
        ... 16 more
    Caused by: java.io.IOException: Unclosed object or array found
        at org.elasticsearch.xcontent.json.JsonXContentGenerator.close(JsonXContentGenerator.java:456)
        at org.elasticsearch.xcontent.XContentBuilder.close(XContentBuilder.java:1129)
        ... 17 more
philipp94831 commented 10 months ago

The limit can be configured via https://github.com/FasterXML/jackson-core/issues/863#issuecomment-1527630381. A configuration property would be appreciated