confluentinc / kafka-connect-elasticsearch

Kafka Connect Elasticsearch connector

Ignore 'document_parsing_exception' #740

Open hullarb opened 10 months ago

hullarb commented 10 months ago

Hi All,

We are running connector version 14.0.7 against Elasticsearch 8.10 with data streams, and we have configured the connector to ignore malformed documents. Unfortunately, when Elasticsearch cannot index a document and returns a document_parsing_exception, the connector task fails. Could you add this error to the ignored errors as well? It is also thrown because of a malformed document.

Part of our connector config, for reference:

behavior.on.malformed.documents: IGNORE
behavior.on.null.values: IGNORE
errors.tolerance: all
errors.deadletterqueue.topic.name: failed-ingestion

And a sample Elasticsearch error message that I got after trying to ingest the document manually:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "parsing_exception",
        "reason" : "Failed to parse object: expecting token of type [START_OBJECT] but found [VALUE_STRING]",
        "line" : 1,
        "col" : 234
      }
    ],
    "type" : "document_parsing_exception",
    "reason" : "[1:234] failed to parse field [properties] of type [flattened] in document with id   REDACTED'",
    "caused_by" : {
      "type" : "parsing_exception",
      "reason" : "Failed to parse object: expecting token of type [START_OBJECT] but found [VALUE_STRING]",
      "line" : 1,
      "col" : 234
    }
  },
  "status" : 400
}
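
To make the request concrete, here is a minimal Python sketch of the kind of check I have in mind. It is not the connector's code (the connector is written in Java), and the names `IGNORABLE_ERROR_TYPES` and `is_malformed_document_error` are hypothetical, as is the choice of which other error types sit alongside document_parsing_exception. It only shows that the top-level "type" of a response like the one above could be matched against a set of errors treated as malformed documents:

```python
import json

# Hypothetical set of Elasticsearch error types to treat as "malformed document".
# "document_parsing_exception" is the type this issue asks to have ignored;
# "mapper_parsing_exception" is included only as an illustrative neighbour.
IGNORABLE_ERROR_TYPES = {
    "document_parsing_exception",
    "mapper_parsing_exception",
}


def is_malformed_document_error(response_body: str) -> bool:
    """Return True if the top-level error type is in the ignorable set."""
    error = json.loads(response_body).get("error", {})
    if not isinstance(error, dict):
        # Some error responses carry a plain string instead of an object.
        return False
    return error.get("type") in IGNORABLE_ERROR_TYPES


# Trimmed-down version of the error response shown above.
sample = json.dumps({
    "error": {
        "type": "document_parsing_exception",
        "caused_by": {"type": "parsing_exception"},
    },
    "status": 400,
})

if __name__ == "__main__":
    print(is_malformed_document_error(sample))
```

With a check like this, a record that triggers document_parsing_exception would be handled according to behavior.on.malformed.documents instead of failing the task.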

Thanks, Bela

hullarb commented 9 months ago

I've opened a PR with a possible fix for this: https://github.com/confluentinc/kafka-connect-elasticsearch/pull/748