confluentinc / kafka-connect-jdbc

Kafka Connect connector for JDBC-compatible databases

Add support for dead-letter queue and ignoring bad messages #721

Open rmoff opened 5 years ago

rmoff commented 5 years ago

It would be useful if the JDBC sink connector had better support for when messages are read that cannot be written to the target database. This includes errors such as missing mandatory fields, wrong data types, data lengths exceeding the target column size, key violations, etc.

At the moment the connector will abort and the user has to somehow fix the pipeline.

It would be useful to add:

rmoff commented 5 years ago

Related: https://rmoff.net/2019/10/15/skipping-bad-records-with-the-kafka-connect-jdbc-sink-connector/
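For context, the workaround in that post leans on the Connect framework's built-in error handling rather than anything JDBC-specific. A minimal sketch of the relevant sink-connector properties (the DLQ topic name is a placeholder):

```properties
# Tolerate errant records instead of failing the task
errors.tolerance=all
# Route failed records to a dead-letter topic (sink connectors only)
errors.deadletterqueue.topic.name=dlq_jdbc_sink
# Attach error context (topic, offset, exception) as record headers
errors.deadletterqueue.context.headers.enable=true
# Also log the failures
errors.log.enable=true
errors.log.include.messages=true
```

Note that these settings only cover the convert/transform stages of the pipeline; errors raised while writing to the database in `put()` (the subject of this issue) are not routed to the DLQ by the framework.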

reneveyj commented 4 years ago

Any workaround for this issue?

louisburton commented 4 years ago

Perhaps this enhancement would help alleviate this issue in the short term? https://github.com/confluentinc/kafka-connect-jdbc/pull/765

Simbamoureux commented 4 years ago

Any news on this topic?

mzbyszynski commented 4 years ago

Looks like there is another PR that addresses this: #890

icecold21 commented 4 years ago

This would be very useful. Any update on this?

G3zz commented 4 years ago

It looks like the issue was merged. Any idea when we can expect a new release?

NarasimhaKattunga commented 4 years ago

We have a similar requirement to skip bad records into a DLQ and continue the task as-is.

When will this patch be available?

whatsupbros commented 3 years ago

So, what is actually the relation between this ticket and #890, which, as I can see, was already merged to the confluentinc:10.0.x branch but then reverted in #966? (What was the reason for that?)

I checked in Confluent Platform 6.0.1 with JDBC Connector 10.0.1, and it is still not possible to reroute bad messages that cause an exception on the sink database to a DLQ topic.

Any info on when DLQ support is going to be added to the JDBC Sink Connector?

maurolscla commented 3 years ago

Any updates on this?

rmoff commented 3 years ago

See https://github.com/confluentinc/kafka-connect-jdbc/pull/999

aakashnshah commented 3 years ago

Copied over from the Confluent forum:

You are correct, there was an initial implementation that was merged and later reverted. The reasoning behind this was that we wished to provide more robust testing as well as expand the various scenarios/circumstances in which the error reporting functionality could report errant records. The PR mentioned by @rmoff above is the new PR for this functionality and once merged, should be included in the next release of the connector. Let me know if you have any other questions!

tplazaro commented 3 years ago

Was this feature included in #999? We're using 10.2.2 with errors.tolerance=all, but during SQLException (constraint violation), it went to DLQ but didn't commit to the latest offset. Is this expected?

alozano3 commented 2 years ago

> Was this feature included in #999? We're using 10.2.2 with errors.tolerance=all, but during SQLException (constraint violation), it went to DLQ but didn't commit to the latest offset. Is this expected?

Same. We're using 10.1.0 and the offset is not committed when there is a constraint violation.

bertrandcedric commented 2 years ago

> Was this feature included in #999? We're using 10.2.2 with errors.tolerance=all, but during SQLException (constraint violation), it went to DLQ but didn't commit to the latest offset. Is this expected?
>
> Same. We're using 10.1.0 and the offset is not committed when there is a constraint violation.

We have the same issue using the connect jdbc 10.2.0.

MKhurana1208 commented 2 years ago

Hi, we are facing the same issue: offsets are not committed on SQLException. Is there any workaround to check whether the messages are in the DLQ? Thanks, Meenu
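One way to inspect what landed in the DLQ: when `errors.deadletterqueue.context.headers.enable=true`, Connect attaches error-context headers prefixed with `__connect.errors.` to each DLQ record. A hedged sketch of decoding them; the header names are from the Connect framework, but the record shown is illustrative, and wiring up an actual consumer is left out:

```python
# Summarize Kafka Connect DLQ context headers (written when
# errors.deadletterqueue.context.headers.enable=true). Consumer clients
# typically expose headers as (name, value-bytes) pairs.

PREFIX = "__connect.errors."

def summarize_dlq_headers(headers):
    """Map Connect error-context headers from one DLQ record to a plain dict."""
    summary = {}
    for name, value in headers:
        if name.startswith(PREFIX):
            summary[name[len(PREFIX):]] = value.decode("utf-8", errors="replace")
    return summary

# Headers as Connect would attach them (values here are made up):
record_headers = [
    ("__connect.errors.topic", b"orders"),
    ("__connect.errors.partition", b"0"),
    ("__connect.errors.offset", b"42"),
    ("__connect.errors.exception.message", b"ORA-00001: unique constraint violated"),
]
print(summarize_dlq_headers(record_headers)["offset"])  # prints 42
```

This at least lets you correlate a DLQ record back to its source topic/partition/offset when checking whether a given message made it into the DLQ.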

denis019 commented 2 years ago

Hi all, I tested with:

Messages that could not be processed are stored in the DLQ.

cstmgl commented 1 year ago

Hi all, any update on this? I'm trying to do further testing, but I don't understand how errors in the sink/put stage of the connector can end up in the DLQ while at the same time the offset does not move. For me it would be ideal if the connector failed in these cases, but that seems not to be the case. And if it does fail, I don't get why it still floods the DLQ topic.

Can someone explain the current implementation/expectations? Is there a plan to implement this enhancement? I think we would all like to be able to define which types of errors should be sent to the DLQ and which should simply make the connector fail or ignore them. Actually, this might be a request for the Kafka framework itself rather than for an implementation at the individual connector level.

While I understand that each connector plugin is expected to handle this on its own, I still think the Kafka framework could be improved here, so I raised an improvement request; if someone else finds this interesting, feel free to vote for it: https://issues.apache.org/jira/browse/KAFKA-14699

roadSurfer commented 1 year ago

I have just had to work around this problem, luckily on a test system so I was able to experiment. I had to:

  1. Delete the errant connector
  2. Ensure Kafka Connect was configured with CONNECT_CONNECTOR_CLIENT_CONFIG_OVERRIDE_POLICY: All
  3. Set this in the connector config: "consumer.override.max.poll.records": 1, "batch.size": 1
  4. Register the connector again
  5. Advance the offset past the problem event, restart the connector, wait for next failure, advance, restart, wait...
    • (This was scripted, obviously)
  6. Undo the configuration changes

I had to change the polling/batch config (the default is 500, IIRC) because I had multiple bad events to deal with and did not know how they were spaced. If that isn't a concern for you, you can skip that part. I don't like not having the poison events in a DLQ; I guess I could modify the script to handle that.
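A sketch of what the override from steps 2–3 looks like when the sink is re-registered via the Connect REST API; the connector name and everything besides the two overrides are placeholders for your own config:

```json
{
  "name": "jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "topics": "orders",
    "connection.url": "jdbc:postgresql://db:5432/mydb",
    "consumer.override.max.poll.records": "1",
    "batch.size": "1"
  }
}
```

For step 5, with the connector stopped, the group offset can be advanced with something like `kafka-consumer-groups.sh --bootstrap-server <broker> --group connect-<name> --topic <topic>:<partition> --reset-offsets --shift-by 1 --execute` (sink connectors use the consumer group `connect-<connector-name>` by default).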

It would be great to know if/when this is coming, or what is required to advance it.

gphilos commented 1 month ago

Hi all,

Any updates on this? We are facing the same issue: offsets are not committed on SQLException, but messages are published to the DLQ, using kafka-connect-jdbc:10.6.4.

The behavior is rather strange: the invalid message is sent to the DLQ before all retries are exhausted. And even when the retries end, the offset is not committed until another message is processed.
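For what it's worth, the retry behaviour involved here is governed by two separate sets of knobs, if I read the configuration references correctly; a sketch with the documented defaults:

```properties
# JDBC sink's own retries for SQLExceptions raised in put():
max.retries=10
retry.backoff.ms=3000
# Connect framework retries, which apply to the convert/transform stages:
errors.retry.timeout=0
errors.retry.delay.max.ms=60000
```

That split may explain the confusing interleaving of DLQ writes and retries: the framework's error handler and the connector's internal retry loop operate independently.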