delta-io / kafka-delta-ingest

A highly efficient daemon for streaming data from Kafka into Delta Lake
Apache License 2.0

Introduce recover_dynamodb_lock #79

Closed by mosyp 2 years ago

mosyp commented 2 years ago

See docs at https://github.com/delta-io/kafka-delta-ingest/pull/79/files#diff-537aa1eb75dd77e0cfc4b6c45bbe0ec80fce6e3c9c8dbf054e8552beb8d746eaR18-R35

mosyp commented 2 years ago

Unrelated to this PR: @xianwill, this is the test error with the default offset that I was talking about. Once in a while it can be reproduced even in CI: https://github.com/delta-io/kafka-delta-ingest/runs/3672951682

For example, see the fix at https://github.com/delta-io/kafka-delta-ingest/tree/default-offset

mosyp commented 2 years ago

As it turned out, this is only useful for the drop/create scenario. There is also no 100% guarantee that it works with multiple writers; it is only best effort, based on timestamps.

Closing this, as people should stop KDI before doing a drop/create to ensure that the Delta store won't be corrupted.