memiiso / debezium-server-iceberg

Replicates any database (CDC events) to Apache Iceberg (To Cloud Storage)
Apache License 2.0

Debezium read from kafka source and write to iceberg #154

Open metalshanked opened 1 year ago

metalshanked commented 1 year ago

Apologies for the basic question, but is there a way to use this library to stream from an existing Kafka topic and write to an Iceberg table (i.e. with no RDBMS involved)?

Kafka topic --> Debezium Iceberg sink --> Iceberg table (S3)

Currently there are not many options for writing to Iceberg apart from the raw Java API, Spark, Flink, and Trino, so this would be a game changer if possible.

Thanks!

ismailsimsek commented 1 year ago

That's a great idea. I found one project here: https://github.com/getindata/kafka-connect-iceberg-sink

article here

The same request has been made on the Iceberg project.

metalshanked commented 1 year ago

Thanks @ismailsimsek. That seems like a great solution, but it requires Kafka Connect.
Standalone Debezium with vanilla Kafka would be awesome.

ismailsimsek commented 1 year ago

There is now a Kafka Connect sink for Iceberg developed by tabular.io.

Blog: https://tabular.io/blog/intro-kafka-connect/

Repo: https://github.com/tabular-io/iceberg-kafka-connect