This project adds an Iceberg consumer to Debezium Server. It can be used to replicate database changes (CDC) to the cloud as Iceberg tables in real time, without requiring Spark, Kafka, or a streaming platform. Data can be consumed in append or upsert mode.
For more details, refer to the Documentation Page. To understand the current limitations and their workarounds, also review the Caveats section.
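To make the difference between the two modes concrete, here is a toy in-memory illustration. It is not the consumer's implementation: the event shape, the `id` key, and the way deletes are handled are assumptions made for the example (the `op` codes `c`, `u`, `d` follow Debezium's create, update, delete convention), and the real consumer may treat deletes differently.

```python
def apply_append(events):
    """Append mode (toy version): every change event is kept as its own row."""
    return [dict(event) for event in events]


def apply_upsert(events, key="id"):
    """Upsert mode (toy version): keep only the latest state per key."""
    table = {}
    for event in events:
        if event["op"] == "d":
            # Assumption for this sketch: a delete removes the row.
            table.pop(event[key], None)
        else:
            # Creates and updates overwrite the row stored under the same key.
            table[event[key]] = {k: v for k, v in event.items() if k != "op"}
    return list(table.values())


events = [
    {"op": "c", "id": 1, "name": "alice"},
    {"op": "c", "id": 2, "name": "bob"},
    {"op": "u", "id": 1, "name": "alicia"},
    {"op": "d", "id": 2, "name": "bob"},
]

print(len(apply_append(events)))  # 4
print(apply_upsert(events))       # [{'id': 1, 'name': 'alicia'}]
```

Append mode preserves the full change history, while upsert mode leaves the table mirroring the current state of the source.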
git clone https://github.com/memiiso/debezium-server-iceberg.git
mvn -Passembly -Dmaven.test.skip package
unzip debezium-server-iceberg-dist/target/debezium-server-iceberg-dist*.zip -d appdist
cd appdist
Create the application.properties file and configure it:
nano conf/application.properties
You can check the example configuration in application.properties.example.
bash run.sh
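The clone, build, and unzip steps above could also be scripted. The following is a minimal sketch, assuming `git`, `mvn`, and `unzip` are on the PATH and that the repository is cloned into the default `debezium-server-iceberg` directory. The names `install_commands`, `run_install`, and `dry_run` are hypothetical helpers for this example, not part of the project. It stops before the configuration step, which has to be done by hand.

```python
import subprocess

REPO_URL = "https://github.com/memiiso/debezium-server-iceberg.git"
REPO_DIR = "debezium-server-iceberg"  # default directory created by git clone


def install_commands(dist_dir="appdist"):
    """Return (working directory, shell command) pairs mirroring the manual steps."""
    return [
        (".", f"git clone {REPO_URL}"),
        (REPO_DIR, "mvn -Passembly -Dmaven.test.skip package"),
        (
            REPO_DIR,
            "unzip debezium-server-iceberg-dist/target/"
            f"debezium-server-iceberg-dist*.zip -d {dist_dir}",
        ),
    ]


def run_install(dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, command in install_commands():
        print(f"[{cwd}] {command}")
        if not dry_run:
            # shell=True so the shell expands the * in the zip file name.
            subprocess.run(command, shell=True, check=True, cwd=cwd)


if __name__ == "__main__":
    run_install()  # dry run: only prints the commands
```

Calling `run_install(dry_run=False)` would execute the steps; afterwards the distribution is under `debezium-server-iceberg/appdist`, ready to be configured.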
Debezium Server can also be run and operated from Python.
Example:
pip install git+https://github.com/memiiso/debezium-server-iceberg.git@master#subdirectory=python
debezium
# running with custom arguments
debezium --debezium_dir=/my/debezium_server/dir/ --java_home=/my/java/homedir/
from debezium import Debezium
d = Debezium(debezium_dir="/dbz/server/dir", java_home='/java/home/dir')
java_args = []
java_args.append("-Dquarkus.log.file.enable=true")
java_args.append("-Dquarkus.log.file.path=/logs/dbz_logfile.log")
d.run(*java_args)
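When many JVM system properties are needed, the `-D` arguments can be generated from a dictionary instead of being appended one by one. This is a small sketch; `to_java_args` is a hypothetical helper for this example, not part of the `debezium` package.

```python
def to_java_args(properties):
    """Turn a {name: value} mapping into a list of -Dname=value arguments."""
    return [f"-D{name}={value}" for name, value in properties.items()]


java_args = to_java_args({
    "quarkus.log.file.enable": "true",
    "quarkus.log.file.path": "/logs/dbz_logfile.log",
})
print(java_args)
# ['-Dquarkus.log.file.enable=true', '-Dquarkus.log.file.path=/logs/dbz_logfile.log']
```

The resulting list can be passed on in the same way as the hand-built `java_args` list in the example above.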
from debezium import DebeziumRunAsyn
java_args = []
java_args.append("-Dquarkus.log.file.enable=true")
java_args.append("-Dquarkus.log.file.path=/logs/dbz_logfile.log")
d = DebeziumRunAsyn(debezium_dir="/dbz/server/dir", java_home='/java/home/dir', java_args=java_args)
d.run()
d.join()
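The run-then-join shape of the asynchronous example can be illustrated with a generic pattern: start a command in a background thread, keep working, and wait for it later. The sketch below is only an assumption about that general pattern, not how `DebeziumRunAsyn` is implemented. `BackgroundCommand` is a hypothetical class, and a short Python one-liner stands in for the server process.

```python
import subprocess
import sys
import threading


class BackgroundCommand:
    """Hypothetical helper: run one command in a background thread."""

    def __init__(self, command):
        self.command = command
        self.returncode = None
        self._thread = threading.Thread(target=self._execute)

    def _execute(self):
        # Runs in the background thread and blocks there until the command exits.
        self.returncode = subprocess.run(self.command).returncode

    def run(self):
        """Start the command without blocking the caller."""
        self._thread.start()

    def join(self):
        """Block until the command has finished."""
        self._thread.join()


# Stand-in for a long-running server process.
job = BackgroundCommand([sys.executable, "-c", "print('server stub finished')"])
job.run()
# ... other work can happen here while the command runs ...
job.join()
print(job.returncode)
```

Between `run()` and `join()` the calling code stays free to do other work, such as monitoring logs or starting further processes.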
The Memiiso community welcomes anyone who wants to help out in any way, whether that means reporting problems, helping with documentation, or contributing code changes to fix bugs, add tests, or implement new features. See the contributing document for details.