databricks / iceberg-kafka-connect

Apache License 2.0

How can I control read / write pressure on sink? #218

Closed almir-magazord closed 8 months ago

almir-magazord commented 8 months ago

Hi!

I have a large Kafka topic, and this sink connector is consuming a lot of CPU and RAM because it tries to work through the initial snapshot as fast as possible.

What can I change in the configs to make the sink process less resource-intensive?

Thanks!

tabmatfournier commented 8 months ago

This has nothing to do with the connector. Kafka Connect does not have any throttling mechanisms; you are throttled by I/O and CPU only.

Things you can do to slow it down: reduce the number of tasks and run on a smaller machine. Less CPU will slow it down, but be careful about RAM (you will OOM if you reduce it too much).
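As a rough illustration of the "reduce the number of tasks" option, here is a minimal Python sketch. It assumes the connector is managed through the Kafka Connect REST API and that its existing config is already available as a dict. `tasks.max` is the standard Kafka Connect setting for the maximum number of tasks; the helper names, the worker URL, and the connector name `iceberg-sink` are hypothetical and only for illustration.

```python
import json
import urllib.request


def with_fewer_tasks(config: dict, max_tasks: int) -> dict:
    """Return a copy of a connector config with tasks.max capped at max_tasks."""
    if max_tasks < 1:
        raise ValueError("max_tasks must be at least 1")
    # Kafka Connect defaults tasks.max to 1 when it is not set.
    current = int(config.get("tasks.max", 1))
    updated = dict(config)  # leave the caller's dict untouched
    updated["tasks.max"] = str(min(current, max_tasks))
    return updated


def build_update_request(connect_url: str, name: str, config: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT /connectors/{name}/config request."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/connectors/{name}/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical existing config; only tasks.max is changed.
    existing = {"tasks.max": "8", "topics": "events"}
    reduced = with_fewer_tasks(existing, 2)
    request = build_update_request("http://localhost:8083", "iceberg-sink", reduced)
    print(request.get_method(), request.full_url)
    print(json.dumps(reduced, indent=2))
    # To apply the change, send it with urllib.request.urlopen(request).
```

The sketch only builds the request, so the change can be reviewed before it is sent. Fewer tasks means less parallel consumption, so the backlog will take longer to clear.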

Typically, though, you don't want to. The best path forward is to get through your backlog of data and reach a steady state with the producer.