kafka4beam / brod

Apache Kafka client library for Erlang/Elixir
Apache License 2.0
658 stars, 198 forks

How to disable auto.commit #540

Open yordis opened 1 year ago

yordis commented 1 year ago

I have been trying to figure out how to disable auto.commit. I searched the docs and the code, but I am unfamiliar with the subject and could not work it out.

I would appreciate it if the documentation covered this setup, since disabling auto.commit is important for us to ensure we do not lose messages.

Thanks in advance.

zmstone commented 1 year ago

Hi @yordis, do you mean you want to control when to commit offset?

If you use the v2 group subscriber, you can make the handle_message callback return ack instead of commit: https://github.com/kafka4beam/brod/blob/master/src/brod_group_subscriber_worker.erl#L79 Then call the async commit API when you are ready to commit: https://github.com/kafka4beam/brod/blob/master/src/brod_group_subscriber_v2.erl#L191-L194
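The ack-now, commit-later pattern described above can be pictured with a small language-neutral model. The Python sketch below is not brod code: `ToyPartition`, `handle_message`, and `commit` are hypothetical names chosen for illustration, and the only behaviour assumed is that a restarted consumer resumes right after the last committed offset.

```python
class ToyPartition:
    """In-memory stand-in for one partition plus its committed offset."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = -1  # offset of the last committed message; -1 means none

    def fetch_from_committed(self):
        """Return the (offset, value) pairs a restarted consumer would see."""
        start = self.committed + 1
        return list(enumerate(self.messages))[start:]

    def commit(self, offset):
        """Commit explicitly; never move the committed offset backwards."""
        self.committed = max(self.committed, offset)


def handle_message(offset, value, in_flight):
    """Hypothetical callback that only acknowledges: it commits nothing."""
    in_flight[offset] = value
    return "ack"


partition = ToyPartition(["a", "b", "c"])
in_flight = {}
for offset, value in partition.fetch_from_committed():
    handle_message(offset, value, in_flight)

# Downstream work for offsets 0 and 1 has finished, so commit up to 1 only.
partition.commit(1)

# Simulated restart: only the uncommitted message is delivered again.
redelivered = partition.fetch_from_committed()
print(redelivered)  # [(2, 'c')]
```

In this model the application, not the callback, decides when the committed offset moves, which is the control the suggestion above is aiming for.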

yordis commented 1 year ago

> do you mean you want to control when to commit offset?

Something along those lines.

From https://docs.confluent.io/platform/current/clients/consumer.html#offset-management

> By default, the consumer is configured to auto-commit offsets. Using auto-commit gives you “at least once” delivery: Kafka guarantees that no messages will be missed, but duplicates are possible. Auto-commit basically works as a cron with a period set through the auto.commit.interval.ms configuration property.

That may cause data loss if processing fails, so I am trying to figure out how to disable auto.commit altogether.
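The concern can be made concrete with a toy model. The Python sketch below is an assumption-laden illustration, not the behaviour of the Java client or of brod: it assumes a commit can land after a message is fetched but before its processing finishes, and that the process crashes in between. `run_until_crash` and `resume` are hypothetical helpers.

```python
def run_until_crash(messages, committed, commit_before_processing, crash_at):
    """Consume from `committed + 1`; crash while processing offset `crash_at`.

    Returns (processed_values, committed_offset_at_crash).
    """
    processed = []
    for offset in range(committed + 1, len(messages)):
        if commit_before_processing:
            committed = offset  # assumed: commit fires before the work is done
        if offset == crash_at:
            return processed, committed  # crash: this message was never processed
        processed.append(messages[offset])
        if not commit_before_processing:
            committed = offset  # commit only after the work is done
    return processed, committed


def resume(messages, committed):
    """Values a restarted consumer processes, starting after `committed`."""
    return messages[committed + 1:]


messages = ["m0", "m1", "m2"]

# Case 1: offset committed before processing completes, crash on m1.
done, committed = run_until_crash(messages, -1, True, crash_at=1)
early = done + resume(messages, committed)

# Case 2: offset committed only after processing, same crash.
done, committed = run_until_crash(messages, -1, False, crash_at=1)
late = done + resume(messages, committed)

print(early)  # ['m0', 'm2']
print(late)   # ['m0', 'm1', 'm2']
```

Under these assumptions the early commit skips `m1` after the restart, while committing after processing replays it.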

zmstone commented 1 year ago

> > do you mean you want to control when to commit offset?
>
> Something along those lines.
>
> From https://docs.confluent.io/platform/current/clients/consumer.html#offset-management
>
> > By default, the consumer is configured to auto-commit offsets. Using auto-commit gives you “at least once” delivery: Kafka guarantees that no messages will be missed, but duplicates are possible. Auto-commit basically works as a cron with a period set through the auto.commit.interval.ms configuration property.
>
> That may cause data loss if processing fails, so I am trying to figure out how to disable auto.commit altogether.

That looks like Java client behavior, and I am not entirely sure whether it can actually cause data loss. brod never does this.