bbalser / elsa

Random "begin_offset=undefined" #90

Open alexcastano opened 4 years ago

alexcastano commented 4 years ago

Hello!

We have a project using Elsa 0.12.3. We have more than 50 topics with 10 partitions each, and we use the consumer group functionality across multiple servers. When a consumer group starts, either because of a new deployment or because of connectivity issues, we randomly receive a message similar to this:

Group member (my_consumer_prod,coor=#PID<0.5097.0>,cb=#PID<0.5096.0>,generation=975): 
assignments received: my-topic: partition=0 begin_offset=16509 partition=4 begin_offset=undefined 
partition=8 begin_offset=55996

The important part is that the offset for partition 4 is undefined (other topics and partitions are fine), so consumption starts from the very beginning because of our config:

      group: consumer_group_name,
      topics: [topic],
      handler: Consumer,
      handler_init_args: consumer_init_args,
      direct_ack: false,
      config: [
        begin_offset: :earliest,
        offset_reset_policy: :reset_to_earliest
      ]

However, a few minutes earlier, an identical log entry showed the offset set correctly, and no messages were consumed between the two log entries. It looks as if the committed offset was deleted. Our consumer always returns {:ack, state}, and no other process uses the consumer group.
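To compare assignment log entries over time, a small filter can list the partitions that were assigned without a committed offset. This is only a sketch: it assumes the log format shown above, and `find_undefined_offsets` is a hypothetical helper name, not part of Elsa or brod.

```shell
# Hypothetical helper: print every "partition=N begin_offset=undefined" pair
# found in the given log files, or in stdin when no file is passed.
find_undefined_offsets() {
  # -o prints only the matching part, -E enables extended regular expressions
  grep -oE 'partition=[0-9]+ begin_offset=undefined' "$@"
}
```

Piping the log excerpt above through `find_undefined_offsets` prints `partition=4 begin_offset=undefined`. Note that it reports only the partition, not the topic name that precedes it in the log entry.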

I have run out of ideas for investigating this issue. Do you have any idea what could be happening? Have you experienced similar issues in the past?

jdenen commented 4 years ago

Hi @alexcastano 👋

Have you looked at the offset for that partition/consumer group in another tool? Kafka ships a bash CLI tool that can check it, for example. I'm interested to know what it shows the offset to be.
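A minimal sketch of that check with the stock Kafka CLI follows. The broker address is a placeholder, the group name is taken from the log above, and the `KAFKA_CONSUMER_GROUPS` override is an assumption added here because some distributions install the script without the `.sh` suffix.

```shell
# Assumed values: adjust the broker address and group name for your cluster.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"
GROUP="${GROUP:-my_consumer_prod}"

describe_group() {
  # --describe lists, per partition, the committed offset (CURRENT-OFFSET),
  # the log end offset, and the lag for the given consumer group.
  "${KAFKA_CONSUMER_GROUPS:-kafka-consumer-groups.sh}" \
    --bootstrap-server "$BOOTSTRAP" --describe --group "$GROUP"
}
```

Running `describe_group` right after a restart and comparing the row for partition 4 with the others should show whether the broker still holds a committed offset. As far as I know, a partition without one is shown with `-` in the CURRENT-OFFSET column, though the exact layout varies between Kafka versions.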

alexcastano commented 4 years ago

Hello!

Of course! Here are a couple of screenshots from KafkaManager:

image

image

After a new k8s deployment this happens:

image

jdenen commented 4 years ago

I'm not sure what's going on here. @jeffgrunewald and @bbalser: any ideas?