Open prakkashm opened 6 months ago
Thanks for the report! This is actually the intended behavior, but could definitely be documented better.
'source.offset' = 'group'
means that we should start from an existing consumer group (specified by source.group_id
), rather than initial or latest. If there's no existing consumer group than I don't think there's a sensible behavior here, so we fail.
This feature exists for cases where you have an existing consumer group (for example, from an earlier arroyo job) and you want to pick up where that job left off. If you want to create a new consumer group, choose either 'earliest' or 'latest'.
Sure. As an end user, I would then expect an error on the UI about the non-existing consumer group.
I'm personally of the opinion, it should behave like earliest
when the consumer group is non-existent. This helps in avoiding the additional manual overhead (of changing the earliest
to group
) that a user must do once if they need to restart an existing job to continue from where it stopped the first time.
For a kafka source, when I specify
source.offset
asgroup
with a non-existingsource.group_id
while starting a job for the first time, it fails during runtime without any errors. Possibly, the job is unable to create the new kafka consumer group.