Open mwarkentin opened 2 hours ago
I think the combination of --auto-offset-reset=earliest and --no-strict-offset-reset sometimes leads to situations where arroyo resets to an offset that has already expired, so the reset itself fails:
%4|1729638367.626|OFFSET|rdkafka#consumer-2| [thrd:main]: snuba-spans [62]: offset reset (at offset 70037080713 (leader epoch 4), broker 0) to offset BEGINNING (leader epoch -1): fetch failed due to requested offset not available on the broker: Broker: Offset out of range
@untitaker was that a one-time thing? Or does the consumer continually attempt to reset to offsets that are already out of bounds?
E.g., is this an issue only on very high-throughput topics, or on ones where the consumer takes a while to commit the first batch?
I think it's a race condition in arroyo that allows this to happen. I think we should probably support constructs like --auto-offset-reset=earliest+1h to self-serve what we already end up doing manually.
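To make the earliest+1h idea a bit more concrete, here is a minimal sketch of how such a value could be parsed. Everything in it is an assumption for discussion: the `<policy>+<number><unit>` syntax, the unit letters, and the `parse_offset_reset` helper are hypothetical and not part of arroyo or snuba today. The margin would presumably be read as "skip this far past the oldest retained message", and could then be resolved to a concrete offset with a timestamp-based lookup such as `Consumer.offsets_for_times` in confluent-kafka.

```python
import re
from datetime import timedelta

# Hypothetical syntax: "<policy>" or "<policy>+<number><unit>", e.g. "earliest+1h".
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PATTERN = re.compile(r"^(earliest|latest|error)(?:\+(\d+)([smhd]))?$")


def parse_offset_reset(value: str) -> tuple[str, timedelta]:
    """Split a reset value into (policy, safety margin).

    A plain policy such as "earliest" yields a zero margin.
    """
    match = _PATTERN.match(value)
    if match is None:
        raise ValueError(f"invalid offset reset value: {value!r}")
    policy, amount, unit = match.groups()
    if amount is None:
        return policy, timedelta(0)
    if policy != "earliest":
        # Assumption: a margin is only meaningful relative to the start of the log.
        raise ValueError("a margin is only supported together with 'earliest'")
    return policy, timedelta(**{_UNITS[unit]: int(amount)})


policy, margin = parse_offset_reset("earliest+1h")
print(policy, margin)
```

Keeping the margin as a `timedelta` leaves open how it gets applied, which is the part that still needs a decision.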
We should also reconsider (per our discussions) whether --no-strict-offset-reset even needs to exist anymore, now that we primarily use --auto-offset-reset=earliest for all of our consumers.
Steps to Reproduce
Not sure, just adding a placeholder for further information.
Expected Result
--no-strict-offset-reset would work and enable consumers to reset their own offsets when out of retention.
Actual Result
Not sure?