Closed mj0nez closed 1 month ago
I think I've found the reason. The `StreamProcessor` correctly calls the strategy, and therefore also our commit step, but right before `Healthcheck` and `Commit` there is a `Produce` step which only polls the next step if its queue is not empty. I think we should add a guard clause to the `poll` method, because otherwise the downstream steps are never polled while no new messages are coming in, which results in a deadlock. I discovered this because our container orchestrator constantly marked the pod as unhealthy.
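The failure mode described above can be illustrated with a toy model. This is only a minimal sketch, not Arroyo's actual code: `CommitStep`, `BuggyProduceStep` and `GuardedProduceStep` are hypothetical stand-ins, and the queue handling is an assumption made for illustration.

```python
from collections import deque


class CommitStep:
    """Hypothetical downstream step that counts how often it is polled."""

    def __init__(self) -> None:
        self.poll_count = 0

    def poll(self) -> None:
        # A real commit step would evaluate its commit policy here.
        self.poll_count += 1


class BuggyProduceStep:
    """Forwards poll() downstream only while its own queue has entries."""

    def __init__(self, next_step: CommitStep) -> None:
        self.queue: deque = deque()
        self.next_step = next_step

    def poll(self) -> None:
        while self.queue:
            self.queue.popleft()
            self.next_step.poll()


class GuardedProduceStep(BuggyProduceStep):
    """Same step, with a guard clause for the idle case."""

    def poll(self) -> None:
        if not self.queue:
            # Idle: still give the downstream steps a chance to run.
            self.next_step.poll()
            return
        super().poll()
```

With an empty queue, polling `BuggyProduceStep` never reaches `CommitStep`, so a time-based commit or health check further down cannot fire. The guarded variant forwards every idle poll.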
nailed it, thanks!
Environment
What version are you running?
Steps to Reproduce
Hi, our workload is batch-like: bursts with a high ingestion rate, followed by long periods with no incoming messages. I have noticed that the consumer group's offset lag never reaches 0 during those idle periods unless we force a flush with a graceful shutdown. Meanwhile, all messages have been consumed, because the output topic contains exactly the expected number of messages.
I tried to find the issue on my own but ran out of luck. The following example should reproduce the issue:
Expected Result
If `min_commit_frequency_sec` has passed between calls of `ProcessingStrategy.poll`, I would expect the consumer to commit its offsets.
Actual Result
Although no new messages were ingested and the processor was running (observed via `arroyo.consumer.run.count`), it held on to its offsets and did not commit them to the broker. It did, however, output these log lines:
The first log line is generated by a function wrapped in a `RunTask` that sits before `CommitOffsets` in the strategy, and it shows exactly the offset that is pending commit.
During debugging, I checked whether the `CommitOffsets` strategy correctly calls commit on `submit` and `poll`, which it does. I have also verified that if I add a `self.__commit({})` call to the `StreamProcessor`'s `run_once` method, as a new branch for when it does not get a message (here), the issue no longer occurs.
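The workaround can be sketched roughly as follows. This is a simplified model under stated assumptions, not Arroyo's implementation: `ThrottledCommitter` and the free function `run_once` are hypothetical, and only the name `min_commit_frequency_sec` and the idea of an empty commit call on the idle branch come from the report above.

```python
import time
from typing import Callable, Dict, Optional


class ThrottledCommitter:
    """Hypothetical helper: stages offsets and flushes them at most
    once every min_commit_frequency_sec seconds."""

    def __init__(
        self,
        min_commit_frequency_sec: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.min_commit_frequency_sec = min_commit_frequency_sec
        self.clock = clock
        self.pending: Dict[str, int] = {}
        self.committed: Dict[str, int] = {}
        self.last_commit = clock()

    def commit(self, offsets: Dict[str, int]) -> None:
        # Stage the new offsets; an empty dict only triggers the time check.
        self.pending.update(offsets)
        now = self.clock()
        if self.pending and now - self.last_commit >= self.min_commit_frequency_sec:
            self.committed.update(self.pending)
            self.pending.clear()
            self.last_commit = now


def run_once(message: Optional[Dict[str, int]], committer: ThrottledCommitter) -> None:
    if message is not None:
        committer.commit(message)
    else:
        # Idle branch: an empty commit lets staged offsets flush
        # once the minimum interval has elapsed.
        committer.commit({})
```

Without the idle branch, offsets staged just before a quiet period stay in `pending` until the next message arrives or the consumer shuts down, which matches the behaviour reported here.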