Open abialas opened 8 months ago
I am seeing similar behavior in 1.3.23. It seems to happen intermittently. I think this happens after a rebalance?
It looks like the partitions have been paused and never resumed. I added an endpoint that lets me see the paused partitions. When consumption stops, I can see that the partitions are paused. If I use another endpoint to force them to resume, consumption starts again.
@abialas, can you reproduce this consistently, or is it an intermittent problem like the one I am seeing?
I don't see how the partitions paused on this line are ever resumed: https://github.com/reactor/reactor-kafka/blob/2ae3abbc7a876008585eef4972d4fd4af30e2263/src/main/java/reactor/kafka/receiver/internals/ConsumerEventLoop.java#L248
The only place I see a resume is for the partitions in pausedByUs, and these partitions aren't added to that collection.
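To make the suspicion above concrete, here is a toy model of the bookkeeping problem. It is only a sketch and not reactor-kafka's actual code: the class, the method names, and the use of plain strings for partitions are all hypothetical, and only the name pausedByUs comes from the library source linked above. It assumes that resume only touches partitions recorded in pausedByUs, which is what I read from the code.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the event loop's pause bookkeeping.
class PausedTrackingSketch {
    // What the underlying consumer considers paused.
    final Set<String> pausedOnConsumer = new HashSet<>();
    // What we remember having paused ourselves (mirrors the name pausedByUs).
    final Set<String> pausedByUs = new HashSet<>();

    // Pause and remember the partition, so a later resume can find it.
    void pauseTracked(String partition) {
        pausedOnConsumer.add(partition);
        pausedByUs.add(partition);
    }

    // Pause without recording it (assumed to be what the linked line does).
    void pauseUntracked(String partition) {
        pausedOnConsumer.add(partition);
    }

    // Resume only the partitions we remember pausing.
    void resumeTracked() {
        pausedOnConsumer.removeAll(pausedByUs);
        pausedByUs.clear();
    }

    public static void main(String[] args) {
        PausedTrackingSketch loop = new PausedTrackingSketch();
        loop.pauseTracked("topic-0");
        loop.pauseUntracked("topic-1");
        loop.resumeTracked();
        // Any partition still listed here stays paused until someone resumes it by hand.
        System.out.println("still paused: " + loop.pausedOnConsumer);
    }
}
```

If the real code behaves like this model, a partition paused outside the tracked path would stay paused until something resumes it explicitly, which matches what I see when I force a resume through my endpoint.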
I have a simple but slow consumer which consumes one record at a time:
The processing time of the handleReceivedRecord method is less than 500 ms. I understand that this consumer is slow and needs to be fixed (because of the lack of concurrency). However, in my test I produce only about 3000 records in 1 minute to the topic this consumer reads from. Initially it consumes fine, but after some time the consumer stops consuming. There is no error log or anything similar. In the logs I see messages like these:
and I have to restart the consumer instance to fix this. It is also worth mentioning that when I disable scaling up of consumers, it works fine.
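For a sense of how far behind this consumer can fall, here is a rough back-of-the-envelope calculation. It is a sketch under stated assumptions: a single consumer instance, strictly sequential processing, and the 500 ms upper bound taken as the per-record time. The class and method names are made up for illustration.

```java
// Hypothetical helper: estimates how many records are still unprocessed
// after a time window, for one sequential consumer.
class BacklogEstimate {
    // produced: records written to the topic during the window
    // windowMillis: length of the window in milliseconds
    // perRecordMillis: assumed processing time per record in milliseconds
    static long backlogAfter(long produced, long windowMillis, long perRecordMillis) {
        long processed = windowMillis / perRecordMillis; // one record at a time
        return Math.max(0, produced - processed);
    }

    public static void main(String[] args) {
        // About 3000 records in 1 minute, up to 500 ms per record.
        System.out.println(backlogAfter(3000, 60_000, 500));
    }
}
```

With these worst-case numbers the consumer handles about 2 records per second while roughly 50 per second arrive, so most of the records are still waiting when the minute ends. Falling behind like this is expected; consumption stopping completely is not.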
Expected Behavior
Consuming records from topic should not stop.
Actual Behavior
Consuming records from topic is stuck and restart is required.
Your Environment
Java version (java -version): 21.0.2