Closed jzakaryan closed 1 year ago
I remember us talking about putting logs around any IO operations that might be happening. Should we add some logs there as well?
- Throughput Provider Interactions.
- ZK Interactions if any.
The consensus at the end of the meeting was that the assignPartitions method is where the CPU time is spent. I avoided adding logs in other places to keep unnecessary logging to a minimum.
We observed that the leader has a tendency to get stuck on this method (in some rare cases for more than 5 minutes). Added log messages to help debug the issue.
Note that the test suite in TestLoadBasedPartitionAssigner does not capture such performance problems. Attempts to reproduce them locally didn't work.
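For readers who want a picture of what timing logs around a slow step can look like, here is a minimal sketch. It is not the code from this change: the `TimedStep` class, the `timed` method, and the threshold parameter are hypothetical names chosen for illustration, and it uses `java.util.logging` only to stay self-contained. It logs once before the step starts, so a stuck call still leaves a timestamp behind, and once after it returns with the elapsed time.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not part of the actual change.
final class TimedStep {
    private static final Logger LOG = Logger.getLogger(TimedStep.class.getName());

    private TimedStep() {
    }

    // Runs the step, logging when it starts and how long it took.
    // Steps that run for at least warnThresholdMs are logged at WARNING level.
    static <T> T timed(String stepName, long warnThresholdMs, Supplier<T> step) {
        // Logged before the call so that a hang is still visible in the logs.
        LOG.info("Starting " + stepName);
        long startNanos = System.nanoTime();
        try {
            return step.get();
        } finally {
            long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000;
            if (elapsedMs >= warnThresholdMs) {
                LOG.warning(stepName + " took " + elapsedMs
                        + " ms (threshold " + warnThresholdMs + " ms)");
            } else {
                LOG.info(stepName + " finished in " + elapsedMs + " ms");
            }
        }
    }
}
```

Under these assumptions, the call to assignPartitions would be wrapped in the supplier passed to `timed`, with a threshold suited to the environment. Keeping the timing in one helper also keeps the extra logging confined to the one method suspected of being slow.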