[admin-tool][server] Add admin-tool command to dump heartbeat status from a server; scan and log stale HB replicas
The previous PR #1260 was too complex; this PR focuses only on heartbeat (HB) usability improvements.
This PR adds two features related to heartbeat:
1. Add a heartbeat scan thread that runs periodically and logs lagging resources. It runs every minute by default, which should be infrequent enough to avoid spamming the logs. The log output can be picked up by other log collection systems, making it easy to detect which replica is lagging, on which host, and by how much.
2. Add a command to dump heartbeat status from a host. It has three optional filters: a topic filter, a partition filter, and a lag filter. You can restrict the output to a specific topic or topic-partition, or to resources that are lagging. This serves as a manual helper for cases where the periodic scan in (1) might miss something.
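The three optional filters can be pictured with the minimal sketch below. It is an illustration only, not the code in this PR: `HeartbeatFilterSketch`, `ReplicaHeartbeat`, `filter`, and the 10-minute lag threshold are all hypothetical names and values chosen for the example. The periodic scan can be thought of as the same filtering with only the lag filter enabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; class, method, and field names are assumptions, not the PR's API.
final class HeartbeatFilterSketch {
  /** Hypothetical per-replica heartbeat entry. */
  static final class ReplicaHeartbeat {
    final String topic;
    final int partition;
    final long lastHeartbeatMs;

    ReplicaHeartbeat(String topic, int partition, long lastHeartbeatMs) {
      this.topic = topic;
      this.partition = partition;
      this.lastHeartbeatMs = lastHeartbeatMs;
    }

    /** Lag is the time elapsed since the last heartbeat was observed. */
    long lagMs(long nowMs) {
      return nowMs - lastHeartbeatMs;
    }
  }

  /** Applies the optional topic, partition, and lag filters; an absent filter matches everything. */
  static List<ReplicaHeartbeat> filter(
      List<ReplicaHeartbeat> all,
      Optional<String> topicFilter,
      Optional<Integer> partitionFilter,
      boolean laggingOnly,
      long lagThresholdMs,
      long nowMs) {
    List<ReplicaHeartbeat> result = new ArrayList<>();
    for (ReplicaHeartbeat hb : all) {
      if (topicFilter.isPresent() && !topicFilter.get().equals(hb.topic)) {
        continue; // topic filter set and does not match
      }
      if (partitionFilter.isPresent() && partitionFilter.get().intValue() != hb.partition) {
        continue; // partition filter set and does not match
      }
      if (laggingOnly && hb.lagMs(nowMs) <= lagThresholdMs) {
        continue; // lag filter set and this replica is within the threshold
      }
      result.add(hb);
    }
    return result;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    long thresholdMs = 10 * 60 * 1000L; // assumed threshold, for illustration only
    List<ReplicaHeartbeat> all = Arrays.asList(
        new ReplicaHeartbeat("store_a_v1", 0, now - 1_000L),
        new ReplicaHeartbeat("store_a_v1", 1, now - 700_000L),
        new ReplicaHeartbeat("store_b_v2", 0, now - 900_000L));

    // Equivalent of the periodic scan: no topic or partition filter, lagging replicas only.
    for (ReplicaHeartbeat hb : filter(all, Optional.empty(), Optional.empty(), true, thresholdMs, now)) {
      System.out.println(hb.topic + "-" + hb.partition + " lagMs=" + hb.lagMs(now));
    }
  }
}
```

Treating every filter as optional keeps the manual dump and the periodic scan on one code path in this sketch; whether the PR shares logic this way is not stated in the description.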
How was this PR tested?
Added a new integration test.
Does this PR introduce any user-facing changes?
[x] No. You can skip the rest of this section.
[ ] Yes. Make sure to explain your proposed changes and call out the behavior change.