Closed log1-c closed 1 year ago
Possible fix:
Change line 37 of `check_patroni/cluster.py` to:

    yield nagiosplugin.Metric("state_running", status_counters["running"] + status_counters.get("streaming", 0))

That way it also stays compatible with previous Patroni versions, which never report `streaming`.
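To illustrate why the `.get("streaming", 0)` fallback keeps older clusters working, here is a minimal standalone sketch. It does not use the real check_patroni code: the `count_running` helper and the shape of the member dicts are assumptions made up for this example.

```python
from collections import Counter


def count_running(members):
    """Count members treated as running.

    Hypothetical helper: `members` is assumed to be a list of dicts
    with a "state" key, which is not necessarily how check_patroni
    stores them.
    """
    status_counters = Counter(m["state"] for m in members)
    # Older Patroni versions never report "streaming", so fall back to 0.
    return status_counters["running"] + status_counters.get("streaming", 0)


# Before the state string change: every healthy node reports "running".
old_cluster = [{"state": "running"}, {"state": "running"}]
# After the change: the replica reports "streaming" instead.
new_cluster = [{"state": "running"}, {"state": "streaming"}]

print(count_running(old_cluster), count_running(new_cluster))
```

Both clusters give the same count, so existing thresholds on the running count would keep behaving as before.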
Hi, sorry for the long delay, I was on holiday. I will look into this ASAP.
Hello,
I made some additional changes beyond my initial plan. While the check you originally suggested was satisfactory, the performance data it provided turned out to be misleading:
CLUSTERNODECOUNT OK - members is 2 | members=2 role_leader=1 role_replica=1 state_running=2 state_streaming=1
To address this, I introduced a new `healthy_members` performance data value, combining the `running` and `streaming` node statuses. This adjustment ensures that we continue monitoring the same states as before and maintain accurate checks.
The revised output of the check is as follows:
CLUSTERNODECOUNT OK - members is 2 | healthy_members=2 members=2 role_leader=1 role_replica=1 state_running=1 state_streaming=1
The key modifications are:
* The existing `--running-[warning|critical]` option has been renamed
to `--healthy-[warning|critical]`.
* Introduction of the `healthy_members` perfdata, which serves as the
reference point for the aforementioned options.
* Updates to documentation, help messages, and tests.
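As a rough sketch of how the new perfdata could be derived (this is not the actual check_patroni implementation; the `perfdata` function, the `HEALTHY_STATES` set and the member dict shape are all hypothetical names chosen for illustration):

```python
from collections import Counter

# Assumption: these are the two states counted as healthy.
HEALTHY_STATES = {"running", "streaming"}


def perfdata(members):
    """Build a perfdata string from a list of {"role": ..., "state": ...} dicts."""
    metrics = {
        "members": len(members),
        "healthy_members": sum(1 for m in members if m["state"] in HEALTHY_STATES),
    }
    # One counter per role and per state, reported separately.
    for role, count in Counter(m["role"] for m in members).items():
        metrics[f"role_{role}"] = count
    for state, count in Counter(m["state"] for m in members).items():
        metrics[f"state_{state}"] = count
    # Sort by metric name for a stable output order.
    return " ".join(f"{name}={value}" for name, value in sorted(metrics.items()))


cluster = [
    {"role": "leader", "state": "running"},
    {"role": "replica", "state": "streaming"},
]
print(perfdata(cluster))
```

Keeping `state_running` and `state_streaming` separate while alerting on `healthy_members` avoids counting the replica twice, which is what made the earlier output misleading.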
I plan to commit these changes soon and will be addressing a few other issues throughout the week. Hopefully, I'll be able to finalize a new release by the end of the week.
Thank you.
With the latest update to v3.0.4, Patroni changed its state string for replica nodes: https://patroni.readthedocs.io/en/latest/releases.html#version-3-0-4
Sadly, I couldn't quite figure out exactly how the counting is done, or I would have included a PR.
Cheers and a big thanks for the check!