The fix for #28 (PR #30) did not take Standby leaders into account; the code in the ClusterHasLeader class is as follows:
    is_leader_found = False
    for member in item_dict["members"]:
        if (
            member["role"] in ("leader", "standby_leader")
            and member["state"] == "running"
        ):
            is_leader_found = True
            break
However, since Patroni v3.0.4, Standby leaders report the state streaming rather than running, so the cluster_has_leader check fails:
$ sudo patronictl -c /etc/patroni/13-production.yml list
+ Cluster: 13-production ----------------+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+-------------+----------------+-----------+----+-----------+
| db-pg01 | 10.111.11.1 | Replica | streaming | 8 | 0 |
| db-pg02 | 10.111.11.2 | Standby Leader | streaming | 8 | |
+---------+-------------+----------------+-----------+----+-----------+
$ sudo check_patroni -e https://127.0.0.1:8008 cluster_has_leader
CLUSTERHASLEADER CRITICAL - The cluster has no running leader. | has_leader=0;;@0
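One possible direction for a fix, sketched here as a standalone function rather than as the actual check_patroni patch: keep requiring running for a regular leader, but also accept streaming for a Standby leader. The names ACCEPTED_STATES and cluster_has_leader are hypothetical, and the set of accepted states is an assumption based only on the output above.

```python
# Hypothetical sketch, not the actual check_patroni code.
# Assumption: a Standby leader is healthy when its state is either
# "running" (before Patroni v3.0.4) or "streaming" (v3.0.4 and later).
ACCEPTED_STATES = {
    "leader": {"running"},
    "standby_leader": {"running", "streaming"},
}


def cluster_has_leader(members):
    """Return True if any member is a leader in an accepted state."""
    for member in members:
        # Roles other than leader/standby_leader accept no state at all.
        accepted = ACCEPTED_STATES.get(member.get("role"), set())
        if member.get("state") in accepted:
            return True
    return False


# Members mirroring the patronictl output above.
members = [
    {"name": "db-pg01", "role": "replica", "state": "streaming"},
    {"name": "db-pg02", "role": "standby_leader", "state": "streaming"},
]
print(cluster_has_leader(members))
```

With the members shown in the patronictl output, this version reports a leader, while a streaming replica on its own would still not count.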