What Does This PR do?
This PR builds on the changes introduced in #36 and adds a health checker that periodically polls Prometheus's health endpoint.
It also introduces a timeout for every HTTP call being made (notifier and readiness).
Closes #35
How to Test This PR?
It turns out that the quit endpoint lets us simulate a container failure, yay.
make run
# port-forward to one of the elector pods.
kubectl port-forward prometheus-elector-dev-1 9090:9090
# politely ask prometheus to stop.
curl -XPUT localhost:9090/-/quit
# monitor the logs of this pod
The expected behavior is the following:
The Prometheus container stops and starts failing health checks.
prometheus-elector picks up that failure after ~15s and leaves the election. (If Prometheus restarts too quickly, don't be afraid to insist on the quit endpoint 😁)
When the Prometheus container comes back up, it passes health checks again; after a while, prometheus-elector detects this and rejoins the election.
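The leave/rejoin behavior described above can be pictured as a small state machine over consecutive probe results. The sketch below is an assumption about how such logic might look, not the PR's actual implementation: `healthTracker`, its thresholds, and the `"leave"`/`"join"` return values are hypothetical. For instance, three consecutive failures at an assumed 5s polling period would roughly match the ~15s detection delay.

```go
package main

// healthTracker (hypothetical) turns a stream of probe results into
// election decisions based on consecutive failures and successes.
type healthTracker struct {
	failureThreshold int  // consecutive failures before leaving the election
	successThreshold int  // consecutive successes before rejoining
	failures         int  // current run of failed probes
	successes        int  // current run of successful probes
	healthy          bool // whether we currently take part in the election
}

// newHealthTracker starts in the healthy state, i.e. taking part in the election.
func newHealthTracker(failureThreshold, successThreshold int) *healthTracker {
	return &healthTracker{
		failureThreshold: failureThreshold,
		successThreshold: successThreshold,
		healthy:          true,
	}
}

// observe records one probe result and returns "leave" when the failure
// threshold is reached, "join" when the success threshold is reached after
// a failure period, and "" when no action is needed.
func (t *healthTracker) observe(ok bool) string {
	if !ok {
		t.successes = 0
		t.failures++
		if t.healthy && t.failures >= t.failureThreshold {
			t.healthy = false
			return "leave"
		}
		return ""
	}
	t.failures = 0
	t.successes++
	if !t.healthy && t.successes >= t.successThreshold {
		t.healthy = true
		return "join"
	}
	return ""
}
```

Requiring several consecutive results before acting avoids flapping in and out of the election on a single slow or dropped probe, which is also why a Prometheus that restarts very quickly may never trigger the "leave" transition.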
Additional notes
Reviewing commit by commit is encouraged!