Is your feature request related to a problem? Please describe.
We're currently hitting an odd bug in Sentinel that drives CPU usage to 100% and causes HAProxy health checks to fail intermittently. As a workaround we restart the Sentinel container manually, but it would be great if custom liveness and readiness checks were possible. At the moment we manually modify the health-configmap to update the script, but that change is overwritten on updates, which makes it unsuitable long term.
Describe the solution you'd like
My proposal would be to either:
allow the commands in the readiness and liveness sections to point to different scripts, or
allow those sections to be redefined altogether.
The first option is quite simple, since we can now mount custom volumes that could store the alternative scripts, and it shouldn't require too many changes. The second option is much more flexible but may be overkill.
Describe alternatives you've considered
Modifying the liveness script by hand in the config map.
Additional context
For anybody experiencing the issue, below is the code of the check we use for our 6.2-alpine based deployment (it relies on the top output format):
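  sentinel_liveness.sh: |
    # Fail if Sentinel does not answer PING.
    response=$(
      redis-cli \
        -a "${SENTINELAUTH}" --no-auth-warning \
        -h localhost \
        -p 26379 \
        ping
    )
    if [ "$response" != "PONG" ]; then
      echo "$response"
      exit 1
    fi
    echo "response=$response"
    # Fail if redis-sentinel uses more than 90% CPU.
    # Field 9 of the squeezed top line is expected to be the %CPU column.
    cpu_usage=$(top -b -n 1 | grep -v 'grep' | grep 'redis-sentinel' | head -n 1 | tr -s ' ' | cut -d ' ' -f 9 | cut -d '%' -f 1)
    # Default to 0 so an empty result does not break the integer test.
    if [ "${cpu_usage:-0}" -gt 90 ]; then
      echo "CPU usage is ${cpu_usage}"
      exit 2
    fi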
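Since the check above depends on the column layout of top, it may help to isolate the parsing and the threshold comparison so they can be tried against a sample line before going into the probe. This is only a rough sketch, assuming BusyBox-style top output as in our Alpine image; the function names `sentinel_cpu` and `cpu_over_limit` and the sample line are made up for illustration and are not part of the chart:

```shell
# Hypothetical helper: read `top -b -n 1` style output on stdin and print the
# %CPU value of the first redis-sentinel line (assumes field 9 is %CPU).
sentinel_cpu() {
  grep -v 'grep' | grep 'redis-sentinel' | head -n 1 \
    | tr -s ' ' | cut -d ' ' -f 9 | cut -d '%' -f 1
}

# Hypothetical helper: succeed only when $1 is an integer greater than the
# limit in $2 (default 90). Empty or non-integer input counts as "not over".
cpu_over_limit() {
  usage="$1"
  limit="${2:-90}"
  case "$usage" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$usage" -gt "$limit" ]
}

# Made-up sample line in the assumed format; the leading spaces matter
# because they shift the field index used by cut.
sample='    1     0 redis    S     7124   0%   1  97% redis-sentinel *:26379 [sentinel]'
cpu=$(printf '%s\n' "$sample" | sentinel_cpu)
echo "cpu=$cpu"
if cpu_over_limit "$cpu" 90; then
  echo "would fail liveness"
fi
```

In the real probe the input would come from `top -b -n 1 | sentinel_cpu` instead of the sample line. Treating an unparsable value as "not over the limit" is a design choice that avoids restarting Sentinel just because the top format changed; the opposite choice may suit other setups.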
@DandyDeveloper would you be interested in merging a PR with one of the proposed changes?