caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0

reverse_proxy/upstreams healthy all false since upgrade from 2.4.6 to 2.5.1 #4792

Open axi92 opened 2 years ago

axi92 commented 2 years ago

Since I upgraded from 2.4.6 to 2.5.1, without changing anything in my config, my monitoring that checks the health of every upstream fails.

curl "http://localhost:2019/reverse_proxy/upstreams" | jq -c 'map(select(.healthy == false))' | jq length

This should be 0. I confirmed this by downgrading back to 2.4.6, where it is fine again.

francislavoie commented 2 years ago

With the changes to implement dynamic upstreams, unfortunately the admin endpoint for upstream health is now broken. This is a known regression.

The issue is that we're no longer using global state for upstream health, so it's kinda fundamentally incompatible now.

axi92 commented 2 years ago

Is there a recommended way to healthcheck a caddy instance?

francislavoie commented 2 years ago

If you care about checking reverse_proxy health, then I don't have an answer for you right now.

If you just need to know "is Caddy running", then you can make a site block like this:

:9090 {
    respond "OK" 200
}

And if you can't reach that endpoint then Caddy isn't running, I guess.
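A monitoring script could probe that site block along these lines. This is only a minimal sketch: the function name `caddy_is_up` is made up for illustration, and the URL assumes the `:9090` site above is reachable on localhost. Proxies are bypassed on purpose, since a local liveness check should not be routed through one.

```python
import urllib.error
import urllib.request


def caddy_is_up(url="http://localhost:9090/", timeout=2.0):
    """Return True if the liveness site answers with HTTP 200, else False."""
    # An empty ProxyHandler ignores any http_proxy/https_proxy environment settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx status.
        return False
```

Note that this only tells you the Caddy process is serving requests; it says nothing about the health of any reverse_proxy upstream.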

mholt commented 2 years ago

You could check .fails instead of .healthy -- but it's up to you to determine if that number is too high to be "healthy". That was the problem before v2.5: different proxy handlers define healthy to be different things, so it's no longer a global assessment.
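As a rough illustration of checking .fails against a threshold you pick yourself, something like the following could replace the jq check. It is a sketch under assumptions: the function names and the `max_fails` threshold are hypothetical, and the `address` field name in the usage note is assumed rather than taken from this thread. Only `fails` and `healthy` are mentioned above.

```python
import json
import urllib.request

ADMIN_URL = "http://localhost:2019/reverse_proxy/upstreams"


def over_threshold(upstreams, max_fails=3):
    """Return the upstream entries whose 'fails' count exceeds max_fails.

    max_fails is an arbitrary example threshold; choose whatever counts as
    "too high" for your setup.
    """
    return [u for u in upstreams if u.get("fails", 0) > max_fails]


def fetch_upstreams(url=ADMIN_URL, timeout=2.0):
    """Fetch and decode the JSON list from the admin endpoint."""
    # Bypass proxies: the admin endpoint is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=timeout) as resp:
        return json.load(resp)
```

A monitoring job would then call `over_threshold(fetch_upstreams())` and alert if the returned list is non-empty, for example printing each entry's `address` (assumed field name) and `fails` count.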

francislavoie commented 2 years ago

What Matt means is that if you configured passive health checks, then that should still work, but if you're using active health checks, that's still a problem right now.

I'm going to re-open this because we need to find a solution, ultimately.

mholt commented 2 years ago

I'm not really sure what a proper solution would look like. We'd probably need a way to inspect each of the proxy handlers and access their assessment of each host, then report that in the API response.