caddyserver / caddy

Fast and extensible multi-platform HTTP/1-2-3 web server with automatic HTTPS
https://caddyserver.com
Apache License 2.0

Prometheus Metrics for reverse_proxy upstreams #4140

Open genofire opened 3 years ago

genofire commented 3 years ago

It would be nice to know which upstream gives which latency, status code, or health status.

It might also be nice to be able to enable this per path (related issue #4016).

rugk commented 3 years ago

Just duplicated this issue in https://github.com/caddyserver/caddy/issues/4309 (keep whichever issue you want open :upside_down_face:)

uniuuu commented 2 years ago

Hi @genofire, please check this dashboard: https://grafana.com/grafana/dashboards/14280

With the latest Caddy and Prometheus set up to scrape /metrics, it already collects reverse_proxy requests, errors, and request/response times, although no status codes.

francislavoie commented 2 years ago

@uniuuu I think that's just showing how many requests reached the reverse_proxy handler. The question was about metrics for which upstreams were hit, i.e. when you have more than one upstream configured for your reverse_proxy.

lfkdev commented 2 years ago

This is still a much-needed feature. Seeing which upstreams are used is basic information for a load balancer and should be included in the metrics.

mholt commented 2 years ago

Might only be partially resolved by #4935, so reopening

farhan-ct commented 1 year ago

Hi @uniuuu

Could you please explain how you did the /metrics setup? Simply turning on the metrics API did not give me any middleware metrics, only the admin metrics.

Please check https://github.com/caddyserver/caddy/issues/4309#issuecomment-1353007832

I would appreciate your help with this. Thank you!

lfkdev commented 1 year ago

@farhan-ct you can enable /metrics in the Caddy config, and you'll see metrics such as which upstreams are online and whether they're in a healthy state. Example:

[..]
caddy_reverse_proxy_upstreams_healthy{upstream="123.123.123.123:1337"} 1
caddy_reverse_proxy_upstreams_healthy{upstream="123.123.124.123:1338"} 0
caddy_reverse_proxy_upstreams_healthy{upstream="123.123.125.123:1339"} 1
[..]

Make sure you use a recent Caddy version, since that feature was only introduced a little while ago. https://caddyserver.com/docs/metrics
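For anyone who wants to act on this output in a script, here is a minimal Python sketch that pulls the unhealthy upstreams out of the exposition text. It assumes the gauge has exactly the form shown in the example above (a single `upstream` label, value 1 for healthy and 0 for unhealthy); the function name `unhealthy_upstreams` is hypothetical, and the sample text is just the three lines from the example.

```python
import re

# Matches one sample line of the gauge from the example above, e.g.
# caddy_reverse_proxy_upstreams_healthy{upstream="host:port"} 1
# Assumption: the metric carries only the "upstream" label.
SAMPLE = re.compile(
    r'^caddy_reverse_proxy_upstreams_healthy\{upstream="([^"]+)"\}\s+([0-9.eE+-]+)$'
)


def unhealthy_upstreams(metrics_text):
    """Return the upstream addresses whose health gauge is 0."""
    down = []
    for line in metrics_text.splitlines():
        match = SAMPLE.match(line.strip())
        if match and float(match.group(2)) == 0:
            down.append(match.group(1))
    return down


sample = """\
caddy_reverse_proxy_upstreams_healthy{upstream="123.123.123.123:1337"} 1
caddy_reverse_proxy_upstreams_healthy{upstream="123.123.124.123:1338"} 0
caddy_reverse_proxy_upstreams_healthy{upstream="123.123.125.123:1339"} 1
"""

print(unhealthy_upstreams(sample))  # ['123.123.124.123:1338']
```

In practice the text would come from scraping the /metrics endpoint rather than from a string literal, and a Prometheus alert rule on the same gauge is probably the more usual way to do this.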

farhan-ct commented 1 year ago

Yes, I enabled that and am able to see the metrics you just mentioned. However, in https://github.com/caddyserver/caddy/issues/4309#issuecomment-1353007832 you can see that @rugk is able to get some other metrics starting with caddyhttp..., covering requests and responses. I am struggling to figure out how to get those.

Also, PR https://github.com/caddyserver/caddy/pull/3709 has sample code in its introduction that provides the caddy_http_request* metrics, which I am unable to find in my v2.6.2 (latest) version.

@hairyhenderson

Also, I realised that the metrics I want are available in v2.5.2.

According to the docs, enabling metrics in v2.6.2 should provide these metrics, but it doesn't.

hairyhenderson commented 1 year ago

@farhan-ct please read the 2.6 release notes - metrics need to be explicitly enabled now: https://github.com/caddyserver/caddy/releases/tag/v2.6.0
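A quick way to tell whether the HTTP metrics are actually being exposed after changing the config is to look at which metric names appear in the /metrics output. The sketch below is only an illustration: the `caddy_http_` prefix is assumed from the `caddy_http_request*` names mentioned above, the helper names are hypothetical, and the two sample strings are made up for the demonstration rather than copied from a real server.

```python
import re

# A sample line starts with the metric name, optionally followed by {labels}.
NAME = re.compile(r"^([a-zA-Z_:][a-zA-Z0-9_:]*)")


def metric_names(metrics_text):
    """Collect the metric names found in Prometheus exposition text."""
    names = set()
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = NAME.match(line)
        if match:
            names.add(match.group(1))
    return names


def has_http_metrics(metrics_text, prefix="caddy_http_"):
    """True if any metric name starts with the assumed HTTP prefix."""
    return any(name.startswith(prefix) for name in metric_names(metrics_text))


# Illustrative inputs only, not real server output.
admin_only = 'caddy_admin_http_requests_total{code="200"} 3\n'
with_http = admin_only + 'caddy_http_requests_total{server="srv0"} 7\n'

print(has_http_metrics(admin_only))  # False
print(has_http_metrics(with_http))   # True
```

If the check comes back false on 2.6 or later, that is consistent with the metrics not having been explicitly enabled as described in the release notes.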

DBarney commented 1 week ago

I realize that it has been a few years since this was opened, but it would be extremely useful to be able to break down which upstream was used to generate the response.

I have a config with multiple reverse_proxy blocks, and unfortunately they all get lumped together in the metrics, which makes them fairly useless. There is no way to identify which group of upstreams is having problems and generate alerts on it.

I would even settle for being able to tag each reverse_proxy block with a specific label. That way I could separate the metrics at least a little.