grafana / beyla

eBPF-based autoinstrumentation of web applications and network metrics
https://grafana.com/oss/beyla-ebpf/
Apache License 2.0

Support HTTP host header in metrics #1270

Open Stono opened 1 day ago

Stono commented 1 day ago

Following on from this thread in Slack: https://grafana.slack.com/archives/C05T4PW9E85/p1729578591123529, I think it would be extremely useful to capture the Host for requests.

For context: I want metrics for requests to hosts that are not on my cluster and so cannot be enriched with Kubernetes metadata. As an example, when connecting to Google, this is what you get:

http_client_request_body_size_bytes_bucket{http_request_method="GET",http_response_status_code="404",http_route="/karl",k8s_cluster_name="",k8s_daemonset_name="",k8s_deployment_name="istio-test-app-1",k8s_namespace_name="istio-test-app-1",k8s_node_name="gke-delivery-platform-normal-20241009-6f705856-hz87",k8s_pod_name="istio-test-app-1-7f9b5f8d4f-x257q",k8s_pod_start_time="2024-10-22 06:20:00 +0000 UTC",k8s_pod_uid="acd4fe95-afae-43ed-90d3-48aa2e819456",k8s_replicaset_name="istio-test-app-1-7f9b5f8d4f",k8s_statefulset_name="",server_address="142.251.31.105",server_port="443",service_name="istio-test-app-1",service_namespace="istio-test-app-1",target_instance="",url_path="/karl",le="0"} 0

server_address isn't particularly useful here; what would be useful is host: www.google.com. This could be extracted from the HTTP request itself, so it wouldn't require any reverse lookups, and it would make this metric significantly more useful.
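To illustrate what "extracted from the HTTP request itself" could look like, here is a minimal Python sketch. It is not how Beyla parses requests; `extract_host` is a hypothetical helper, and it assumes the plaintext bytes of an HTTP/1.x request are already available (HTTP/2 carries the same information in the `:authority` pseudo-header instead).

```python
from typing import Optional


def extract_host(raw_request: bytes) -> Optional[str]:
    """Return the Host header value from a raw HTTP/1.x request, or None.

    Hypothetical helper for illustration only; not part of Beyla.
    """
    # Only the header section (before the first blank line) is relevant.
    head, _, _ = raw_request.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    for line in lines[1:]:  # skip the request line, e.g. "GET /karl HTTP/1.1"
        name, sep, value = line.partition(b":")
        # Header names are case-insensitive.
        if sep and name.strip().lower() == b"host":
            # Host names are case-insensitive too, so normalise to lower case.
            return value.strip().decode("ascii", errors="replace").lower()
    return None


request = b"GET /karl HTTP/1.1\r\nHost: www.google.com\r\nAccept: */*\r\n\r\n"
print(extract_host(request))
```

The value comes straight from the request, so no DNS reverse lookup of `server_address` is involved. It is also caller-controlled, which matters for label cardinality.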

I've been evaluating https://www.groundcover.com too, which does a similar thing (eBPF introspection of requests) and they capture the host (as well as a few other things) - it's a significant improvement.

Many thanks!

grcevski commented 8 hours ago

Thanks for the suggestion @Stono! I think we can provide a generic way for people to specify which headers they want. There's a similar issue open here: https://github.com/grafana/beyla/issues/1176, about VirtualHost.

One thing to note: using the host directly can expose you to a cardinality explosion from malicious activity. Since we have no control over what's in that field, unlike an IP address, people can send random things and those will end up in your metrics database.
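One common way to bound that risk is to cap the number of distinct host values that may become label values and fold everything else into a single overflow value. The sketch below is only an illustration of that idea; `HostLabelLimiter`, `max_hosts` and the `"other"` overflow value are hypothetical names, not existing Beyla options.

```python
class HostLabelLimiter:
    """Caps how many distinct hosts may appear as a metric label value.

    Hypothetical illustration; not an existing Beyla feature.
    """

    def __init__(self, max_hosts: int, overflow_label: str = "other"):
        self.max_hosts = max_hosts
        self.overflow_label = overflow_label
        self._seen = set()

    def label_for(self, host: str) -> str:
        # Normalise so "Example.com" and "example.com" share one series.
        host = host.strip().lower()
        if host in self._seen:
            return host
        if len(self._seen) < self.max_hosts:
            self._seen.add(host)
            return host
        # Past the cap, random or malicious hosts collapse into one label value.
        return self.overflow_label


limiter = HostLabelLimiter(max_hosts=2)
for h in ["www.google.com", "api.example.com", "random-1.invalid", "www.google.com"]:
    print(h, "->", limiter.label_for(h))
```

A first-come cap like this keeps the series count bounded, but it cannot tell legitimate hosts from junk; an explicit allowlist would be the stricter alternative.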

Stono commented 8 hours ago

Hey, thanks for the reply. Yeah I'm aware of the cardinality risk 👍 fortunately my use case here is internally constrained systems and on top of that we use a tiered Prometheus setup where we ingest and roll up using recording rules before storing longer term in another instance.

You currently include path by default which feels, in my opinion, even more open to (accidental) cardinality explosions than host!

Stono commented 8 hours ago

Would you like me to close this issue in favour of the other?

grcevski commented 7 hours ago

Yes, you are absolutely right about the paths, especially with 404s. Let's keep this issue open for now since the other one is about VirtualHost; that allows us to provide a generic solution for both.

Stono commented 5 hours ago

Whilst we're on the subject of cardinality, I noticed all the HTTP metrics are histograms. This is a challenge because, for example, if I wanted to keep "path" in there because I want to see request counts by path (but am not interested in size or response time by path), I get an explosion of path*buckets metrics.

Something https://istio.io/ do, which works well, is two separate metrics:

Might be something to consider to help users slice accordingly to manage cardinality.
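To put rough numbers on the path*buckets effect, here is a small Python sketch of the series arithmetic. It assumes the split in question is a plain counter that keeps the high-cardinality label alongside a histogram that drops it; the path, label and bucket counts are made-up inputs, not Beyla defaults.

```python
def histogram_series(label_combinations: int, bucket_count: int) -> int:
    """Series produced by one Prometheus histogram.

    Each label combination yields one series per bucket (bucket_count
    includes the +Inf bucket) plus the _sum and _count series.
    """
    return label_combinations * (bucket_count + 2)


def counter_series(label_combinations: int) -> int:
    """Series produced by one counter: one per label combination."""
    return label_combinations


# Illustrative, made-up numbers.
paths = 200          # distinct url_path / http_route values
other_labels = 5     # combinations of method, status code, and so on
buckets = 15         # bucket series per label combination, including +Inf

# Path kept as a label on the histogram.
combined = histogram_series(paths * other_labels, buckets)

# Path kept only on a request counter; the histogram drops it.
split = counter_series(paths * other_labels) + histogram_series(other_labels, buckets)

print(combined, split)
```

With these inputs, the single histogram costs 17000 series, while the counter-plus-histogram split costs 1085. Request counts by path stay available, at the price of losing per-path latency and size distributions.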