grafana / k6

A modern load testing tool, using Go and JavaScript - https://k6.io
GNU Affero General Public License v3.0

High Cardinality Metrics - Prometheus Remote Write (experimental-prometheus-rw) #3761

Open jameshounshell opened 4 months ago

jameshounshell commented 4 months ago

Feature Description

When using the Prometheus Remote Write output, it is currently not possible to limit which labels are included in the series sent to Prometheus. The full URL is included in the k6_http_-prefixed metrics, and the unique IDs randomly generated by some of our developers' tests make the cardinality explode. This leads to poor performance when querying, or when the developers use the k6 Grafana dashboard for time spans longer than a few minutes.

For example, URLs like this one result in high cardinality within Prometheus: http://redacted.redacted.svc.cluster.local:8080/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78

I could ask the developers to use a smaller, fixed set of IDs, but this doesn't seem feasible in the long run.

Suggested Solution (optional)

I'm looking for something similar to Prometheus' relabel config functionality (regex matching and manipulation), or for the k6 library to have some way to mark/sanitize URLs containing unique IDs (e.g. OpenTelemetry tracing does this automatically based on common HTTP server frameworks).

Alternatively, even some command-line flags to drop certain labels from the emitted metrics would be helpful.
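
To illustrate the kind of regex-based sanitizing meant above, here is a minimal sketch in plain JavaScript. It is not an existing k6 feature: `RULES` and `sanitizeUrl` are hypothetical names, and the two patterns (UUIDs and purely numeric path segments) are assumptions about what the unique IDs look like.

```javascript
// Hypothetical relabel-style rules: each match is replaced by a fixed placeholder.
const RULES = [
  // UUIDs such as 2b38e5b6-7e72-41a2-9f4d-9f2e51787e78
  [/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi, '{uuid}'],
  // Purely numeric path segments such as /orders/12345
  [/\/\d+(?=\/|$)/g, '/{id}'],
];

// Returns the URL with its query string and fragment dropped and every
// rule applied in order, so many distinct URLs collapse into one label value.
function sanitizeUrl(url) {
  const withoutQuery = url.split(/[?#]/)[0];
  return RULES.reduce(
    (acc, [pattern, placeholder]) => acc.replace(pattern, placeholder),
    withoutQuery
  );
}

// Two requests with different random IDs end up with the same label value.
const labels = new Set([
  sanitizeUrl('http://example.test/api/v1/applications/2b38e5b6-7e72-41a2-9f4d-9f2e51787e78'),
  sanitizeUrl('http://example.test/api/v1/applications/0f8fad5b-d9cb-469f-a165-70867728950e'),
]);
console.log(labels.size, [...labels]);
```

If something like this ran inside k6 before the samples are emitted, the number of series would be bounded by the number of endpoints rather than by the number of generated IDs.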

Already existing or connected issues / PRs (optional)

No response

Edit/Update

jameshounshell commented 4 months ago

We've arrived at a solution where we use URL grouping with the tag function.
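
For anyone landing here, a rough sketch of what URL grouping looks like in a test script: the request is given a fixed `name` tag instead of letting each unique URL become its own value. `getApplication` and the `{id}` placeholder are made up for illustration, and `http` is passed in as a parameter only so the helper has no direct k6 dependency; in a real script it would be the module from `import http from 'k6/http'`.

```javascript
// Hypothetical helper: requests one application by ID, but tags the request
// with a fixed name so every ID is grouped under the same value.
// `http` is expected to be the k6 HTTP module (import http from 'k6/http').
function getApplication(http, baseUrl, id) {
  return http.get(`${baseUrl}/api/v1/applications/${id}`, {
    // The `name` tag is what k6 URL grouping relies on.
    tags: { name: `${baseUrl}/api/v1/applications/{id}` },
  });
}

// Stand-in for the k6 module so the sketch can be run outside k6.
const fakeHttp = {
  get(url, params) {
    return { url, params };
  },
};

const res = getApplication(
  fakeHttp,
  'http://example.test',
  '2b38e5b6-7e72-41a2-9f4d-9f2e51787e78'
);
console.log(res.url, res.params.tags.name);
```

How the `url` label behaves next to an explicit `name` tag may depend on the k6 version, so it is worth checking what actually arrives in Prometheus.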

This is good enough, but it leaves it to the developers to implement. I can imagine many will forget, and we'll still have to chase down the offending k6 test and deal with the fallout of whatever cardinality explosion happens before we catch it.

I'd still like it if there were a way to control this functionality with an environment variable (or config file) so that we, as the platform engineering team, can enforce it with a Kyverno policy, etc. We manage an internal k6 Helm chart for the developers, so we'd have the opportunity to set it there.

artem-zherdiev-ingio commented 2 weeks ago

We seem to have a similar problem. We are using browser tests with Prometheus RW, which sends a lot of URL metrics, so URL relabeling is absolutely necessary. Grouping tags in the browser seems to be impossible, and remote write_relabel_config doesn't work very well: some URLs are relabeled and some are not (even with the correct regexes).

It would be great to have relabeling in k6.