Open Lusitaniae opened 2 months ago
Alternatively, this could be moved into an external exporter that gathers network-wide metrics from a single place,
because every NEAR node we run exposes the same duplicate metrics (if we had 100 nodes, we'd have 100x the exact same metrics everywhere).
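To put rough numbers on the duplication: each scraped node contributes its own copy of every per-validator series, so the backend ends up storing the product of nodes, validators, and stats. A back-of-the-envelope sketch, where the validator and stat counts are made-up figures for illustration, not measurements:

```python
def total_series(nodes: int, validators: int, stats_per_validator: int) -> int:
    """Series stored when every node exports every validator's stats."""
    return nodes * validators * stats_per_validator


# Hypothetical figures: 100 validators, 4 stats each.
per_node = total_series(nodes=1, validators=100, stats_per_validator=4)
fleet = total_series(nodes=100, validators=100, stats_per_validator=4)
print(per_node, fleet)  # 400 40000
```

With a single external exporter the node factor drops back to 1.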
Where are you getting the cardinality screenshot from? It might be useful to keep a reference handy for this.
Though I imagine we could also implement a cardinality check in neard itself, e.g. at the time when those metrics are gathered in order to respond to a GET /metrics.
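A minimal sketch of what such a check could look like, written in Python purely to illustrate the logic (it is not neard code). It counts sample lines per metric name in the text exposition format and reports names above a threshold. The metric names and the limit are hypothetical, and the parsing is deliberately naive: it treats histogram `_bucket`, `_sum` and `_count` series as separate names.

```python
from collections import Counter


def series_per_metric(exposition: str) -> Counter:
    """Count sample lines per metric name in Prometheus text exposition output."""
    counts = Counter()
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or space (no labels).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name] += 1
    return counts


def over_limit(exposition: str, limit: int) -> dict:
    """Return the metric names whose series count exceeds the limit."""
    return {name: n for name, n in series_per_metric(exposition).items() if n > limit}


# Hypothetical scrape output, not real neard metrics.
SAMPLE = """\
# TYPE example_stat gauge
example_stat{account_id="a.near",kind="x"} 1
example_stat{account_id="a.near",kind="y"} 2
example_stat{account_id="b.near",kind="x"} 3
example_height 42
"""

print(over_limit(SAMPLE, limit=2))  # {'example_stat': 3}
```

The same count could be logged or exported as its own gauge so that cardinality growth shows up before it becomes a storage problem.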
The dashboard is from vmui https://docs.victoriametrics.com/#vmui (the web UI of VictoriaMetrics, a Prometheus-compatible monitoring system)
There are also projects like https://github.com/thought-machine/prometheus-cardinality-exporter to monitor this
I think in the end a near_exporter that provides network-wide metrics is probably best
Describe the bug
In Prometheus-based monitoring systems, metrics with high cardinality (a large number of unique label combinations) cause problems.
To Reproduce
When we pull metrics, we get something like this for each validator:
This is highlighted in the screenshot below as having high cardinality (manageable for now).
Expected behavior
num_expected_blocks, num_expected_chunks, num_produced_blocks, and num_produced_chunks should each be their own metric instead of a label:
near_validator_expected_blocks{account_id="01node.poolv1.near"} 112
near_validator_expected_chunks{account_id="01node.poolv1.near"} 704
near_validator_produced_blocks{account_id="01node.poolv1.near"} 112
near_validator_produced_chunks{account_id="01node.poolv1.near"} 703
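As a rough illustration of the proposed layout, the sketch below renders per-validator stats as one metric per stat, labelled only by `account_id`. It is an assumption-laden example, not neard or exporter code: the function name and input shape are hypothetical, and it assumes the first stat is the expected-block count (`num_expected_blocks`).

```python
def render_validator_metrics(stats_by_account: dict) -> str:
    """Emit one metric per stat, labelled only by account_id."""
    lines = []
    for account_id, stats in sorted(stats_by_account.items()):
        for stat, value in stats.items():
            # e.g. "num_produced_blocks" -> "near_validator_produced_blocks"
            suffix = stat[len("num_"):] if stat.startswith("num_") else stat
            lines.append(f'near_validator_{suffix}{{account_id="{account_id}"}} {value}')
    return "\n".join(lines) + "\n"


# Values taken from the example in this issue; the input shape is hypothetical.
stats = {
    "01node.poolv1.near": {
        "num_expected_blocks": 112,  # assumed name for the first stat
        "num_expected_chunks": 704,
        "num_produced_blocks": 112,
        "num_produced_chunks": 703,
    }
}
print(render_validator_metrics(stats), end="")
```

Each metric then has exactly one series per validator, and the stat kind no longer multiplies the label combinations of a single metric name.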
Screenshots
Additional context
https://docs.victoriametrics.com/faq/#what-is-high-cardinality