rancher / qa-tasks

List of QA Backlog

Spike for investigating Prometheus metrics export with relabelling #1411

Closed git-ival closed 2 months ago

git-ival commented 2 months ago

Beyond simply exporting Prometheus metrics, we are interested in how to fine-tune which metrics we track and how best to collect them. We have determined that reducing metrics cardinality, by removing unused, irrelevant, or unneeded metrics from our dataset, can be very helpful in reducing noise. Additionally, we can use Mimir alongside Prometheus for long-term storage of metrics. Mimir comes with mimirtool, a helpful CLI tool and Go package that can export metric queries as TSDB blocks, which can then be imported/uploaded to Mimir or another Prometheus/Grafana instance; alternatively, we can configure remote-write.
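To make the cardinality idea concrete, here is a minimal sketch in Python. It is not taken from our tooling: the sample series, the function names, and the `go_.*` pattern are all hypothetical and chosen only for illustration. It counts series per metric name, then mimics the effect of a name-based drop rule in the style of Prometheus relabelling. The one Prometheus behavior it deliberately mirrors is that relabel regexes are fully anchored, so a pattern must match the whole metric name.

```python
import re
from collections import Counter

# Hypothetical sample of time series, each represented as a dict of labels.
SERIES = [
    {"__name__": "http_requests_total", "pod": "a", "code": "200"},
    {"__name__": "http_requests_total", "pod": "a", "code": "500"},
    {"__name__": "http_requests_total", "pod": "b", "code": "200"},
    {"__name__": "go_gc_duration_seconds", "pod": "a", "quantile": "0.5"},
    {"__name__": "go_gc_duration_seconds", "pod": "b", "quantile": "0.5"},
    {"__name__": "up", "pod": "a"},
]


def cardinality_by_metric(series):
    """Count how many distinct series exist for each metric name."""
    return Counter(s["__name__"] for s in series)


def drop_by_name(series, pattern):
    """Return the series whose metric name does NOT fully match `pattern`.

    fullmatch() is used because Prometheus anchors relabel regexes at both
    ends; a partial match is not enough to trigger a drop.
    """
    rx = re.compile(pattern)
    return [s for s in series if rx.fullmatch(s["__name__"]) is None]


if __name__ == "__main__":
    print("before:", dict(cardinality_by_metric(SERIES)))
    kept = drop_by_name(SERIES, r"go_.*")
    print("after: ", dict(cardinality_by_metric(kept)))
```

In a real setup the equivalent filtering would live in the Prometheus scrape or remote-write configuration rather than in a script; the sketch is only meant to show how dropping a family of unneeded metrics shrinks the series count.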

Ultimately, we have determined that it is likely best to set up a long-lived Mimir instance to act as the historical datastore for our efforts. This is not a small task and may take a long time to properly implement and tune to our needs, but it will let us share results and findings in a more readily available and consumable form.

The following additional tasks have been created as a result of this research:

I have compiled a list of helpful links that I utilized in my research below.

How to reduce metrics cardinality:

Metrics relabelling resources:

Remote-write resources: