Open mkovsher opened 1 year ago
Hi.
Is your Prometheus configured to scrape "native histograms" (--enable-feature=native-histograms)? In that case Prometheus will ignore the "classic" histogram with the same name (cortex_request_duration_seconds in this case). In Prometheus version 2.45.0 and later you can enable scraping of "classic" histograms too by setting the scrape_classic_histograms option in the scrape config section of your Prometheus config file.
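For illustration, a minimal sketch of a scrape config that keeps the classic series alongside the native histogram. The job name and target are hypothetical placeholders, not taken from this issue; only the scrape_classic_histograms option and the feature flag come from the discussion above.

```yaml
# prometheus.yml (fragment). Assumes Prometheus 2.45.0 or later, started with:
#   --enable-feature=native-histograms
scrape_configs:
  - job_name: mimir-ingester               # hypothetical job name
    # Also ingest the classic _bucket/_count/_sum series
    # for histograms that are exposed in both forms.
    scrape_classic_histograms: true
    static_configs:
      - targets: ["ingester.example:8080"] # hypothetical target
```

The option is set per scrape job, so it has to be repeated for every job whose classic histograms are still needed.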
Hi. Thanks for the quick response.
Yes, my Prometheus has the native-histograms feature enabled.
I've conducted several tests based on your recommendation:
1. MimirPlay: it works. I received both types of histograms (classic and new). I tested on v2.8-2.9.
I added --enable-feature=native-histograms to the Prometheus command line and added scrape_classic_histograms: true to the job in the Prometheus config.
HOWEVER...
2. Manual.
My Prometheus. I added scrape_classic_histograms: true manually to each Mimir job in the scrape_configs section, but didn't get the classic metric as expected, only the new one. (((
3. Helm.
I added the option scrape_classic_histograms: true to the additionalScrapeConfigs section for each job added by Prometheus-operator and got an error:
failed to reload config: couldn't load configuration (--config.file="/etc/prometheus/config_out/prometheus.env.yaml"): parsing YAML file /etc/prometheus/config_out/prometheus.env.yaml: found multiple scrape configs with job name "serviceMonitor/mimir/grafana-mimir-alertmanager/0"
I've looked at the generated config file and I have 2 entries for each Mimir job:
additionalScrapeConfigs section.

Perhaps you know the answer:
Why didn't it work when I added this option manually to jobs?
This option is only supported since Prometheus 2.45.0. Do you use this version everywhere?
How can I add this parameter to Helm if I use ServiceMonitor?
I'm not very familiar with Helm, and I don't know the answer to this.
Maybe there is a way to specify this parameter globally in Prometheus?
I haven't seen such an option. I would suggest opening an issue about it in Prometheus. Another option is to disable native histograms in Prometheus. It's a new experimental feature, and if you don't use it yet, it may be better to disable it for now.
Why was this metric not converted in version 2.8, but started to be converted in version 2.9?
Mimir 2.9 started exporting cortex_request_duration_seconds as a native histogram too. However, it's up to the client, such as Prometheus, to decide whether it will scrape native histograms or not.
This option is only supported since Prometheus 2.45.0. Do you use this version everywhere?
Yes, we use version 2.45.0 everywhere.
I haven't seen such an option. I would suggest opening an issue about it in Prometheus. Another option is to disable native histograms in Prometheus. It's a new experimental feature, and if you don't use it yet, it may be better to disable it for now.
I will consider this option (disabling the feature) :-).
Mimir 2.9 started exporting cortex_request_duration_seconds as a native histogram too. However, it's up to the client, such as Prometheus, to decide whether it will scrape native histograms or not.
OK. It only remains to configure Prometheus.
Thanks.
If Mimir sends the native histograms, will the dashboards be modified with this in mind?
Mimir currently exposes a single histogram as a "native histogram": cortex_request_duration_seconds. In my opinion, the feature needs to be widely deployed and no longer marked as experimental in Prometheus and Mimir before we start using native histograms more widely.
After upgrading from version 2.8 (helm 4.4.1) to 2.9 (helm 5.0.0), some dashboards don't work.
Prometheus converts the "cortex_request_duration_seconds_count" metric to "cortex_request_duration_seconds{count:n sum:...)".
To Reproduce
Upgrade Mimir 2.8 (helm 4.4.1) to 2.9 (helm 5.0.0). Go to the Mimir / Writes dashboard.
Metrics from ingester pod:
cortex_request_duration_seconds_count{method="GET",route="ready",status_code="200",ws="false"} 1078
Metric from Prometheus:
Expected behavior
Should be in Prometheus:
cortex_request_duration_seconds_count{cluster="grafana-mimir", container="ingester", endpoint="http-metrics", instance="10.0.12.179:8080", job="mimir/ingester", method="GET", namespace="mimir", pod="grafana-mimir-ingester-zone-a-0", route="metrics", service="grafana-mimir-ingester-zone-a", status_code="200", ws="false"}
Environment
Kubernetes: 1.26
Mimir: 2.9 (helm chart 5.0.0)
Prometheus: 2.45.0
Grafana: v10.0.1