thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

Fix ExternalLabels() for Prometheus v3.0 #7893

Closed simonpasquier closed 2 weeks ago

simonpasquier commented 2 weeks ago

Changes

Prometheus v3.0.0-rc.0 introduces a new scrape protocol (PrometheusText1.0.0) that is included by default in the global configuration. This breaks the Thanos sidecar when it retrieves the external labels.

This change replaces the Prometheus GlobalConfig struct with a minimal struct that unmarshals only the external_labels key.

See also https://github.com/prometheus-operator/prometheus-operator/issues/7078
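The idea behind the change can be illustrated with a small, self-contained Go sketch. It is not the code from this PR: it assumes the failure comes from a config struct that validates scrape protocol names it does not recognise, and it uses encoding/json as a stand-in for YAML so that it runs with the standard library only. All type and function names below (ScrapeProtocol, strictGlobal, minimalConfig, ExternalLabelsStrict, ExternalLabelsMinimal) are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// knownProtocols mimics a client built before PrometheusText1.0.0 existed.
var knownProtocols = map[string]bool{
	"OpenMetricsText1.0.0": true,
	"OpenMetricsText0.0.1": true,
	"PrometheusText0.0.4":  true,
}

// ScrapeProtocol is a hypothetical validated type: decoding rejects
// any protocol name that is not in knownProtocols.
type ScrapeProtocol string

func (p *ScrapeProtocol) UnmarshalJSON(b []byte) error {
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return err
	}
	if !knownProtocols[s] {
		return fmt.Errorf("unknown scrape protocol %q", s)
	}
	*p = ScrapeProtocol(s)
	return nil
}

// strictGlobal stands in for a full, validating global config struct.
type strictGlobal struct {
	ScrapeProtocols []ScrapeProtocol  `json:"scrape_protocols"`
	ExternalLabels  map[string]string `json:"external_labels"`
}

// minimalConfig decodes only global.external_labels; every other key
// in the document is ignored, so new values elsewhere cannot break it.
type minimalConfig struct {
	Global struct {
		ExternalLabels map[string]string `json:"external_labels"`
	} `json:"global"`
}

// ExternalLabelsStrict decodes the whole global section, including validation.
func ExternalLabelsStrict(raw []byte) (map[string]string, error) {
	var cfg struct {
		Global strictGlobal `json:"global"`
	}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return cfg.Global.ExternalLabels, nil
}

// ExternalLabelsMinimal decodes only the external labels.
func ExternalLabelsMinimal(raw []byte) (map[string]string, error) {
	var cfg minimalConfig
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return cfg.Global.ExternalLabels, nil
}

func main() {
	// A config that lists a protocol the strict decoder does not know.
	raw := []byte(`{"global":{"scrape_protocols":["PrometheusText1.0.0","PrometheusText0.0.4"],"external_labels":{"cluster":"eu1"}}}`)

	if _, err := ExternalLabelsStrict(raw); err != nil {
		fmt.Println("strict decode failed:", err)
	}
	labels, err := ExternalLabelsMinimal(raw)
	fmt.Println("minimal decode:", labels, err)
}
```

The minimal struct trades validation of the rest of the configuration for forward compatibility: the caller only needs the external labels, so fields it never reads cannot cause a decode error.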

Verification

Unit test added.

simonpasquier commented 2 weeks ago

cc @saswatamcode

simonpasquier commented 2 weeks ago

> I wonder if bumping the Prometheus version in our e2e tests also makes sense (probably good to do once v3 is fully released)?

What we've done for Prometheus operator (and this is how we found the issue) is to configure a periodic workflow that runs all our e2e tests against the latest v3.0.0 release (currently rc.0). It gives an early signal of potentially breaking changes.

saswatamcode commented 2 weeks ago

Yup, I think we might need something similar! In any case, merging this fix.

gebn commented 6 days ago

Heads up - Prometheus v3.0.0 is now out, so folks using the latest Thanos release v0.36.1 will start to see Sidecar breaking.

A workaround is to remove the new PrometheusText1.0.0 from the default list of scrape protocols:

global:
  ...
  scrape_protocols:
  - OpenMetricsText1.0.0
  - OpenMetricsText0.0.1
  - PrometheusText0.0.4