thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0
[Question] dropping store, external labels are not unique #764

Closed eahydra closed 5 years ago

eahydra commented 5 years ago

Thanos, Prometheus and Golang version used

thanos, version 0.2.1-master (branch: master, revision: 1cd9ddd14999d6b074f34a4328e03f7ac3b7c26a)

prometheus, version 2.5.0 (branch: release/20181129-11-33, revision: d44536a0186ff59526f500cd964bbfdb025a8d98)

What happened

I have two Prometheus instances (A, B), sharded by target IP. I want to add two more Prometheus instances (A', B') with the same configuration as A and B, but with different external labels. I am not sure how to configure the external labels.

I tried giving A' the same external labels as A, like this:

  external_labels:
    replica: 0  

but got the warning log "dropping store, external labels are not unique" from Thanos Query.

I then tried adding another external label, like this:

  external_labels:
    replica: 0  
    cluster: slave

The warning disappeared, but a query such as count(up) now returns doubled values. Checking the query result, every point has the cluster label.

The architecture looks like this: [image]

What you expected to happen

I want to run replica Prometheus instances behind Thanos Query and still get correct values. I think what I want is HA for Thanos Store.

jojohappy commented 5 years ago

Hi, thanks for reporting this.

Maybe there is a configuration mistake. Could you provide the value of the --query.replica-label flag you set for the Query component?

Thanos uses the replica label set in external_labels to decide which series should be deduplicated, so make sure that label is used as a replica indicator alone.

For your case, you can use these settings:

eahydra commented 5 years ago

@jojohappy Thanks for your reply.

Maybe you made a mistake, could you provide the value of --query.replica-label flag you setup for Query component?

I have set --query.replica-label to 'replica'.

My architecture looks like this: [image]

As the picture shows, the master and slave Prometheus instances scrape the same targets, and I want the slaves to act as replicas of the masters. When handling a query, I want to read from the master and fall back to the slave only if a sidecar or Prometheus instance is unavailable.

Does Thanos support this architecture?

jojohappy commented 5 years ago

I'm afraid Thanos doesn't support the failover scheme in your architecture. Thanos fetches series from all reachable stores, regardless of whether they are HA components.

Some questions about your architecture:

eahydra commented 5 years ago

  • What is the difference between prometheus1 and prometheus2?

prometheus1 and prometheus2 each store metrics for a separate subset of the targets, split by the Prometheus sharding mechanism.

  • What is the difference between shard 0 and shard 1?

Same as above.

  • Why are there 4 Prometheus instances scraping the same targets? I think two Prometheus instances might be enough. They would also be an HA pair.

Hundreds of thousands of targets are scraped by Prometheus, so two Prometheus instances are not enough for us.

jojohappy commented 5 years ago

@eahydra Thanks!

I think your architecture is right! But you don't need to set up master/slave pairs; Thanos treats every store the same way.

So just use Thanos Query to query the series you want from all Prometheus instances. You do not need to worry about deduplication and merging as long as your external_labels are correct. Deduplicating and merging metrics collected from Prometheus HA pairs is a key feature of Thanos.

You can also visit the /stores page of the Query UI to check the health of the stores (Prometheus instances).
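
To make "correct external_labels" more concrete, here is a minimal sketch that is not taken from this thread. It assumes Thanos Query runs with --query.replica-label=replica; the `shard` label name and all label values are hypothetical. The idea is that the two copies scraping the same targets share every external label except `replica`, and no two instances end up with an identical label set.

```yaml
# Sketch only: each YAML document below stands for the `global` section of one
# Prometheus instance's prometheus.yml. The `shard` label is a hypothetical
# name; `replica` is the label assumed to be passed to --query.replica-label.

# Shard 0, first copy
global:
  external_labels:
    shard: "0"
    replica: master
---
# Shard 0, second copy: same targets, differs from the first only in `replica`
global:
  external_labels:
    shard: "0"
    replica: slave
---
# Shard 1, first copy
global:
  external_labels:
    shard: "1"
    replica: master
---
# Shard 1, second copy
global:
  external_labels:
    shard: "1"
    replica: slave
```

With a layout like this, every label set is unique, so no store should be dropped, and the two copies of a shard become identical once the replica label is ignored. An extra label that exists on only one copy (such as the cluster: slave label tried earlier) keeps the copies distinct, which is a likely reason count(up) returned doubled values.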

jojohappy commented 5 years ago

For your case, settings suggestion:

eahydra commented 5 years ago

@jojohappy Thanks for your help.

I solved the issue by giving each instance a different replica value, like this:

  external_labels:
    replica: master_0

  external_labels:
    replica: master_1

  external_labels:
    replica: slave_0

  external_labels:
    replica: slave_1

bwplotka commented 5 years ago

:+1:

aviralharsh commented 2 years ago

Hi @bwplotka @jojohappy, I am facing an issue when all my Prometheus replicas use the same ConfigMap. All Prometheus replicas have the same external_labels, and Thanos fails with the "dropping store, external labels are not unique" error. Is there any way to assign unique/dynamic external_labels to each replica? I am using this Helm chart: https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus

hanjm commented 2 years ago

This feature lets us use an environment variable such as POD_NAME as an external label: https://prometheus.io/docs/prometheus/latest/feature_flags/#expand-environment-variables-in-external-labels
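
A minimal sketch of how that could look on Kubernetes, assuming Prometheus v2.27 or later started with --enable-feature=expand-external-labels. The container name, config path, and the `replica` label name are illustrative assumptions, and how these fragments map onto a particular Helm chart's values is not shown here.

```yaml
# Fragment 1: container spec of the Prometheus pod (names are illustrative).
containers:
  - name: prometheus
    args:
      - --config.file=/etc/prometheus/prometheus.yml
      - --enable-feature=expand-external-labels
    env:
      # Downward API: expose the pod's own name as an environment variable.
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
---
# Fragment 2: the shared prometheus.yml kept in the ConfigMap.
global:
  external_labels:
    # Expanded separately in each pod, so every replica gets its own value.
    replica: ${POD_NAME}
```

Because pod names are unique within a namespace, each replica would report a distinct `replica` value even though all pods mount the same ConfigMap.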

aviralharsh commented 2 years ago

Thanks for addressing this, @hanjm. This seems to be a feature in v2.27+. Can it be used with version 2.20? If not, are there any alternatives?

hanjm commented 2 years ago

No. An alternative is to run confd (with the file backend) as a sidecar to render the Prometheus config; confd has a getenv template function. https://github.com/kelseyhightower/confd/blob/master/docs/templates.md#getenv
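
A rough sketch of the confd approach, assuming POD_NAME is set in the confd sidecar's environment (for example via the Kubernetes Downward API). Only the template is shown; the confd template resource that points at it and the way Prometheus is reloaded after rendering are left out. The file name and the `replica` label name are assumptions.

```yaml
# prometheus.yml.tmpl: a confd template (Go template syntax) that renders to
# prometheus.yml. It is only valid YAML after confd has rendered it.
global:
  external_labels:
    # getenv is confd's template function for reading an environment variable.
    replica: {{ getenv "POD_NAME" }}
```

The rendered file would then contain a literal pod name per replica, so this does not depend on the Prometheus feature flag.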