Closed Nevermore328 closed 2 years ago
@Nevermore328 That's a very good question. I don't have an answer, but I can share our setup just in case it helps:
We have two Prometheus servers, each with a Thanos sidecar, and we use thanos-querier on top of them. We have a third, federated Prometheus that gathers metrics from the main Prometheus servers and sends them to prometheus-kafka-adapter. Its config looks like this (simplified for readability):
```yaml
scrape_configs:
  - job_name: 'federate'
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - container_cpu_usage_seconds_total{id="/"}
        - container_memory_usage_bytes{id="/"}
        - container_memory_working_set_bytes{id="/"}
        # add here any other metric you are interested in or "*"
    static_configs:
      - targets:
          - thanos-querier:10902  # gather metrics from the main prometheus via their thanos sidecars
remote_write:
  - url: http://prometheus-kafka-adapter:8080/receive
```
I'm sure there is a lot of room for improvement in this setup. Any ideas from the community are welcome, because I've asked myself the same question many, many times.
@palmerabollo Thanks for your reply, but this solution needs a third Prometheus, so there's another problem: if this third Prometheus runs in single mode, we lose high availability; if it runs in high-availability mode, we have the same duplicated-data problem...
I have the same problem, marking...
I don't see it as wrong to have duplicated data as long as its origin is identified. The key to scaling the "prometheus matcher" federation instance is to give the metrics it produces an identity. This can be achieved with the externalLabels Prometheus configuration property (see https://prometheus.io/docs/prometheus/latest/configuration/configuration/) by either:
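For illustration, a minimal sketch of giving each replica's metrics an identity via `external_labels` in the Prometheus config (the label names `cluster` and `replica` and their values are assumptions, not from this thread):

```yaml
global:
  external_labels:
    cluster: main          # hypothetical: identifies which cluster produced the samples
    replica: prometheus-0  # hypothetical: unique per replica, so duplicates stay distinguishable
```

With a distinct `replica` value on each server, downstream consumers of the Kafka topic can tell the duplicated series apart (or drop one copy) instead of seeing indistinguishable samples.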
Hi @jpfe-tid, @palmerabollo, can we configure remote-write from thanos-querier instead of sending directly from Prometheus? I need to send the deduplicated data to an external Kafka.
In high-availability mode, we have two or more Prometheus servers in a cluster. If we configure remote-write to prometheus-kafka-adapter in each Prometheus server, it may generate a lot of duplicated data. Is there any way to solve this problem? I know Cortex and Thanos a little; they can both deduplicate data, but neither seems to support sending data to Kafka directly.
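For context on how Thanos does that deduplication: the Querier can be told which label distinguishes replicas, and it strips it at query time so duplicated series collapse into one. A minimal sketch of the invocation (the `replica` label name and the sidecar endpoints are assumptions, not from this thread):

```shell
# Assumes each Prometheus replica tags its samples with a unique
# external label, e.g. replica: prometheus-0 / prometheus-1.
thanos query \
  --http-address=0.0.0.0:10902 \
  --query.replica-label=replica \
  --store=prometheus-0-sidecar:10901 \
  --store=prometheus-1-sidecar:10901
```

Note this deduplicates only on the query path; as discussed above, it does not by itself push deduplicated data to Kafka, which is why the federation-plus-remote_write workaround comes up.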