thanos-io / thanos

Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.
https://thanos.io
Apache License 2.0

A tool that works with Thanos to make Prometheus horizontally scalable #3507

Closed RayHuangCN closed 3 years ago

RayHuangCN commented 3 years ago

Thanos is an awesome project. We have used it to monitor our Kubernetes clusters for years. Recently we shared our solution for large-cluster monitoring with anybody who needs it.

Kvass is a Prometheus horizontal auto-scaling solution. It uses a Sidecar to generate, for every Prometheus shard, a special config file that contains only the subset of targets assigned to that shard by the Coordinator. We use Thanos to get the global data view. 😊
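To make the idea concrete, here is a purely illustrative sketch (not necessarily Kvass's actual output format) of the kind of per-shard scrape config such a Sidecar could hand to its Prometheus: the usual job, but with service discovery replaced by a static list of just the targets the Coordinator assigned to this shard. The job name and addresses are made up.

```yaml
# Illustrative only: a per-shard config where service discovery is replaced
# by the static subset of targets assigned to this shard by the Coordinator.
scrape_configs:
  - job_name: kubernetes-pods
    static_configs:
      - targets:
          - 10.0.3.17:9100   # assigned to this shard
          - 10.0.5.42:9100   # assigned to this shard
```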

We have already been using Thanos + Kvass for months to monitor Kubernetes clusters of the following size, with just one Prometheus config file as usual, no federation needed and no hash_mod needed.

bwplotka commented 3 years ago

Cool, this is quite an amazing user story. (:

We tried something similar with @domgreen at some point, and it is sometimes hard to debug where a scrape target is, etc., but for the right use case this might be extremely helpful.

This issue will get lost, so I would love to see this in the form of a blog post, so we can link it somewhere on our website, OR we could add a new page similar to https://thanos.io/tip/operating/reverse-proxy.md/ but for e.g. Use Cases? :thinking:

brancz commented 3 years ago

FWIW Prometheus Operator uses a pretty similar strategy, except it generates a discovery config that distributes the targets via hashmod sharding of the __address__ label and assigns a shard to each Prometheus.

I highly recommend using hashmod sharding, because if you use the coordinator to assign targets, you have no starting point for troubleshooting when something is not discovered, which makes this essentially an unpredictable distributed monitoring system. As a plus, that way you don't need any coordinator, except for distributing the configuration.
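For reference, a minimal sketch of the hashmod-style sharding described above, using standard Prometheus relabel_configs; the shard count, shard number, and job name are placeholders:

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Hash the target address into one of N buckets...
      - source_labels: [__address__]
        modulus: 3            # total number of Prometheus shards (placeholder)
        target_label: __tmp_hash
        action: hashmod
      # ...and keep only the targets that fall into this shard's bucket.
      - source_labels: [__tmp_hash]
        regex: "0"            # this Prometheus instance is shard 0
        action: keep
```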

RayHuangCN commented 3 years ago

@brancz Thanks for your opinion 😊. The Prometheus Operator solution is cool when targets have a similar series scale 👍, but it may lead to load imbalance if targets differ hugely in series count. For example, when the cluster is large, kube-state-metrics may have more than 3,000,000 series; we use kube-state-metrics sharding and run 4 or more kube-state-metrics shards. If we hashmod by __address__, these kube-state-metrics shards may be distributed to the same Prometheus shard and cause OOM. But if we use the Coordinator to explore the scale of each target and then distribute targets according to their real series scale, the load of the Prometheus shards is controllable.
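As background for the kube-state-metrics example above: kube-state-metrics supports built-in horizontal sharding via its --shard/--total-shards flags, roughly as in the fragment below (image tag and shard count are placeholders). Each such shard can still expose millions of series, which is why hashing only on __address__ may co-locate several of these heavy targets on one Prometheus shard.

```yaml
# Deployment fragment (illustrative): 4 kube-state-metrics shards.
containers:
  - name: kube-state-metrics
    image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.0  # placeholder tag
    args:
      - --shard=0          # 0..total-shards-1, one value per shard replica
      - --total-shards=4
```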

On the other hand, with hashmod alone every shard still does full service discovery, which can waste a lot of memory, especially when the cluster is large. Having only the Coordinator do service discovery saves 50% of the Prometheus shards' memory usage in our case.

If any shard (all of its replicas) goes down, the Coordinator can reassign its targets to a healthy shard immediately.

The Coordinator knows everything about service discovery and has the correct result of /api/v1/targets; it also knows the distribution of all targets. We will add some APIs and expose metrics for debugging in the next release.

RayHuangCN commented 3 years ago

@bwplotka Thanks! I think adding a page similar to https://thanos.io/tip/operating/reverse-proxy.md/ is a good idea. Could you please tell me where to add it? (-:

bwplotka commented 3 years ago

I would add a page like use-cases.md in the operating menu. WDYT? (:

RayHuangCN commented 3 years ago

That sounds good! 👍 Anything I can help with?

bwplotka commented 3 years ago

Of course! You are more than welcome to create a PR with such a markdown page, with your use case stated as one. Then the team & community can add more items, both for the basic use cases and the more advanced ones :hugs:

RayHuangCN commented 3 years ago

Ok 👌.

RayHuangCN commented 3 years ago

@bwplotka The PR was created (-: