Open saswatamcode opened 6 months ago
Even though I'm a huge fan of Thanos, and it is almost always my first choice, I think that by adding more Thanos components to the Prometheus Operator we start to become very opinionated about how users should scale their monitoring system.
I might be wrong, but I think the idea behind the Observatorium project is to be this opinionated project that unites different projects, e.g. Prometheus Operator + Thanos Operator + any other components, to implement observability.
So IMO Prometheus Operator is already kind of opinionated on how you should scale your monitoring with Thanos, as it does ship certain Thanos components, and many maintainers are active contributors to Thanos, who often use Thanos in their day jobs. 🙂
But my concern is that including other Thanos components would greatly increase the responsibilities of Prometheus Operator and make it very tricky to configure overall. It would end up changing its scope entirely, to something like "Scalable Monitoring Operator".
However, if we were to keep the projects separate yet closely knit, and delegate certain things to each other, that might make it more user-friendly. So if you need single cluster monitoring use Prometheus Operator, but if you need to scale, just federate from Querier by spinning up Thanos Operator on top, and so on!
In any case, I think once this operator gains a little bit of maturity, it might make sense to re-evaluate and end up merging, if that is what the community is looking for!
So my suggestion would be to start here and move later on as needed! 🙂
Thinking more about it, my feeling is that for the short term, a dedicated Thanos operator is probably the best approach:
PrometheusRule CRD. The other Thanos components don't have a lot of overlap with the Prometheus operator CRDs, so there's less incentive for them to be managed by the Prometheus operator.
Having said that, it'd be a good idea to keep tight collaboration between the two operators. For instance:
It's all Go code after all so it should always be possible to create a meta-operator bundling all controllers together if this is what people want (it should even be possible to run the ThanosRuler controller from the Prometheus operator in a future Thanos operator).
Yup, I agree 100%! Cross-referencing docs and ensuring CRDs are consistent between the two would be key. ❤️ And this can all be moved in the future.
There is some discussion around this on the Thanos Slack channel: https://cloud-native.slack.com/archives/CL25937SP/p1711457620191659
Pros:
Cons:
Would love to hear your thoughts on this! 🙂