Open rubenvw-ngdata opened 1 year ago
I had some time to work on this, so I tried to implement this functionality myself (but I failed to get it fully working).
See PR https://github.com/grafana/mimir/pull/4858 (I know it is not ready, but sharing it, so you can help me on it)
Thank you for the proposal and the draft PR. I appreciate the time spent. We've been experimenting with the two deployment modes and would like to explore them further as alternatives to microservices mode (maybe even "at scale"). We're not quite there yet, but these deployment modes are also not being deprecated soon.
However, there are some considerations we have to take into account before adding different deployment modes to the helm chart. A couple that come to mind now:
Most of these aren't trivial to answer and there will probably be divided opinions. At the same time we, at Grafana Labs, don't have much visibility into how much read-write or monolithic deployment modes will be used or how much they can scale.
As much as I hate to say it, keeping this functionality in a fork will be more pragmatic as it stands. You can publish the forked chart under a different name and we can track how much usage it gets. With time we can revisit and incorporate the changes in the mimir-distributed chart and share the maintenance efforts.
Hi @dimitarvdimitrov,
Thanks for your answer. I'm a bit disappointed though that you propose to leave it on a fork branch.
The most important reason to use mimir for us (and I don't think we are alone) is to make prometheus HA. With the microservices configuration this comes at a high maintenance level with a very fine grained configuration.
I understand that there are various things that you should think about when embedding it into the product; that's also why this is just a draft.
Have you been able to check the error message I was facing with the monolithic setup? I'm willing to continue, rename the chart and maintain the fork for the time being, but I could use a bit of help debugging the issues that I'm facing (I don't know much about the mimir internals).
With the helm chart we are aiming to make this configuration less of a hassle. The defaults in the chart should work for most users. In addition to that, monolithic and read-write deployments have the same configuration options as microservices. However, I can see how scaling up/out a microservices deployment is more complicated than scaling a monolithic deployment.
I left a comment on the draft PR wrt the "connection refused" error. I'm happy to help with answers when I can.
To add my two cents, since Grafana Loki already has the "read-write" mode and the helm chart for it, I was sort of expecting to be able to deploy Mimir in the same way if it contains the same component architecture (which it does). So I'm wondering whether the considerations listed above are the equivalents of what has already been done in Grafana Loki?
Monolithic mode is a very important (strategic?) deployment model IMO, because it lets us start simple and then increase the complexity if the product fits our needs.
ATM, without the monolithic mode, I don't see myself deploying mimir or tempo in clusters I manage "just for evaluation purposes"… and so I start to look at other tools, even if I already run loki & grafana.
As a user, I don't expect any SLA or validation from this chart flavour, just a parameter to deploy it in "target=all".
@davinkevin If you want to try out mimir in monolithic deployment mode, you can use our fork at https://github.com/NGDATA/mimir. Currently we only do internal releases, so if you want to use it, you will have to take care of the release process yourself.
The more the fork gets used, the more likely it is that this gets embedded in the product.
I like the idea of providing one or more less complex helm chart solutions for mimir. Why? Because we also tried to deploy the current mimir-distributed one and it was really tough to walk through the values.yaml. Sure, the chart probably would have run out of the box, but a) we had to apply some modifications and b) my inner nerd wants to know what I am deploying. And here I didn't even look into the templates.
The complex mimir-distributed helm chart definitely has its use case for larger production deployments. Still, the simpler rollout methods are valuable too: for beginners, but also for scenarios with lower performance requirements.
As the almost 4k-line values.yaml is already overwhelming, I suggest splitting it up into separate helm charts before adding even more complexity (a deployment method) to the existing one. This makes your lives as maintainers easier, and those of the consumers too, because they can decide upfront how sophisticated a helm chart to start with. In fact, they just have to deal with a less complex values.yaml and may understand how the templates work (in case of an issue).
Regarding the sharing of common template functions, you could follow an approach similar to Bitnami's, with a mimir-common helm chart? See https://github.com/bitnami/charts/tree/main/bitnami/common
@rubenvw-ngdata is the fork still maintained?
It is, we are using it without issues. We do not follow all changes that happen on main immediately though. If there is something that is not working for you, let me know.
Having a monolithic deployment for the helm chart would be awesome for the meta-monitoring chart
Is there any update on this? I really don't understand the decision to have the simplescalable variant for loki but not for mimir 😓
This would be a useful feature where Mimir needs to be deployed for testing. We currently test our observability stack in CI and Mimir, even in a minimal distributed setup, consumes a lot of resources.
Loki can easily just run in SingleBinary mode for tests, and I had assumed the two would be configurable in the same way.
Seconded. This would also be useful in small-footprint/homelab deployments.
Thirded. If that is even a word.
This is really a must have.
Fourthed, if that’s even a word. ;-)
I am surprised to see Mimir still only has the microservices mode available to use via the helm chart, unlike the other deployment options that Loki has. Would like to run mimir simple scalable in my lab for testing.
It would definitely be a useful option especially for smaller clusters or just for experimenting with Mimir before a migration from another Prometheus-like tool
Is your feature request related to a problem? Please describe.
There is currently only a helm chart available for the full microservices deployment mode of Grafana Mimir. This is pretty exhaustive and results in a lot of pods. Ideally there would be an alternative to this.
Describe the solution you'd like
A separate helm chart, or a configuration option in the existing chart to select the deployment mode (this could work in a similar way to what is available for loki). Ideally the alternative deployment solution also supports multi-AZ (where we are running one instance in each AZ).
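To make the request a bit more concrete, here is a rough sketch of how a deployment mode option could map to workloads. It is only an illustration under assumptions: the mode names, the workload names and the `plan_workloads` helper are hypothetical and are not options of the mimir-distributed chart. The `-target` values (`all` for monolithic; `read`, `write` and `backend` for read-write) are the ones the Mimir documentation describes for those deployment modes, and "one instance in each AZ" is simplified to one replica per zone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str      # hypothetical Kubernetes workload name
    target: str    # value passed to Mimir's -target flag
    replicas: int  # simplified: one replica per availability zone


# Hypothetical mode names, loosely modeled on the Loki chart's modes.
# Microservices mode is left out because the existing chart covers it.
MODE_TARGETS = {
    "monolithic": ["all"],
    "read-write": ["read", "write", "backend"],
}


def plan_workloads(mode: str, zones: int = 1) -> list[Workload]:
    """Return the workloads a chart would render for the given mode."""
    if mode not in MODE_TARGETS:
        raise ValueError(f"unknown deployment mode: {mode!r}")
    if zones < 1:
        raise ValueError("zones must be at least 1")
    return [
        Workload(name=f"mimir-{target}", target=target, replicas=zones)
        for target in MODE_TARGETS[mode]
    ]


def container_args(workload: Workload) -> list[str]:
    """Build the container arguments that select the Mimir target."""
    return [f"-target={workload.target}"]


if __name__ == "__main__":
    for w in plan_workloads("read-write", zones=3):
        print(w.name, w.replicas, container_args(w))
```

A real chart would also need per-mode defaults for storage, memberlist and zone-aware replication, which this sketch ignores.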
Describe alternatives you've considered
The only alternative now is to run a minimal configuration of the mimir-distributed helm chart.
Additional context
See also previous ticket on grafana/helm-charts: https://github.com/grafana/helm-charts/issues/1189