pyrra-dev / pyrra

Making SLOs with Prometheus manageable, accessible, and easy to use for everyone!
https://demo.pyrra.dev
Apache License 2.0

Grafana Alerting & Mimir support #1192

Open msvechla opened 5 months ago

msvechla commented 5 months ago

Hello and thanks for this awesome project!

At work we have a setup where we use Grafana Mimir as central storage for our metrics. Additionally, we use Grafana Alerting to create alerts in centralised Grafana instances based on data in the central Mimir.

I recognise that this might be a bit of a custom setup. However, I think other people might benefit from support for these tools. I have seen a few other issues about Mimir, for example.

To implement this, I think two features would be required:

Would you be open to supporting these use-cases? I recognise the main scope of this project is Prometheus, but I still think people could benefit from this.

Alternatively, do you have any other recommendations regarding this?

I would be open to creating a PR if there is a chance that these use-cases will be supported.

Thanks a lot!

ArthurSens commented 5 months ago

You might want to take a look at two actions I've done in the past :)

https://github.com/ArthurSens/pyrra-generate-action

msvechla commented 5 months ago

> You might want to take a look at two actions I've done in the past :)
>
> https://github.com/ArthurSens/pyrra-generate-action

Thanks, unfortunately this will not help in my case as far as I understand.

We need the full automation via an operator, as new ServiceLevelObjective resources are created dynamically inside the cluster by different teams.

mzupan commented 5 months ago

@msvechla what you will run into is that Mimir can't load anything from a CRD. It expects the recording rules to be set up via its API.

You'd have to write a controller that watches for servicelevelobjectives.pyrra.dev changes and then syncs them to Mimir if you want to go off the CRD.
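To make the "watch and sync" idea a bit more concrete, here is a minimal sketch of the diffing step such a controller might run on each reconcile. Everything in it is an assumption for illustration: `planSync` is a hypothetical helper, and rule groups are represented as plain name-to-body maps rather than real Pyrra or Mimir types.

```go
package main

import (
	"fmt"
	"sort"
)

// planSync compares the rule groups generated from the SLO objects in the
// cluster (desired) with the rule groups currently stored in the remote
// ruler (current). Both maps go from rule group name to rendered group body.
// It returns the names that need to be created or updated and the names
// that need to be deleted. Hypothetical helper, not part of Pyrra.
func planSync(desired, current map[string]string) (upsert, remove []string) {
	for name, body := range desired {
		if cur, ok := current[name]; !ok || cur != body {
			upsert = append(upsert, name)
		}
	}
	for name := range current {
		if _, ok := desired[name]; !ok {
			remove = append(remove, name)
		}
	}
	// Sort for deterministic output; map iteration order is random in Go.
	sort.Strings(upsert)
	sort.Strings(remove)
	return upsert, remove
}

func main() {
	desired := map[string]string{
		"checkout-availability": "rules-v2",
		"login-latency":         "rules-v1",
	}
	current := map[string]string{
		"checkout-availability": "rules-v1",
		"old-slo":               "rules-v1",
	}
	upsert, remove := planSync(desired, current)
	fmt.Println("upsert:", upsert)
	fmt.Println("delete:", remove)
}
```

A real controller would presumably fill `desired` from the ServiceLevelObjective resources and `current` from the ruler API, then issue one API call per name in each list.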

msvechla commented 5 months ago

> @msvechla what you will run into is that Mimir can't load anything from a CRD. It expects the recording rules to be set up via its API.
>
> You'd have to write a controller that watches for servicelevelobjectives.pyrra.dev changes and then syncs them to Mimir if you want to go off the CRD.

Thanks, I'm aware. This is what I outlined in my post above. I wanted to check whether this could be handled as part of the Pyrra operator.

metalmatze commented 4 months ago

I think I'm happy to try adding this as a possible backend to Pyrra. Actually, I've tried a couple of years back and it's just that setting up a cluster is quite involved.

If you want to contribute something I'm definitely open to it. If you can additionally figure out an easy way to create a Mimir cluster for development with Pyrra, that would be amazing.

mzupan commented 4 months ago

@metalmatze Mimir has a small deployment example for Helm:

https://github.com/grafana/mimir/blob/main/operations/helm/charts/mimir-distributed/small.yaml

I'd just shrink the sizes even more though

I use this with OrbStack to test Mimir. If you have a branch where you start this, I'm happy to write up a doc on getting Mimir and Pyrra set up on a one-node test cluster that you should be able to run on your laptop.

msvechla commented 4 months ago

> I think I'm happy to try adding this as a possible backend to Pyrra. Actually, I've tried a couple of years back and it's just that setting up a cluster is quite involved.
>
> If you want to contribute something I'm definitely open to it. If you can additionally figure out an easy way to create a Mimir cluster for development with Pyrra, that would be amazing.

That's great to hear! I might devote some time to this next week then and report back!

msvechla commented 4 months ago

I started playing around a little bit today on a branch: https://github.com/pyrra-dev/pyrra/compare/main...msvechla:pyrra:mimir_support

It's still very WIP, I just wanted to get something working to find a possible implementation path.

Unfortunately, I did not find a good Mimir API client for Go. I used https://github.com/grafana/mimir/tree/main/pkg/mimirtool/client for now, but this requires us to replace the Prometheus module dependency with the Mimir fork, which is probably not what we want. Do you have any thoughts on this? We could also just write our own small client for the rules API.
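For reference, a hand-written client for the rules API could be quite small. The sketch below is not what the branch does. It assumes the ruler's "set rule group" endpoint is served under the default `/prometheus` prefix and that the tenant is passed in the `X-Scope-OrgID` header. `RulerClient`, `SetRuleGroup`, `demo`, the `pyrra` namespace, and the `team-a` tenant are all made-up names, and a local fake server stands in for Mimir so the example runs on its own.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// RulerClient is a hypothetical minimal client for the ruler's rules API.
type RulerClient struct {
	BaseURL string // address of Mimir (or its gateway)
	Tenant  string // sent as X-Scope-OrgID when non-empty
	HTTP    *http.Client
}

// SetRuleGroup creates or updates one rule group in the given namespace.
// groupYAML is a single rule group in the usual Prometheus YAML format.
func (c *RulerClient) SetRuleGroup(ctx context.Context, namespace string, groupYAML []byte) error {
	// Assumed path: default Prometheus HTTP prefix plus the config API.
	endpoint := c.BaseURL + "/prometheus/config/v1/rules/" + url.PathEscape(namespace)
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, endpoint, bytes.NewReader(groupYAML))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/yaml")
	if c.Tenant != "" {
		req.Header.Set("X-Scope-OrgID", c.Tenant)
	}
	resp, err := c.HTTP.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		body, _ := io.ReadAll(io.LimitReader(resp.Body, 1024))
		return fmt.Errorf("set rule group: unexpected status %d: %s", resp.StatusCode, body)
	}
	return nil
}

// demo runs the client against a local fake server (standing in for Mimir)
// and reports the request path and tenant header the server received.
func demo() (path, tenant string, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		path, tenant = r.URL.Path, r.Header.Get("X-Scope-OrgID")
		w.WriteHeader(http.StatusAccepted)
	}))
	defer srv.Close()

	c := &RulerClient{BaseURL: srv.URL, Tenant: "team-a", HTTP: srv.Client()}
	group := []byte("name: example\nrules: []\n")
	err = c.SetRuleGroup(context.Background(), "pyrra", group)
	return path, tenant, err
}

func main() {
	path, tenant, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("path:", path)
	fmt.Println("tenant:", tenant)
}
```

Listing and deleting groups would follow the same pattern with different methods and paths. The upside of this route is that it only needs the standard library, so the Prometheus module dependency stays untouched.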

msvechla commented 4 months ago

@metalmatze I created a draft PR #1221 that adds support for Mimir, I think we can discuss the details there.

Regarding Grafana Alerting support, I also played around a bit, but this will take a little more effort. We can provision alerting rules via the grafanaalertrulegroups resources of the Grafana Operator.

However, we cannot easily translate the PrometheusRules, because conditions are specified differently in Grafana Alerting. We would need to generate the rules in an intermediary format first, so they can then be converted to either Prometheus or Grafana rules.
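As a rough illustration of what such an intermediary format might look like, here is a sketch where an alert is a list of "query compared against a threshold" clauses. All types and names are hypothetical, and `QueryStep` is deliberately not the Grafana alert rule schema; it only shows that the same data can be rendered either as one PromQL expression or as separate query/threshold steps.

```go
package main

import (
	"fmt"
	"strings"
)

// Condition is one clause: a query whose result is compared to a threshold.
type Condition struct {
	Query     string
	Op        string // comparison operator, e.g. ">"
	Threshold float64
}

// AlertSpec is a hypothetical backend-neutral alert definition.
// The alert fires when all conditions hold.
type AlertSpec struct {
	Name       string
	For        string
	Conditions []Condition
}

// PromExpr renders the conditions as a single PromQL expression,
// joining the clauses with "and".
func (a AlertSpec) PromExpr() string {
	parts := make([]string, 0, len(a.Conditions))
	for _, c := range a.Conditions {
		parts = append(parts, fmt.Sprintf("%s %s %g", c.Query, c.Op, c.Threshold))
	}
	return strings.Join(parts, " and ")
}

// QueryStep keeps query and threshold separate, which is closer to how a
// UI-driven alerting system tends to model conditions. Illustrative only.
type QueryStep struct {
	RefID     string
	Query     string
	Op        string
	Threshold float64
}

// QuerySteps renders one step per condition, with generated reference IDs.
func (a AlertSpec) QuerySteps() []QueryStep {
	steps := make([]QueryStep, 0, len(a.Conditions))
	for i, c := range a.Conditions {
		steps = append(steps, QueryStep{
			RefID:     fmt.Sprintf("Q%d", i+1),
			Query:     c.Query,
			Op:        c.Op,
			Threshold: c.Threshold,
		})
	}
	return steps
}

func main() {
	// Metric names are made up for the example.
	spec := AlertSpec{
		Name: "ErrorBudgetBurn",
		For:  "2m",
		Conditions: []Condition{
			{Query: "example:burnrate5m", Op: ">", Threshold: 0.14},
			{Query: "example:burnrate1h", Op: ">", Threshold: 0.14},
		},
	}
	fmt.Println(spec.PromExpr())
	for _, s := range spec.QuerySteps() {
		fmt.Printf("%s: %s %s %g\n", s.RefID, s.Query, s.Op, s.Threshold)
	}
}
```

Whether the existing rule generation can be restructured along these lines is an open question; the sketch only shows the shape of the idea.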

The PR I opened focuses only on Mimir support for now, so we can create a separate PR or discussion for Grafana Alerting.