grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

[Ruler / Query Frontend] Expose the Ruler API via the Query Frontend #5203

Open sherifkayad opened 2 years ago

sherifkayad commented 2 years ago

Is your feature request related to a problem? Please describe.

With recent versions of Grafana, which can show alerts from multiple sources (e.g. Prometheus and Loki) in the frontend, we were getting this error in Grafana: "Failed to load rules state from Loki: 404 from rule state endpoint. Perhaps ruler API is not enabled?"


The answer to Grafana's question is no, we are sure the Ruler API is enabled 😄. We have a setup where Loki is running in microservices mode, and the Ruler is enabled and configured correctly.

After some digging, we found that the cause of the issue is that the Ruler APIs are missing from the Query Frontend component, and our Loki data source in Grafana points to the Query Frontend.

We confirmed this by launching a test pod and running the following curl commands:

# hitting the ruler service directly
$ curl http://loki-loki-distributed-ruler.logging:3100/prometheus/api/v1/alerts # => returned 200 ok with a valid response
$ curl http://loki-loki-distributed-ruler.logging:3100/prometheus/api/v1/rules # => returned 200 ok with a valid response
$ curl http://loki-loki-distributed-ruler.logging:3100/loki/api/v1/alerts # => returned 200 ok with a valid response
$ curl http://loki-loki-distributed-ruler.logging:3100/loki/api/v1/rules # => returned 200 ok with a valid response

# hitting the query frontend service
$ curl http://loki-loki-distributed-query-frontend.logging:3100/prometheus/api/v1/alerts
404 page not found
$ curl http://loki-loki-distributed-query-frontend.logging:3100/prometheus/api/v1/rules
404 page not found
$ curl http://loki-loki-distributed-query-frontend.logging:3100/loki/api/v1/alerts
404 page not found
$ curl http://loki-loki-distributed-query-frontend.logging:3100/loki/api/v1/rules
404 page not found

Describe the solution you'd like

We would really love to see the rules APIs (/loki/api/v1/rules/*, /loki/api/v1/alerts, /api/prom/rules and /api/prom/alerts, as well as their /prometheus counterparts) become part of the Query Frontend APIs. I think the Query Frontend should act as the main endpoint serving Grafana, and the alerting view is new functionality that Grafana now expects from it.

Describe alternatives you've considered

An alternative would of course be to use a gateway reverse proxy component (e.g. as in the Helm chart https://github.com/grafana/helm-charts/tree/main/charts/loki-distributed) that routes rule and alert requests to the Ruler and everything else to the Query Frontend, and then to use that gateway as the Loki data source URL.

I am not sure that's the right approach, though. To be honest, the gateway component feels unnecessary to me, so I would prefer to have the Query Frontend do what it should do: serve Grafana.
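To make the routing idea concrete, here is a minimal shell sketch of the decision such a gateway would make. It is only an illustration, not the Helm chart's actual gateway configuration: the function name pick_upstream and the two variables are hypothetical, and the service URLs are simply the ones used in the curl commands above.

```shell
#!/bin/sh
# Hypothetical upstreams, copied from the curl commands in this issue.
RULER_URL="http://loki-loki-distributed-ruler.logging:3100"
QUERY_FRONTEND_URL="http://loki-loki-distributed-query-frontend.logging:3100"

# pick_upstream PATH
# Prints the base URL that should serve PATH: the Ruler for rule and
# alert endpoints, the Query Frontend for everything else.
pick_upstream() {
  case "$1" in
    /loki/api/v1/rules*|/loki/api/v1/alerts*) printf '%s\n' "$RULER_URL" ;;
    /api/prom/rules*|/api/prom/alerts*) printf '%s\n' "$RULER_URL" ;;
    /prometheus/api/v1/rules*|/prometheus/api/v1/alerts*) printf '%s\n' "$RULER_URL" ;;
    *) printf '%s\n' "$QUERY_FRONTEND_URL" ;;
  esac
}

# Example: a rules request is routed to the Ruler.
pick_upstream /loki/api/v1/rules
```

A real gateway would express the same prefix matching in its own proxy configuration; the point here is only which paths go where.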

Additional context

Nothing further.

uberbrodt commented 2 years ago

I am experiencing the same behavior on chart version 0.37.3. Is there a workaround?

sherifkayad commented 2 years ago

@uberbrodt the alternative / workaround is, as I described in the issue, to use the gateway proxy component to reverse proxy not only the Query Frontend but also the Ruler APIs to Grafana.

I would love to see the ruler APIs implemented though as part of the query frontend. I hope we can get the opinion of the project maintainers on that issue.

parkedwards commented 2 years ago

yeah we're hitting the same exact behavior - we're using the loki-distributed helm chart for our loki components as well

200s at the ruler level, but 404s at the gateway level

timam commented 2 years ago

I am facing the same issue with loki-distributed. @sherifkayad It would be a great help if you could share your workaround implementation.

sherifkayad commented 2 years ago

@timam @parkedwards As I said, we're using the Loki Distributed Helm Chart (https://github.com/grafana/helm-charts/tree/main/charts/loki-distributed). The workaround is as follows:

Of course the last piece here is to configure the Grafana Loki Datasource URL to point to the Gateway and not to the Query Frontend.

timam commented 2 years ago

Thanks a lot @sherifkayad

KeeganOP commented 2 years ago

This needs fixing... Every single person who enables Grafana 8 alerts and runs Loki in distributed mode will run into this and will have to hack the gateway to be a reverse proxy. Seeing as Grafana 8 alerts will be enabled by default from Grafana 9 onwards, this really should work out of the box, or at most require a flag somewhere.

sherifkayad commented 2 years ago

Again wanted to ask if there is any update from the maintainers on this one? @owen-d @slim-bean

stale[bot] commented 2 years ago

Hi! This issue has been automatically marked as stale because it has not had any activity in the past 30 days.

We use a stalebot among other tools to help manage the state of issues in this project. A stalebot can be very useful in closing issues in a number of cases; the most common is closing issues or PRs where the original reporter has not responded.

Stalebots are also emotionless and cruel and can close issues which are still very relevant.

If this issue is important to you, please add a comment to keep it open. More importantly, please add a thumbs-up to the original issue entry.

We regularly sort for closed issues which have a stale label sorted by thumbs up.

We may also:

We are doing our best to respond, organize, and prioritize all issues but it can be a challenging task, our sincere apologies if you find yourself at the mercy of the stalebot.

alessioga commented 2 years ago

not stale :+1: still happening

sherifkayad commented 2 years ago

Sorry for spamming, but is there any update? @maintainers

dorkamotorka commented 1 year ago

+1

mwikyo commented 7 months ago

+1

gdidok commented 2 months ago

Still happening