slok / sloth

🦥 Easy and simple Prometheus SLO (service level objectives) generator
https://sloth.dev
Apache License 2.0

SLOs in Prometheus #487

Open d2k-klin opened 1 year ago

d2k-klin commented 1 year ago

I have deployed the latest Sloth version (v0.11.0) with Helm.

I have defined a PrometheusServiceLevel:

apiVersion: sloth.slok.dev/v1
kind: PrometheusServiceLevel
metadata:
  name: sloth-example-profile
  namespace: monitoring
  labels:
    prometheus: prometheus
    role: alert-rules
    app: sloth  
spec:
  service: example-profile
  slos:
    - alerting:
        name: example_profile_availability_alert
        pageAlert:
          labels:
            severity: warning
        ticketAlert:
          labels:
            severity: info
      name: example_profile_availability
      objective: 99
      sli:
        events:
          errorQuery: count_over_time((sum(up{namespace="exampleplatform",pod =~ "example-profile.*"})<1)[{{.window}}:])
          totalQuery: count_over_time(sum(up{namespace="exampleplatform",pod =~ "example-profile.*"})[{{.window}}:])

The PrometheusRule is generated successfully:

NAME                    SERVICE           DESIRED SLOS   READY SLOS   GEN OK   GEN AGE   AGE
sloth-example-profile   example-profile   1              1            true     3m19s     99m

But Prometheus is not finding any metrics for 'slo' [image], and without them the Grafana dashboards are just empty.

Am I missing something? Is there an additional step for this?

klubi commented 1 year ago

Maybe Prometheus does not ingest the PrometheusRules because they fall outside the label or namespace selectors it uses for rules.

alexanderjardim commented 1 year ago

I am having the same problem with 0.11. @d2k-klin did you manage to solve the issue?

emdneto commented 1 year ago

@alexanderjardim You're probably missing the label, as @klubi said.

Check the ruleSelector in your Prometheus config to confirm that the label you're using matches it.
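
For reference, here is a minimal sketch of where those selectors live, assuming Prometheus is managed by prometheus-operator through a Prometheus custom resource. The resource name and namespace are placeholders, and the labels are simply the ones from the PrometheusServiceLevel above; it is also an assumption here that the generated PrometheusRule carries those same labels, so compare against the rule object in your cluster.

```yaml
# Hypothetical Prometheus custom resource; adjust name and namespace to your setup.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus        # placeholder name
  namespace: monitoring   # placeholder namespace
spec:
  # Only PrometheusRule objects carrying all of these labels are loaded.
  ruleSelector:
    matchLabels:
      prometheus: prometheus
      role: alert-rules
  # An empty selector matches rules in every namespace. If this field is
  # left unset, only the namespace of the Prometheus object is searched.
  ruleNamespaceSelector: {}
```

If the generated PrometheusRule lacks any label listed under ruleSelector, or lives in a namespace the namespace selector does not cover, Prometheus never loads it and the SLO recording rules produce no series. Helm-based installs often set a different selector (for example, one keyed on the release name), so the values above are illustrative only.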