globocom / slo-generator

Easily set up a service level objective using Prometheus
MIT License

Add simple algorithm draft #8

Closed by wpjunior 4 years ago

wpjunior commented 4 years ago

This simple algorithm will cover the first four alert methods.

Example:

slos:
  - name: myteam-a.service-a
    objectives:
      availability: 99
      latency:
      - le: 5.0 # 95% < 5s
        target: 95

      - le: 10.0 # 97% < 10s
        target: 97
    labels:
      slack_channel: '_team_a'
      platform: myplatform
    annotations:
      message: Service A Error Budget consumption
      link: https://grafana.myservice.com/URL

    trafficRateRecord:
      expr: |
        sum (rate(http_requests_total{job="service-a"}[$window]))
    errorRateRecord:
      alertMethod: simple
      alertWindow: 30m
      alertWait: 10m
      expr: |
        sum (rate(http_requests_total{job="service-a", status="5xx"}[$window])) /
        sum (rate(http_requests_total{job="service-a"}[$window]))
    latencyRecord:
      alertMethod: simple
      alertWindow: 30m
      alertWait: 10m
      expr: |
        sum (rate(http_request_duration_seconds_bucket{job="service-a", le="$le"}[$window])) /
        sum (rate(http_requests_total{job="service-a"}[$window]))
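To make the placeholders in the example concrete, here is a minimal Python sketch of how the `$window` and `$le` variables in an `expr` could be filled in, and how a percentage objective maps to an allowed failure ratio. The helper names `render_expr` and `error_budget` are hypothetical, and using the `alertWindow` value (30m) as the window is an assumption. This is not how slo-generator itself renders its rules.

```python
from string import Template
from typing import Optional

# Expressions copied from the example config above.
ERROR_EXPR = (
    'sum (rate(http_requests_total{job="service-a", status="5xx"}[$window])) /\n'
    'sum (rate(http_requests_total{job="service-a"}[$window]))'
)
LATENCY_EXPR = (
    'sum (rate(http_request_duration_seconds_bucket{job="service-a", le="$le"}[$window])) /\n'
    'sum (rate(http_requests_total{job="service-a"}[$window]))'
)


def render_expr(expr: str, window: str, le: Optional[str] = None) -> str:
    """Hypothetical helper: fill the $window and $le placeholders of an expr."""
    values = {"window": window}
    if le is not None:
        values["le"] = le
    # safe_substitute leaves any placeholder without a value untouched
    # instead of raising KeyError.
    return Template(expr).safe_substitute(values)


def error_budget(target_percent: float) -> float:
    """Hypothetical helper: fraction of requests allowed to miss the target."""
    return round(1 - target_percent / 100, 6)


if __name__ == "__main__":
    # Assumption: the 30m alertWindow is used as the rate() window.
    print(render_expr(ERROR_EXPR, "30m"))
    print(render_expr(LATENCY_EXPR, "30m", le="5.0"))
    # availability: 99 allows 1% of requests to fail.
    print(error_budget(99))
    # latency target 95 for le=5.0 allows 5% of requests to take longer than 5s.
    print(error_budget(95))
```

Under these assumptions, the `availability: 99` objective corresponds to an error ratio of at most 0.01, and the `le: 5.0` latency objective corresponds to at least 0.95 of requests landing in the 5-second bucket.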

@pedrokiefer and @vierno, could you review my merge request?

freeseacher commented 4 years ago

Up?

freeseacher commented 4 years ago

@wpjunior Wow! Can you please update the docs and example to reflect the changes?

wpjunior commented 4 years ago

Sure @freeseacher, I've updated the README.md. If you have any questions, please ask me.

Thanks.