pyrra-dev / pyrra

Making SLOs with Prometheus manageable, accessible, and easy to use for everyone!
https://demo.pyrra.dev
Apache License 2.0

Allow providing an error rate query rather than an error count query #1170

Open LukeDAtkinson opened 2 months ago

LukeDAtkinson commented 2 months ago

We are scraping AWS CloudWatch metrics into Prometheus, and we want to define SLOs using these metrics.

AWS CloudWatch reports the error rate as a proportion of total requests. The total number of errors can be calculated by multiplying this rate by the total number of requests, i.e.

aws_cloudfront_5xx_error_rate_average * increase(aws_cloudfront_requests_sum_count[5m])

However, when we try to use such a multiplication of two metrics in the error field of the ratio indicator, it causes errors in Pyrra. It seems Pyrra expects to be able to parse a single metric expression from this field.
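To illustrate why the multiplication matters, here is a minimal Python sketch of the arithmetic, not of anything Pyrra does internally. The sample values and the function names (`error_counts`, `error_ratio`, `naive_mean_rate`) are hypothetical, and the rate is assumed to be a fraction between 0 and 1; if the exporter reports a percentage, it would first need dividing by 100.

```python
# Hypothetical samples, one per scrape window:
# (error rate as a fraction of requests, number of requests in that window).
windows = [(0.10, 100), (0.00, 900)]


def error_counts(samples):
    """Estimate the error count per window as rate * requests."""
    return [rate * requests for rate, requests in samples]


def error_ratio(samples):
    """Request-weighted error ratio: total errors / total requests."""
    total_requests = sum(requests for _, requests in samples)
    if total_requests == 0:
        return 0.0  # no traffic, so no errors to report
    return sum(error_counts(samples)) / total_requests


def naive_mean_rate(samples):
    """Unweighted mean of the per-window rates, shown for comparison."""
    return sum(rate for rate, _ in samples) / len(samples)


if __name__ == "__main__":
    print("weighted ratio:", error_ratio(windows))
    print("naive mean rate:", naive_mean_rate(windows))
```

With these made-up numbers, the busy window dominates the weighted ratio, while the unweighted mean treats both windows equally. That is presumably why a ratio indicator wants error and total counts rather than a pre-divided rate.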

Would it be possible to either add an errorRate field to the ratio indicator or provide a different type of indicator, and implement the necessary calculation of the total errors in Pyrra to handle this case?