jkblume opened this issue 2 years ago (status: Open)
Hi @jkblume
It's been a while since you started testing sloth. Do you still have the same doubts as on the day you created the issue? Do your numbers make more sense now that sloth has been running for a while?
Numbers explanation:
892%: The service is running with a 91.38% SLI against a target of 99% (a 1% error rate equals 100% of the error budget).
99.6%: How much error budget is remaining since the 1st of the current month.
NaN: How much error budget is remaining in the 30-day window ending now (if sloth was just set up, there is not enough data).
Best,
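The arithmetic behind "1% error is a 100% error budget" can be sketched in a few lines. This is only an illustrative model, not Sloth's actual recording rules or dashboard queries; the function names `burn_rate` and `remaining_budget` are hypothetical. Under this simplified model, a 91.38% SLI against a 99% objective burns budget at roughly 8.6 times the allowed rate; the exact figures on the dashboard depend on the windows its panels use.

```python
def burn_rate(sli: float, objective: float) -> float:
    """Observed error ratio divided by the allowed error ratio (1.0 == 100%)."""
    error_budget = 1.0 - objective  # e.g. 0.01 for a 99% objective
    error_ratio = 1.0 - sli         # e.g. 0.0862 for a 91.38% SLI
    return error_ratio / error_budget


def remaining_budget(sli: float, objective: float) -> float:
    """Fraction of the budget left for the window the SLI covers.

    The result goes negative once more than 100% of the budget is spent.
    """
    return 1.0 - burn_rate(sli, objective)


# Values taken from the explanation above: 91.38% SLI, 99% objective.
rate = burn_rate(sli=0.9138, objective=0.99)
left = remaining_budget(sli=0.9138, objective=0.99)
print(f"burn rate: {rate:.0%}, remaining budget: {left:.0%}")
```

A burn rate above 100% means errors arrive faster than the objective allows, which is also why a remaining-budget panel can show a negative percentage.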
Hi @slok,
Probably digging up an old grave here, but I just stumbled across this and would like to know if you can help me understand your statement:
Remaining error budget (30d window) NaN: since now, 30 days how much error budget is remaining (if just set there is not enough data).
What I understand from this is: if there is any discontinuity in the data, or no data is available at some point, the metric will show NaN. Alternatively, if sloth has only just been set up and at least the last 30 days of data are not available, we would also see NaN.
The reason I am asking is that I am not quite able to interpret my dashboard, which shows a negative remaining budget percentage as well as some NaNs.
Thanks a bunch for this awesome project. susenj
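A rough way to picture the two symptoms mentioned here (negative percentages and NaN) is to compute the remaining budget from raw event counts. The sketch below is an assumption-laden illustration, not Sloth's implementation: the helper `remaining_budget_from_counts` is hypothetical, and it models only one possible source of NaN, namely a window with no events, where the error ratio becomes a 0/0 division (which PromQL evaluates to NaN).

```python
import math


def remaining_budget_from_counts(bad: int, total: int, objective: float) -> float:
    """Remaining error budget (1.0 == 100%) for one window of events."""
    if total == 0:
        # No events in the window: the error ratio would be 0/0.
        # PromQL yields NaN for that division, so mirror it here.
        return math.nan
    error_ratio = bad / total
    error_budget = 1.0 - objective
    return 1.0 - error_ratio / error_budget


# Empty window: not enough data to say anything.
print(remaining_budget_from_counts(bad=0, total=0, objective=0.99))
# Half of the budget used: 0.5% errors against a 1% budget.
print(remaining_budget_from_counts(bad=5, total=1000, objective=0.99))
# Budget overspent: 5% errors against a 1% budget gives a negative value.
print(remaining_budget_from_counts(bad=50, total=1000, objective=0.99))
```

In this model a negative value is not a data problem: it simply means the window contains more errors than the objective allows.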
Hi there, thanks for the work on this project. It helps a lot in translating SLO knowledge into the very complex Prometheus queries!
I'm wondering how the error budget is calculated and I can't find any documentation on the project page or the queries.
Does the service have to run for 30 days and then the number of requests is taken, which I need in some form to show the current consumption of the budget? I'm wondering because I can't really figure out the numbers displayed in the dashboard (see image). But it could also be because I am currently only testing the project in a test project, which has only been running for a few hours.