newrelic / nr1-slo-r

NR1 SLO-R allows you to define, calculate and report on service-level objective (SLO) attainment.
https://discuss.newrelic.com/t/track-your-service-level-objectives-with-the-slo-r-nerdpack/90046
Apache License 2.0

Calculation Blackout Periods #46

Open ricegi opened 4 years ago

ricegi commented 4 years ago

The ability to specify a blackout period for an SLO definition so that known downtimes will be excluded from the calculations of SLO attainment.

Summary

This would let us design SLOs that better reflect the variable aspects of time-based SLO calculation.

Desired Behaviour

There should be a policy dialog in the SLO configuration, with the ability to specify either a recurring policy or a one-off period of time. These should persist with the policy, probably as an array.

Possible Solution

As above: the dialog, plus a "blackouts" section in the slo.json.
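
To make the idea concrete, here is one possible shape for such a "blackouts" array, sketched in TypeScript. Every name in it (`Blackout`, `kind`, `start`, `end`, `daysOfWeek`, `startTime`, `durationMinutes`, `validateBlackout`) is an assumption made for illustration and is not taken from the actual slo.json schema. It only tries to cover the two cases mentioned under Desired Behaviour: a one-off period and a recurring policy.

```typescript
// Hypothetical shape for a "blackouts" array in an SLO definition.
// None of these names come from the real nr1-slo-r schema.

interface OneOffBlackout {
  kind: "one-off";
  start: string; // ISO 8601 timestamp, e.g. "2020-03-06T22:00:00Z"
  end: string;   // ISO 8601 timestamp, must be later than start
}

interface RecurringBlackout {
  kind: "recurring";
  daysOfWeek: number[];    // 0 = Sunday ... 6 = Saturday
  startTime: string;       // "HH:MM", assumed to be UTC
  durationMinutes: number; // length of each occurrence
}

type Blackout = OneOffBlackout | RecurringBlackout;

// Returns a list of problems; an empty list means the entry looks usable.
function validateBlackout(b: Blackout): string[] {
  const problems: string[] = [];
  if (b.kind === "one-off") {
    const start = Date.parse(b.start);
    const end = Date.parse(b.end);
    if (isNaN(start) || isNaN(end)) {
      problems.push("start and end must be valid timestamps");
    } else if (end <= start) {
      problems.push("end must be after start");
    }
  } else {
    if (b.daysOfWeek.length === 0) {
      problems.push("daysOfWeek must not be empty");
    }
    if (b.daysOfWeek.some((d) => Math.floor(d) !== d || d < 0 || d > 6)) {
      problems.push("daysOfWeek entries must be integers 0-6");
    }
    if (!/^([01]\d|2[0-3]):[0-5]\d$/.test(b.startTime)) {
      problems.push("startTime must be HH:MM");
    }
    if (!(b.durationMinutes > 0)) {
      problems.push("durationMinutes must be positive");
    }
  }
  return problems;
}

// Example entries as they might be persisted with an SLO definition.
const exampleBlackouts: Blackout[] = [
  { kind: "one-off", start: "2020-03-06T22:00:00Z", end: "2020-03-06T23:00:00Z" },
  { kind: "recurring", daysOfWeek: [0], startTime: "02:00", durationMinutes: 30 },
];
```

A tagged union keeps the one-off and recurring cases in a single array, which matches the "array of them" idea above, and lets the dialog validate each entry before saving it.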

Additional context

Just want to make the most useful configurator evah!

jeffrey-hines commented 4 years ago

We have operating hours of 7am-8pm; anything outside this window would not be included in the SLO.

norbertsuski commented 4 years ago

@ricegi @tangollama is it enough to give the user a time picker for setting the blackout hours? Or should this be more complex, allowing more than one blackout period to be defined? Could you please provide a use case for this?

tangollama commented 4 years ago

@norbertsuski there are two use cases for this. The first is more straightforward. The second is more complex and is based on what @jeffrey-hines is driving at.

  1. I intentionally took down a system for maintenance last Friday for 1 hour. I want to be able to specify that time period as "out of scope" towards my SLO attainment goals. (i.e. don't penalize myself for an intentional, scheduled, presumably contractually valid maintenance window). That's the goal of this issue.
  2. @jeffrey-hines is referencing a very different idea - that there are operating hours for a compute service outside of which the SLO attainment is invalid. Supporting that feature would be far more involved, and - to be honest - outside of my view of what is universally valuable in a cloud computing service. Let's move that request to a new issue.
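
The first use case can be sketched as interval arithmetic: remove the blackout from both the measurement window and the recorded downtime, then compute attainment over what remains. The TypeScript below is a minimal sketch under that assumption, not how SLO-R actually calculates attainment; `Interval`, `mergeIntervals`, `overlapMs`, and `attainmentPercent` are hypothetical names, and intervals are `[start, end)` in epoch milliseconds.

```typescript
// Illustrative only: a time-based attainment calculation that ignores blackouts.
interface Interval {
  start: number; // inclusive, epoch milliseconds
  end: number;   // exclusive, epoch milliseconds
}

// Sort and merge overlapping intervals so no time is counted twice.
function mergeIntervals(intervals: Interval[]): Interval[] {
  const sorted = intervals
    .filter((i) => i.end > i.start)
    .map((i) => ({ start: i.start, end: i.end }))
    .sort((a, b) => a.start - b.start);
  const merged: Interval[] = [];
  for (const cur of sorted) {
    const last = merged[merged.length - 1];
    if (last !== undefined && cur.start <= last.end) {
      last.end = Math.max(last.end, cur.end);
    } else {
      merged.push(cur);
    }
  }
  return merged;
}

// Total time covered by both sets of intervals.
function overlapMs(a: Interval[], b: Interval[]): number {
  let total = 0;
  for (const x of mergeIntervals(a)) {
    for (const y of mergeIntervals(b)) {
      total += Math.max(0, Math.min(x.end, y.end) - Math.max(x.start, y.start));
    }
  }
  return total;
}

// Attainment as a percentage of in-scope time, or null if the whole
// window is blacked out and there is nothing to measure.
function attainmentPercent(
  window: Interval,
  downtime: Interval[],
  blackouts: Interval[]
): number | null {
  const inScopeMs = window.end - window.start - overlapMs([window], blackouts);
  if (inScopeMs <= 0) return null;
  // Clip downtime to the window, then discard the part inside blackouts.
  const clipped = mergeIntervals(
    downtime.map((d) => ({
      start: Math.max(d.start, window.start),
      end: Math.min(d.end, window.end),
    }))
  );
  const downInWindowMs = overlapMs(clipped, [window]);
  const downInScopeMs = downInWindowMs - overlapMs(clipped, blackouts);
  return 100 * (1 - downInScopeMs / inScopeMs);
}

// Example: a 7-day window with a 1-hour planned maintenance and a
// 30-minute unplanned outage. Only the maintenance is blacked out.
const HOUR_MS = 60 * 60 * 1000;
const week: Interval = { start: 0, end: 168 * HOUR_MS };
const maintenance: Interval = { start: 100 * HOUR_MS, end: 101 * HOUR_MS };
const outage: Interval = { start: 10 * HOUR_MS, end: 10.5 * HOUR_MS };

const withBlackout = attainmentPercent(week, [maintenance, outage], [maintenance]);
const withoutBlackout = attainmentPercent(week, [maintenance, outage], []);
```

In this example the blacked-out hour is dropped from both the numerator and the denominator, so `withBlackout` reflects only the 30-minute outage over 167 in-scope hours, while `withoutBlackout` counts 1.5 hours of downtime over the full 168.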

@jeffrey-hines, could you perhaps explain the use case you're highlighting further in a new issue?

norbertsuski commented 4 years ago

So the first point looks pretty similar to this one: https://github.com/newrelic/nr1-slo-r/issues/16 ?

ricegi commented 4 years ago

I was thinking of adding this to the SLO configuration - like a section to handle blackout periods in the calculation.