checkly / public-roadmap

Checkly public roadmap. All planned features, updates and tweaks.
https://checklyhq.com

Adjustable check frequency upon failed requests #298

Open peter-dolkens opened 1 year ago

peter-dolkens commented 1 year ago

Is your feature request related to a problem? Please describe.

Some services are impacted by intermittent/transient disruptions which last longer than the inbuilt retry policy, but are often resolved before notifications reach our team.

We would like a way to distinguish an ongoing outage from a transient error.

The current auto-retry policy makes its second attempt too soon after the first, so the retry is often hit by the same transient error.

This feature would potentially also allow more accurate tracking of time-based metrics, as check frequency could be increased during outages to get more accurate duration information.

Describe the solution you'd like

I'd like to be able to have 2 distinct check rates - one for when everything is working as expected, and a separate one for when the check is in a fail state.
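
To illustrate the request, here is a minimal sketch of a two-rate schedule. It is not an existing Checkly setting or API: `CheckSchedule`, `healthy_interval_s`, `failing_interval_s` and `simulate` are hypothetical names, and the 300 s / 30 s values are only assumed for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckSchedule:
    """Hypothetical two-rate schedule; not an existing Checkly option."""
    healthy_interval_s: int = 300  # assumed rate while the check passes
    failing_interval_s: int = 30   # assumed faster rate while the check fails

    def next_interval(self, last_check_passed: bool) -> int:
        """Return the delay before the next run, based on the last result."""
        if last_check_passed:
            return self.healthy_interval_s
        return self.failing_interval_s


def simulate(schedule: CheckSchedule, results: List[bool]) -> List[int]:
    """Return the timestamps (in seconds) at which each check would run."""
    timestamps = []
    now = 0
    for passed in results:
        timestamps.append(now)
        now += schedule.next_interval(passed)
    return timestamps
```

With these assumed values, a pass followed by two failures and a recovery would run at 0 s, 300 s, 330 s and 360 s, so the recovery is observed within 30 seconds instead of up to 5 minutes, which is also what would tighten the outage-duration measurement.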

peter-dolkens commented 1 year ago

CC @ebuna as this was your idea

tnolet commented 1 year ago

@peter-dolkens we are tackling this in a larger Alerting V2 project later this year. I'm adding this ticket to the overarching one https://github.com/orgs/checkly/projects/4/views/4?pane=issue&itemId=21238722

On the specific topics you mentioned: they all make sense to me. Please also give this one an upvote, as I think it covers a lot of what you mentioned: https://github.com/checkly/public-roadmap/issues/208