Feature Request: check retry interval when in a non-ok state

sensu / sensu-go

Simple. Scalable. Multi-cloud monitoring.

https://sensu.io

MIT License

Feature Request: check retry interval when in a non-ok state #2925

Open rgeniesse opened 5 years ago

rgeniesse commented 5 years ago

Expected Behavior

After a check goes into a non-OK state, the check interval should be able to increase. In a sense, this gives you more confidence that there is actually an issue going on.
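
A minimal sketch of the requested behavior, assuming a hypothetical `retry_interval` setting that does not exist in Sensu Go today. The function name and parameters are illustrative only. It relies on the convention that a check status of 0 means OK and any other value is non-OK.

```python
from typing import Optional

OK = 0  # check status 0 means OK; any other status is treated as non-OK


def next_interval(status: int, interval: int,
                  retry_interval: Optional[int] = None) -> int:
    """Return the seconds to wait before the next check execution.

    `retry_interval` is a hypothetical setting used for illustration.
    """
    if status != OK and retry_interval is not None:
        # Non-OK state: switch to the retry interval.
        return retry_interval
    # OK state, or no retry interval configured: keep the normal interval.
    return interval
```

With `interval=300` and `retry_interval=60`, a check in status 2 would next run after 60 seconds and return to 300 seconds once it reports status 0 again.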

Current Behavior

The check interval is static unless you use either outside configuration management or a separate check/check hook that calls the API to increase the interval and then decrease it again once the issue is resolved.
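
The API-based workaround could look roughly like the sketch below. It only builds the update request and does not send it. The URL layout (`/api/core/v2/namespaces/<namespace>/checks/<name>`), the `Key` authorization scheme, and the helper names are assumptions for illustration and should be verified against the API documentation for your Sensu Go version.

```python
import copy
import json
import urllib.request


def with_interval(check: dict, interval: int) -> dict:
    """Return a copy of a check definition with a new interval (seconds)."""
    updated = copy.deepcopy(check)
    updated["interval"] = interval
    return updated


def build_update_request(base_url: str, check: dict,
                         api_key: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that replaces the check.

    Assumption: the check is addressed by namespace and name under
    /api/core/v2, and the API accepts an API key in the Authorization header.
    """
    meta = check["metadata"]
    url = (f"{base_url}/api/core/v2/namespaces/"
           f"{meta['namespace']}/checks/{meta['name']}")
    return urllib.request.Request(
        url,
        data=json.dumps(check).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Key {api_key}",
        },
        method="PUT",
    )
```

A hook on a non-OK result would build the request with the changed interval and send it with `urllib.request.urlopen`, and a second hook would restore the original interval once the check resolves.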

Possible Solution

Add the feature!

Context

See https://discourse.sensu.io/t/sensu-check-different-retry-interval-on-failure/1152 for the source of the FR.

I believe Nagios has a similar concept.

majormoses commented 5 years ago

I have seen this request come up multiple times in sensu ruby and now in sensu-go. I think it makes a lot of sense, since you want to know in a timely manner once the check returns to a healthy state, even if the regular interval is long. There are some performance considerations: if, say, a core networking component suddenly fails, you now have n checks running at shorter intervals, which increases load.
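
To put rough numbers on the load concern, here is a small sketch. The check counts and intervals are made up for illustration, and the helper name is hypothetical.

```python
def executions_per_minute(num_checks: int, interval_seconds: int) -> float:
    """Check executions per minute for checks sharing one interval."""
    return num_checks * 60 / interval_seconds


# Hypothetical fleet: 200 checks that normally run every 60 seconds.
normal_load = executions_per_minute(200, 60)   # 200.0 executions per minute

# If all 200 fail at once and switch to a 10-second retry interval:
retry_load = executions_per_minute(200, 10)    # 1200.0 executions per minute
```

In this made-up case, a single upstream failure multiplies the execution rate by six until the checks recover.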

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

nicolasbrechet commented 4 years ago

Hi,

This is a feature that would be very interesting for me... Going from Icinga to Sensu...

Any chance it could be considered for development?

majormoses commented 4 years ago

I am not an employee of Sensu, but I would love this feature myself. It is a small but very welcome quality-of-life improvement for Sensu.