sensu-plugins / sensu-plugins-cpu-checks

This plugin provides native CPU instrumentation for monitoring and metrics collection, including: CPU usage and metrics for user, nice, system, idle, iowait, irq, softirq, steal, and guest.
http://sensu-plugins.io
MIT License

[BREAKING CHANGES] update default `sleep` for `check-cpu.rb` #36

majormoses closed 6 years ago

majormoses commented 6 years ago

The goal of this change is to report more accurate overall CPU utilization. The way we calculate utilization is to poll the current values, sleep n seconds, poll again, and then take the difference between the two samples. In tests at my last organization I found that increasing the sleep time gave more accurate overall results. There are several reasons for this, but the main one is that the very act of starting the Ruby interpreter incurs additional CPU utilization. Increasing the poll interval lets that initial interpreter load even out over the sample window. 5 is by no means a magic number, but at my previous employer I found it a very reasonable compromise between accuracy and still being able to schedule the check often enough to see very granular results. The more resource-constrained the machine is, the more value you will get out of higher sleep/poll values.
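The sample, sleep, sample, diff approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in `check-cpu.rb`: it assumes a Linux `/proc/stat` aggregate `cpu` line, and the names `CPU_FIELDS`, `parse_cpu_line`, `cpu_percentages`, and `sample_usage` are hypothetical.

```ruby
# Field order of the aggregate "cpu" line in /proc/stat (Linux).
CPU_FIELDS = %i[user nice system idle iowait irq softirq steal guest guest_nice].freeze

# Fields that add up to total elapsed time. On Linux, guest and guest_nice are
# already counted inside user and nice, so they are left out of the total.
TOTAL_FIELDS = %i[user nice system idle iowait irq softirq steal].freeze

# Parse the aggregate "cpu ..." line into a hash of field => jiffies.
def parse_cpu_line(stat_text)
  line = stat_text.each_line.find { |l| l.start_with?('cpu ') }
  raise ArgumentError, 'no aggregate cpu line found' unless line

  values = line.split.drop(1).map(&:to_f)
  # Older kernels report fewer columns; treat missing ones as 0.
  CPU_FIELDS.zip(values).map { |field, value| [field, value || 0.0] }.to_h
end

# Percentage of the elapsed jiffies spent in each state between two samples.
def cpu_percentages(before, after)
  deltas = CPU_FIELDS.map { |f| [f, after[f] - before[f]] }.to_h
  total = TOTAL_FIELDS.map { |f| deltas[f] }.sum
  return CPU_FIELDS.map { |f| [f, 0.0] }.to_h if total.zero?

  deltas.map { |f, d| [f, (100.0 * d / total).round(2)] }.to_h
end

# Take two samples sleep_seconds apart and return the per-state percentages.
# A longer sleep spreads the interpreter's startup cost over a wider window.
def sample_usage(sleep_seconds = 5, path = '/proc/stat')
  before = parse_cpu_line(File.read(path))
  sleep(sleep_seconds)
  after = parse_cpu_line(File.read(path))
  cpu_percentages(before, after)
end
```

On a Linux host, `sample_usage(5)` would return a hash such as `{ user: ..., idle: ... }` covering the five-second window. The parsing and diffing are kept as pure functions so they can be exercised with canned `/proc/stat` text.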

This is a breaking change because it updates the defaults: if you schedule this check more frequently than every 10 seconds (and do not set your own sleep value), you could very easily end up with multiple instances of the same check running at once, which would only make the problem worse.

In my local tests I did not notice much difference (about 2-3% inflated) between the data returned by this check and the values from `top -d 5`.

Signed-off-by: Ben Abrams me@benabrams.it

Pull Request Checklist

closes #35

General

Test:

$ bundle exec ./bin/check-cpu.rb 
CheckCPU TOTAL OK: total=17.51 user=14.21 nice=0.0 system=3.12 idle=82.46 iowait=0.03 irq=0.0 softirq=0.18 steal=0.0 guest=0.0 guest_nice=0.0

Purpose

Provide a better default that produces a smoother picture of CPU utilization by increasing the poll interval.

Known Compatibility Issues

This changes the default poll/sleep value from 1 to 5. If you are scheduling your checks more often than every 5 seconds, this could make the problem even worse, as runs could start overlapping because of the scheduling. Most people I have seen check no more than once every 10 seconds, and they should be able to upgrade safely without modification.
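To make the overlap concern concrete, here is a rough back-of-the-envelope sketch. It assumes each run lasts about the sleep value plus some startup overhead, and that the scheduler starts a new run every interval regardless of whether the previous one has finished. The helper name `max_concurrent_runs` and the 0.5 second overhead are illustrative guesses, not measurements or part of the plugin.

```ruby
# Estimate how many instances of the check could be alive at the same time.
#   interval      - seconds between scheduled runs
#   sleep_seconds - the check's sleep/poll value
#   overhead      - assumed interpreter startup and teardown cost (a guess)
def max_concurrent_runs(interval, sleep_seconds, overhead = 0.5)
  runtime = sleep_seconds + overhead
  (runtime / interval.to_f).ceil
end

max_concurrent_runs(10, 5) # one run at a time: each run finishes before the next starts
max_concurrent_runs(5, 5)  # runs begin to overlap once overhead is included
max_concurrent_runs(5, 1)  # the old default of 1 leaves plenty of headroom
```

Under these assumptions, any interval comfortably larger than the sleep value keeps the estimate at one. Otherwise, passing a smaller sleep value to the check restores the headroom.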

majormoses commented 6 years ago

released: https://rubygems.org/gems/sensu-plugins-cpu-checks/versions/3.0.0