thomas-krenn / check_lsi_raid

Monitoring plugin to check MegaRAID controllers
GNU General Public License v3.0

No possibility to ignore ALL other errors - only a specific count #11

Closed flohoff closed 7 years ago

flohoff commented 7 years ago

Hi, we are having trouble with a controller/disk combination that increments the Other Error Count every time it is queried. With a check interval of 5 minutes, that adds 288 other errors per day on that disk.

root@files02:~# storcli /c0/e25/s3 show all | grep Other
Other Error Count = 864
root@files02:~# storcli /c0/e25/s3 show all | grep Other 
Other Error Count = 865
root@files02:~# storcli /c0/e25/s3 show all | grep Other
Other Error Count = 866 
root@files02:~# storcli /c0/e25/s3 show all | grep Other
Other Error Count = 867
root@files02:~# storcli /c0/e25/s3 show all | grep Other
Other Error Count = 868
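
The 288-per-day figure follows from one increment per check at a 5-minute interval (24 * 60 / 5). A minimal shell sketch of that arithmetic, plus one way to pull the counter out of the storcli output, might look like this. The function names `parse_other_error_count` and `increments_per_day` are hypothetical and not part of check_lsi_raid or storcli, and the parser assumes the output line has the `Other Error Count = N` shape shown above.

```shell
# Hypothetical helpers, not part of check_lsi_raid or storcli.

# Read storcli output on stdin and print the numeric "Other Error Count".
# Assumes a line of the form "Other Error Count = 868".
parse_other_error_count() {
    awk -F'= *' '/Other Error Count/ { print $2 + 0; exit }'
}

# Print how many increments accumulate per day if every check adds one.
# $1 = check interval in minutes
increments_per_day() {
    echo $(( 24 * 60 / $1 ))
}
```

For example, `increments_per_day 5` prints 288, and piping `storcli /c0/e25/s3 show all` into `parse_other_error_count` should print the current counter value.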

When I discovered the other errors, I set the threshold to -Io 800, which bought some time over the weekend; now the error has come up again.

It would be nice to be able to ignore all other errors with -Io 0 or -Io -1 or something similar.

Flo

tk-wfischer commented 7 years ago

Hi Flo,

thank you for reporting your issue.

Maybe changing line 619 (https://github.com/thomas-krenn/check_lsi_raid/blob/master/check_lsi_raid#L619) could help. Use

if(($IGNERR_O != -1) && ($PD->{'Other Error Count'} > $IGNERR_O)){

instead of the current

if($PD->{'Other Error Count'} > $IGNERR_O){

In that way, setting -Io -1 should fix your issue.
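
To illustrate the intended behavior, here is a minimal shell sketch of the same sentinel logic. The plugin itself is written in Perl, so this is only an analogy and not the plugin's code. The function name `other_errors_exceeded` is hypothetical, and -1 is assumed to mean "ignore all other errors", as proposed.

```shell
# Hypothetical helper mirroring the proposed condition; not plugin code.
# $1 = current "Other Error Count"
# $2 = threshold passed via -Io (-1 is assumed to disable the check)
# Returns 0 (true) only when the count should be reported.
other_errors_exceeded() {
    # Skip the comparison entirely when the sentinel -1 is given;
    # otherwise report only counts strictly above the threshold.
    [ "$2" -ne -1 ] && [ "$1" -gt "$2" ]
}
```

With a count of 868, `other_errors_exceeded 868 800` succeeds (the count is above the threshold), while `other_errors_exceeded 868 -1` fails, so the counter would be ignored no matter how high it grows.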

Could you give this a try?

If it works, I'll include this in the code.

Best regards, Werner

flohoff commented 7 years ago

The proposed change works.

gschoenberger commented 7 years ago

Done in commit 4b768805a82a2a94f8736d5ac2fbfd7c3222065a. THX!