v-zhuravlev / zbx-smartctl

Templates and scripts for monitoring disks health with Zabbix and smartmontools
https://share.zabbix.com/storage-devices/smartmontools/smart-monitoring-with-smartmontools-lld
GNU General Public License v3.0

uHDD.health[*] doesn't detect failure with zabbix 3.4 #38

sabelka closed this issue 7 years ago

sabelka commented 7 years ago

When smartctl -H detects a drive failure, it returns an exit status greater than 0 (see the RETURN VALUES section of man smartctl). The Zabbix agent treats this as a command failure and returns ZBX_NOTSUPPORTED to the server.

Possible fix: change the UserParameter configuration so that the command always returns a zero exit status, e.g. with

UserParameter=uHDD.health[*],sudo smartctl -H $1 || true

instead of

UserParameter=uHDD.health[*],sudo smartctl -H $1
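
To illustrate why the `|| true` suffix helps, here is a minimal sketch that runs without a failing disk. `fake_smartctl` is a hypothetical stand-in (not part of smartmontools) that prints a health line and exits with status 24, as in the failure case reported in this issue.

```shell
# Hypothetical stand-in for `smartctl -H` on a failing drive:
# prints the health line and exits with a non-zero status.
fake_smartctl() {
    echo "SMART overall-health self-assessment test result: FAILED!"
    return 24
}

# Without the guard, the non-zero status reaches the caller
# (for a UserParameter, the caller is the Zabbix agent).
status_plain=0
fake_smartctl >/dev/null || status_plain=$?

# With `|| true`, the output is still produced but the status is forced to 0.
fake_smartctl >/dev/null || true
status_guarded=$?

echo "plain=$status_plain guarded=$status_guarded"
```

The trade-off is that `|| true` also hides real command failures (for example, a device that cannot be opened), so the item then has to rely on parsing the text output alone.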

v-zhuravlev commented 7 years ago

Thanks! Can you share the STDOUT output of a failed test?

sabelka commented 7 years ago

Sure! Here you go:

smartctl -H /dev/sdc

smartctl 6.2 2017-02-27 r4394 [x86_64-linux-3.10.0-693.2.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   009   009   010    Pre-fail  Always   FAILING_NOW 7461

echo $?

24
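
The exit status above is a bitmask. As a rough sketch, the helper below (`decode_smartctl_status` is a hypothetical name, not a smartmontools tool) splits a status into its set bits, with the bit meanings paraphrased from the RETURN VALUES section of man smartctl.

```shell
# Decode a smartctl exit status into its set bits.
# Bit meanings are paraphrased from `man smartctl`, section RETURN VALUES.
decode_smartctl_status() {
    status=$1
    bit=0
    while [ "$bit" -le 7 ]; do
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            case $bit in
                0) echo "bit 0: command line did not parse" ;;
                1) echo "bit 1: device open failed or device is in a low-power mode" ;;
                2) echo "bit 2: a SMART or ATA command failed, or checksum error" ;;
                3) echo "bit 3: SMART status check returned DISK FAILING" ;;
                4) echo "bit 4: prefail attributes found at or below threshold" ;;
                5) echo "bit 5: attributes were at or below threshold in the past" ;;
                6) echo "bit 6: device error log contains errors" ;;
                7) echo "bit 7: device self-test log contains errors" ;;
            esac
        fi
        bit=$((bit + 1))
    done
}

decode_smartctl_status 24
```

For 24 (8 + 16) this reports bits 3 and 4, which matches the FAILED! result and the FAILING_NOW pre-fail attribute in the output. Bits 0 to 2 signal that the command itself did not run properly, while the higher bits describe the disk's health.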

v-zhuravlev commented 7 years ago

See the agent3.4 branch: https://github.com/v-zhuravlev/zbx-smartctl/tree/agent3.4. The Windows UserParameters probably need to be updated too, and this needs testing.

v-zhuravlev commented 7 years ago

https://support.zabbix.com/browse/ZBX-12594