nobody43 / zabbix-smartmontools

Disk SMART monitoring for Linux, FreeBSD and Windows. LLD, trapper.
The Unlicense

scripts work within seconds but zabbix agent reports timeout #52

Open bschwand opened 3 months ago

bschwand commented 3 months ago

Describe the problem: I am setting up zabbix-smartmontools on FreeBSD 14 with Zabbix 7, and the agent always reports a timeout when executing the script, although Timeout is already set to 30 and the script runs fast in testing.

To Reproduce:
- Install as explained in README.md.
- Change the paths to Python and the config files in all scripts and configuration files (/usr/local/etc/zabbix7 ...).
- Run the test commands detailed in README.md on the client and on the server (this succeeds and returns in 8 seconds).
- Timeout is set to 30 in both the agent and the server conf files.

Expected behavior: The disks' SMART info is discovered. Instead, this appears in the agent log file:

15372:20240620:212435.750 Failed to execute command " PATH=/usr/local/sbin:/usr/local/bin "/usr/local/etc/zabbix7/scripts/smartctl-lld.py" "get" "Zabbix server"": Timeout while executing a shell script.

Provide all outputs described in Testing step: client.log, get.log, getverb.log


redchairman commented 3 months ago

I have received your email, thank you for your cooperation!

bschwand commented 3 months ago

I got a little further debugging this. I tried setting DISK_DEVS_MANUAL = ['/dev/da0 -d auto'], and in that case I was able to test the discovery rule in Zabbix, specifying the agent IP and port. Zabbix was then able to discover the lone drive. So it seems there is indeed a timeout when discovering all the drives, even though I already changed the timeout on both server and agent. Even with the timeout set to 30 seconds, I get the error well before those 30 seconds have passed. It's as if Zabbix is timing out much earlier.
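For anyone comparing the script's real run time against the timeout values discussed here, a minimal Python sketch follows. It is not part of zabbix-smartmontools: the helper names `time_command` and `fits_within` are hypothetical, and the 3-second entry in the candidate list is an assumed default that should be checked against your own installation. The 15 and 30 second values are the ones mentioned in this thread.

```python
import subprocess
import time


def time_command(cmd, timeout_s):
    """Run cmd (a list of arguments) and return (elapsed_seconds, timed_out).

    The child is killed if it runs longer than timeout_s seconds.
    """
    start = time.monotonic()
    timed_out = False
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        timed_out = True
    return time.monotonic() - start, timed_out


def fits_within(elapsed_s, candidate_timeouts=(3, 15, 30)):
    """Map each candidate timeout (seconds) to whether elapsed_s fits in it.

    The value 3 is an assumed default, not something confirmed in this thread.
    """
    return {t: elapsed_s < t for t in candidate_timeouts}
```

A possible use is `elapsed, timed_out = time_command(["/usr/local/etc/zabbix7/scripts/smartctl-lld.py"], 60)` followed by `print(fits_within(elapsed))`. Run it as the same user as the agent, because permissions and PATH can change the timing. With the 8 seconds reported above, the script would fit within 15 and 30 seconds but not within 3.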

nobody43 commented 3 months ago

Hi. How many disks do you have?

@redchairman Please translate

bschwand commented 3 months ago

OK, I solved it. The Timeout configuration parameter in the conf files for the server and the agent had no effect here. Instead, the setting has to be changed through the Zabbix UI: in Data Collection -> Templates, bring up the smartmontools template, click on its discovery element, then on SMART disk discovery; that brings up the configuration for the discovery rule. Change Timeout -> Override to 15s (for me). Also click on the "Timeouts" link next to that box; this goes to a general timeout configuration page, where the "Scripts" timeout should also be changed to 15s.

Now it all works.

Thanks for this great template !

bschwand commented 3 months ago

> Hi. How many disks do you have?
>
> @redchairman Please translate

It's an auto-reply saying thanks for the comment... might as well delete that comment.

bschwand commented 3 months ago

I'll hijack my own question here: how does one switch from device mode to serial? I'd rather see the serial numbers, as that is what I use as GPT labels (including slot numbers, etc.), and device names can change across reboots...

nobody43 commented 3 months ago

Alright, glad you solved it. Apparently it's related to the newest Zabbix Server version; I'll investigate it further.

> how does one switch from device mode to serial ?

First, Unlink and clear the template from the host, then change MODE in the local file to serial, and finally reassign the template in the Web UI.
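As a rough illustration of the middle step, the edit in the local copy of smartctl-lld.py might look like the fragment below. The exact quoting and the surrounding lines are assumptions, since the script itself is not quoted in this thread; only the setting name MODE and the two mode names, device and serial, come from the discussion.

```python
# Hypothetical excerpt from the top of a local smartctl-lld.py.
# Only the name MODE and the values 'device' / 'serial' come from this thread;
# the quoting style is assumed.
MODE = 'serial'  # was 'device'; name disks by serial number instead of device path
```

The unlink-and-clear step before this change, and the reassignment after it, still have to be done in the Web UI as described above.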

bschwand commented 3 months ago

Ah, of course, I missed that in the script. It was too obvious, I guess. Maybe add to the README something like "choose serial or device mode by changing the MODE setting at the top of smartctl-lld.py"; it was not clear how to choose the mode :-)