nobody43 / zabbix-smartmontools

Disk SMART monitoring for Linux, FreeBSD and Windows. LLD, trapper.
The Unlicense

LLD doesn't seem to work with zabbix 3.4.1 #5

Closed dizzy2 closed 7 years ago

dizzy2 commented 7 years ago

Low-level discovery with smartmontools doesn't seem to work on a fresh install of the latest Zabbix, 3.4.1. After eliminating all other sources of problems (SELinux rules), only a single value from smartmontools appears in the Zabbix data: Template Configuration Status -> CONFIGURED. Other values (device parameters, SMART values, ...) don't make it to the Zabbix server. I tried starting the LLD Python script manually (getverb parameter): it managed to collect all the parameters from the SMART utilities and sent them to the Zabbix server, but the server refused all of them except one (status). The template discovery rule doesn't seem to work for some reason. Maybe a recent Zabbix version changed the behavior of template discovery, but I am not sure about the details. Please let me know if You need any further information.

Thanks R

nobody43 commented 7 years ago

Thank you for the information. The thing is, discovery cannot be triggered by hand or by the client; it only runs from the Zabbix server. After that, all other items will work. By default the interval is 6 hours; you can temporarily decrease it for testing in: template -> Discovery -> SMART disk discovery -> Update interval. Please make sure that LLD runs first; you can track it in /var/log/zabbix/zabbix_server.log. Check back if you still have problems.

dizzy2 commented 7 years ago

Hi, thank You for Your reply. The problem is more complicated, I'm afraid. The update interval was one of the first things I tried: I set it down to 1 minute. The server log looks fine (trapper seems to be running):
17803:20170901:080148.510 server #23 started [trapper #1]
17806:20170901:080148.510 server #25 started [trapper #3]
17805:20170901:080148.511 server #24 started [trapper #2]
17807:20170901:080148.511 server #26 started [trapper #4]
17808:20170901:080148.512 server #27 started [trapper #5]

But now I am seeing something disturbing on the client side. The following log entries appear there (yes, the JSON is really incomplete). Could this be the problem, or is it just truncated by the logging engine?

15528:20170901:100234.575 { "data": [ { "{#DDRIVESTATUS}": "sda" }, { "{#DSERIAL}": "sda" }, { "{#DNAME}": "sda" }, { "{#DMODEL}": "sda" }, { "{#DFIRMWARE}": "sda" }, { "{#DCAPACITY}": "sda" }, { "{#DSELFTEST}": "sda" }, { "{#DSMARTSTATUS}": "sda" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Raw_Read_Error_Rate", "{#SMARTID}": "1" }, { "{#DVALUE5}": "sda" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Power_On_Hours", "{#SMARTID}": "9" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Power_Cycle_Count", "{#SMARTID}": "12" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Unknown_Attribute", "{#SMARTID}": "100" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Unknown_Attribute", "{#SMARTID}": "101" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Unknown_Attribute", "{#SMARTID}": "170" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Unknown_Attribute", "{#SMARTID}": "171" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Unknown_Attribute", "{#SMARTID}": "172" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Unknown_Attribute", "{#SMARTID}": "174" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Program_Fail_Count_Chip", "{#SMARTID}": "175" }, { "{#DVALUE}": "sda", "{#SMARTNAME}": "Erase_Fail_Count_Chip", "{#SMARTID}": "176" }, { "{#DVALUE}": "sda", "{#SMARTN 15528:20170901:101234.263 { "data": [ { "{#DDRIVESTATUS}": "sda" ....
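One quick way to tell a payload that is merely cut off in the log from one that is incomplete at the source is to try parsing it. The following is a minimal sketch, assuming the text after the log timestamp has been copied into a string; `is_complete_json` is a hypothetical helper name, and the two sample strings are shortened versions of the entry above.

```python
import json


def is_complete_json(text):
    """Return True if text parses as a JSON document, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A shortened LLD payload in the same shape as the log entry above.
complete = '{ "data": [ { "{#DNAME}": "sda" }, { "{#DVALUE}": "sda", "{#SMARTID}": "9" } ] }'
# The same payload cut off mid-key, the way it appears in the agent log.
truncated = '{ "data": [ { "{#DNAME}": "sda" }, { "{#DVALUE}": "sda", "{#SMARTN'

print(is_complete_json(complete))
print(is_complete_json(truncated))
```

If the text from the log fails this check while the script's own output parses cleanly when run by hand, the truncation probably happens in logging rather than in the script.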

nobody43 commented 7 years ago

Please provide the version of the Zabbix agent. Also, what error are you seeing in host - Discovery - SMART disk discovery - Info?

dizzy2 commented 7 years ago

Zabbix agent version is 3.4.1 (revision 71734). As for Your second question, there seems to be no "Info" page under the discovery rule definition (at least I can't find one). The system reports no error, neither in the two logs nor anywhere in the Zabbix app. There is just a single item entry, smartctl.info[ConfigStatus], with the value "CONFIGURED".

nobody43 commented 7 years ago

Anything similar? http://i.imgur.com/bguBydB.png Info from zabbix_server.log would help.

dizzy2 commented 7 years ago

well, not quite: http://imgur.com/a/3ru7O

Server log dump follows:
17769:20170901:080148.435 Starting Zabbix Server. Zabbix 3.4.1 (revision 71734).
17769:20170901:080148.435 ** Enabled features **
17769:20170901:080148.435 SNMP monitoring: YES
17769:20170901:080148.435 IPMI monitoring: YES
17769:20170901:080148.435 Web monitoring: YES
17769:20170901:080148.435 VMware monitoring: YES
17769:20170901:080148.435 SMTP authentication: YES
17769:20170901:080148.435 Jabber notifications: YES
17769:20170901:080148.435 Ez Texting notifications: YES
17769:20170901:080148.435 ODBC: YES
17769:20170901:080148.435 SSH2 support: YES
17769:20170901:080148.436 IPv6 support: YES
17769:20170901:080148.436 TLS support: YES
17769:20170901:080148.436 **
17769:20170901:080148.436 using configuration file: /etc/zabbix/zabbix_server.conf
17769:20170901:080148.453 current database version (mandatory/optional): 03040000/03040000
17769:20170901:080148.453 required mandatory version: 03040000
17769:20170901:080148.494 server #0 started [main process]
17773:20170901:080148.495 server #1 started [configuration syncer #1]
17774:20170901:080148.495 server #2 started [alerter #1]
17775:20170901:080148.495 server #3 started [alerter #2]
17776:20170901:080148.496 server #4 started [alerter #3]
17777:20170901:080148.496 server #5 started [housekeeper #1]
17778:20170901:080148.496 server #6 started [timer #1]
17779:20170901:080148.497 server #7 started [http poller #1]
17780:20170901:080148.497 server #8 started [discoverer #1]
17783:20170901:080148.497 server #9 started [history syncer #1]
17784:20170901:080148.499 server #10 started [history syncer #2]
17787:20170901:080148.500 server #12 started [history syncer #4]
17789:20170901:080148.501 server #13 started [escalator #1]
17786:20170901:080148.502 server #11 started [history syncer #3]
17790:20170901:080148.502 server #14 started [proxy poller #1]
17794:20170901:080148.505 server #17 started [poller #1]
17799:20170901:080148.505 server #21 started [poller #5]
17793:20170901:080148.505 server #16 started [task manager #1]
17796:20170901:080148.506 server #19 started [poller #3]
17798:20170901:080148.507 server #20 started [poller #4]
17795:20170901:080148.507 server #18 started [poller #2]
17801:20170901:080148.509 server #22 started [unreachable poller #1]
17803:20170901:080148.510 server #23 started [trapper #1]
17806:20170901:080148.510 server #25 started [trapper #3]
17810:20170901:080148.510 server #29 started [alert manager #1]
17805:20170901:080148.511 server #24 started [trapper #2]
17807:20170901:080148.511 server #26 started [trapper #4]
17809:20170901:080148.512 server #28 started [icmp pinger #1]
17792:20170901:080148.512 server #15 started [self-monitoring #1]
17808:20170901:080148.512 server #27 started [trapper #5]
17811:20170901:080148.513 server #30 started [preprocessing manager #1]
17814:20170901:080149.114 server #33 started [preprocessing worker #3]
17812:20170901:080149.114 server #31 started [preprocessing worker #1]
17813:20170901:080149.114 server #32 started [preprocessing worker #2]
17777:20170901:083148.737 executing housekeeper
17777:20170901:083148.901 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.156187 sec, idle for 1 hour(s)]
17777:20170901:093149.376 executing housekeeper
17777:20170901:093149.392 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.008245 sec, idle for 1 hour(s)]
17777:20170901:103149.870 executing housekeeper
17777:20170901:103149.885 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.008052 sec, idle for 1 hour(s)]
17777:20170901:113150.352 executing housekeeper
17777:20170901:113150.367 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.008124 sec, idle for 1 hour(s)]
17777:20170901:123150.846 executing housekeeper
17777:20170901:123150.861 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.008094 sec, idle for 1 hour(s)]
17777:20170901:133151.336 executing housekeeper
17777:20170901:133151.351 housekeeper [deleted 0 hist/trends, 0 items, 0 events, 0 problems, 0 sessions, 0 alarms, 0 audit items in 0.008264 sec, idle for 1 hour(s)]

nobody43 commented 7 years ago

That page screen please http://i.imgur.com/R5Jjwb6.png

dizzy2 commented 7 years ago

http://imgur.com/a/8PCiB

nobody43 commented 7 years ago

I will set up the environment and try to fix this issue, within a week maybe. For now, please try to catch the specific LLD reply in zabbix_server.log. It may require changing the log level in the server config.

dizzy2 commented 7 years ago

Hi again - set debug level up to max (5) and got following entries: 21876:20170901:214334.341 trapper got '{"request":"sender data","data":[{"host":"atlas","key":"smartctl.info[sda,DriveStatus]","value":"PROCESSED"},{"host":"atlas","key":"smartctl.info[sda,serial]","value":"50026B766901C962"},{"host":"atlas","key":"smartctl.info[sda,device]","value":"sda"},{"host":"atlas","key":"smartctl.info[sda,model]","value":"KINGSTON SUV400S37120G"},{"host":"atlas","key":"smartctl.info[sda,firmware]","value":"0C3J96R9"},{"host":"atlas","key":"smartctl.info[sda,capacity]","value":"120034123776"},{"host":"atlas","key":"smartctl.info[sda,selftest]","value":"PASSED"},{"host":"atlas","key":"smartctl.info[sda,SmartStatus]","value":"PRESENT"},{"host":"atlas","key":"smartctl.value[sda,1]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,5]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,9]","value":"6442"},{"host":"atlas","key":"smartctl.value[sda,12]","value":"7"},{"host":"atlas","key":"smartctl.value[sda,100]","value":"2601424"},{"host":"atlas","key":"smartctl.value[sda,101]","value":"845840"},{"host":"atlas","key":"smartctl.value[sda,170]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,171]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,172]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,174]","value":"6"},{"host":"atlas","key":"smartctl.value[sda,175]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,176]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,177]","value":"1563"},{"host":"atlas","key":"smartctl.value[sda,178]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,180]","value":"709"},{"host":"atlas","key":"smartctl.value[sda,183]","value":"39"},{"host":"atlas","key":"smartctl.value[sda,187]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,194]","value":"31"},{"host":"atlas","key":"smartctl.value[sda,195]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,196]","value":"0"},{"host":"atlas","key":"smartctl.
value[sda,197]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,199]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,201]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,204]","value":"0"},{"host":"atlas","key":"smartctl.value[sda,231]","value":"6"},{"host":"atlas","key":"smartctl.value[sda,233]","value":"4832"},{"host":"atlas","key":"smartctl.value[sda,234]","value":"4827"},{"host":"atlas","key":"smartctl.value[sda,241]","value":"4002"},{"host":"atlas","key":"smartctl.value[sda,242]","value":"589"},{"host":"atlas","key":"smartctl.value[sda,250]","value":"0"},{"host":"atlas","key":"smartctl.info[sdb,DriveStatus]","value":"PROCESSED"},{"host":"atlas","key":"smartctl.info[sdb,serial]","value":"WD-WCAZA4674461"},{"host":"atlas","key":"smartctl.info[sdb,device]","value":"sdb"},{"host":"atlas","key":"smartctl.info[sdb,model]","value":"WDC WD20EARS-00MVWB0"},{"host":"atlas","key":"smartctl.info[sdb,firmware]","value":"51.0AB51"},{"host":"atlas","key":"smartctl.info[sdb,capacity]","value":"2000398934016"},{"host":"atlas","key":"smartctl.info[sdb,selftest]","value":"PASSED"},{"host":"atlas","key":"smartctl.info[sdb,SmartStatus]","value":"PRESENT"},{"host":"atlas","key":"smartctl.value[sdb,1]","value":"3"},{"host":"atlas","key":"smartctl.value[sdb,3]","value":"1041"},{"host":"atlas","key":"smartctl.value[sdb,4]","value":"222"},{"host":"atlas","key":"smartctl.value[sdb,5]","value":"0"},{"host":"atlas","key":"smartctl.value[sdb,7]","value":"0"},{"host":"atlas","key":"smartctl.value[sdb,9]","value":"47797"},{"host":"atlas","key":"smartctl.value[sdb,10]","value":"0"},{"host":"atlas","key":"smartctl.value[sdb,11]","value":"0"},{"host":"atlas","key":"smartctl.value[sdb,12]","value":"220"},{"host":"atlas","key":"smartctl.value[sdb,192]","value":"147"},{"host":"atlas","key":"smartctl.value[sdb,193]","value":"2516516"},{"host":"atlas","key":"smartctl.value[sdb,194]","value":"36"},{"host":"atlas","key":"smartctl.value[sdb,196]","value":"0"},{"host":"atlas","key"
:"smartctl.value[sdb,197]","value":"0"},{"host":"atlas","key":"smartctl.value[sdb,198]","value":"0"},{"host":"atlas","key":"smartctl.value[sdb,199]","value":"353"},{"host":"atlas","key":"smartctl.value[sdb,200]","value":"12"},{"host":"atlas","key":"smartctl.info[sdc,DriveStatus]","value":"PROCESSED"},{"host":"atlas","key":"smartctl.info[sdc,serial]","value":"WD-WCAZA9811934"},{"host":"atlas","key":"smartctl.info[sdc,device]","value":"sdc"},{"host":"atlas","key":"smartctl.info[sdc,model]","value":"WDC WD20EARS-00MVWB0"},{"host":"atlas","key":"smartctl.info[sdc,firmware]","value":"51.0AB51"},{"host":"atlas","key":"smartctl.info[sdc,capacity]","value":"2000398934016"},{"host":"atlas","key":"smartctl.info[sdc,selftest]","value":"PASSED"},{"host":"atlas","key":"smartctl.info[sdc,SmartStatus]","value":"PRESENT"},{"host":"atlas","key":"smartctl.value[sdc,1]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,3]","value":"4583"},{"host":"atlas","key":"smartctl.value[sdc,4]","value":"188"},{"host":"atlas","key":"smartctl.value[sdc,5]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,7]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,9]","value":"42807"},{"host":"atlas","key":"smartctl.value[sdc,10]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,11]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,12]","value":"186"},{"host":"atlas","key":"smartctl.value[sdc,192]","value":"134"},{"host":"atlas","key":"smartctl.value[sdc,193]","value":"2419470"},{"host":"atlas","key":"smartctl.value[sdc,194]","value":"36"},{"host":"atlas","key":"smartctl.value[sdc,196]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,197]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,198]","value":"0"},{"host":"atlas","key":"smartctl.value[sdc,199]","value":"799"},{"host":"atlas","key":"smartctl.value[sdc,200]","value":"0"},{"host":"atlas","key":"smartctl.info[sdd,DriveStatus]","value":"PROCESSED"},{"host":"atlas","key":"smartctl.info[sdd,serial]","value":"WD-
WCC5D3ZULH00"},{"host":"atlas","key":"smartctl.info[sdd,device]","value":"sdd"},{"host":"atlas","key":"smartctl.info[sdd,model]","value":"WDC WD4001FFSX-68JNUN0"},{"host":"atlas","key":"smartctl.info[sdd,firmware]","value":"81.00A81"},{"host":"atlas","key":"smartctl.info[sdd,capacity]","value":"4000787030016"},{"host":"atlas","key":"smartctl.info[sdd,rpm]","value":"7200"},{"host":"atlas","key":"smartctl.info[sdd,selftest]","value":"PASSED"},{"host":"atlas","key":"smartctl.info[sdd,SmartStatus]","value":"PRESENT"},{"host":"atlas","key":"smartctl.value[sdd,1]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,3]","value":"8250"},{"host":"atlas","key":"smartctl.value[sdd,4]","value":"43"},{"host":"atlas","key":"smartctl.value[sdd,5]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,7]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,9]","value":"17387"},{"host":"atlas","key":"smartctl.value[sdd,10]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,11]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,12]","value":"43"},{"host":"atlas","key":"smartctl.value[sdd,16]","value":"9801814056083"},{"host":"atlas","key":"smartctl.value[sdd,183]","value":"4"},{"host":"atlas","key":"smartctl.value[sdd,192]","value":"32"},{"host":"atlas","key":"smartctl.value[sdd,193]","value":"3772"},{"host":"atlas","key":"smartctl.value[sdd,194]","value":"43"},{"host":"atlas","key":"smartctl.value[sdd,196]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,197]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,198]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,199]","value":"0"},{"host":"atlas","key":"smartctl.value[sdd,200]","value":"0"},{"host":"atlas","key":"smartctl.info[sde,DriveStatus]","value":"ERR_CODE_64"},{"host":"atlas","key":"smartctl.info[sde,serial]","value":"WD-WCC13A7TYE9C"},{"host":"atlas","key":"smartctl.info[sde,device]","value":"sde"},{"host":"atlas","key":"smartctl.info[sde,model]","value":"WDC 
WD4000F9YZ-09N20L0"},{"host":"atlas","key":"smartctl.info[sde,firmware]","value":"01.01A01"},{"host":"atlas","key":"smartctl.info[sde,capacity]","value":"4000787030016"},{"host":"atlas","key":"smartctl.info[sde,rpm]","value":"7200"},{"host":"atlas","key":"smartctl.info[sde,selftest]","value":"PASSED"},{"host":"atlas","key":"smartctl.info[sde,SmartStatus]","value":"PRESENT"},{"host":"atlas","key":"smartctl.value[sde,1]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,3]","value":"8708"},{"host":"atlas","key":"smartctl.value[sde,4]","value":"63"},{"host":"atlas","key":"smartctl.value[sde,5]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,7]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,9]","value":"23151"},{"host":"atlas","key":"smartctl.value[sde,10]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,11]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,12]","value":"58"},{"host":"atlas","key":"smartctl.value[sde,183]","value":"1790"},{"host":"atlas","key":"smartctl.value[sde,192]","value":"39"},{"host":"atlas","key":"smartctl.value[sde,193]","value":"23"},{"host":"atlas","key":"smartctl.value[sde,194]","value":"41"},{"host":"atlas","key":"smartctl.value[sde,196]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,197]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,198]","value":"0"},{"host":"atlas","key":"smartctl.value[sde,199]","value":"3"},{"host":"atlas","key":"smartctl.value[sde,200]","value":"0"},{"host":"atlas","key":"smartctl.info[ConfigStatus]","value":"CONFIGURED"}]}' 21876:20170901:214334.341 In recv_senderhistory() 21876:20170901:214334.341 In process_client_history_data() 21876:20170901:214334.342 In parse_history_data() 21876:20170901:214334.343 End of parse_history_data():SUCCEED processed:144/144 21876:20170901:214334.344 In substitute_simple_macros() data:EMPTY 21876:20170901:214334.344 In process_history_data() 21876:20170901:214334.344 In zbx_preprocess_item_value() 21876:20170901:214334.344 End of 
zbx_preprocess_item_value() 21876:20170901:214334.344 In zbx_ipc_socket_open() 21876:20170901:214334.344 End of zbx_ipc_socket_open():SUCCEED 21876:20170901:214334.344 In zbx_ipc_socket_write() 21882:20170901:214334.344 In ipc_service_accept() 21876:20170901:214334.344 End of zbx_ipc_socket_write():SUCCEED 21882:20170901:214334.344 In ipc_service_add_client() 21876:20170901:214334.345 End of process_history_data() processed:1 21882:20170901:214334.345 End of ipc_service_add_client() clientid:11 21876:20170901:214334.345 End of process_client_history_data():SUCCEED 21882:20170901:214334.345 End of ipc_service_accept():SUCCEED 21876:20170901:214334.345 In zbx_send_response() 21882:20170901:214334.345 End of zbx_ipc_service_recv():2 21876:20170901:214334.345 zbx_send_response() '{"response":"success","info":"processed: 1; failed: 143; total: 144; seconds spent: 0.003241"}' 21882:20170901:214334.345 In zbx_ipc_service_recv() timeout:1 21876:20170901:214334.345 End of zbx_send_response():SUCCEED 21882:20170901:214334.345 zbx_ipc_service_recv() code:2 size:80 data:d2 6e 00 00 00 00 00 00 | 00 00 00 00 00 00 01 86 | d4 a9 59 bb 5b 7e 14 01 | 00 00 00 00 00 00 00 00 | 00 00 00 00 00 00 00 00 | 00 00 00 00 00 00 00 00 | 00 00 00 00 0b 00 00 00 | 43 4f 4e 46 49 47 55 52 | 45 44 00 00 00 00 00 08 | 00 00 00 00 00 00 00 00 21876:20170901:214334.346 End of recv_senderhistory() 21882:20170901:214334.346 End of zbx_ipc_service_recv():1 21882:20170901:214334.346 In preprocessor_add_request() 21882:20170901:214334.346 In preprocessor_sync_configuration() 21876:20170901:214334.346 __zbx_zbx_setproctitle() title:'trapper #2 [processed data in 0.004720 sec, waiting for connection]' 21882:20170901:214334.346 End of preprocessor_sync_configuration() item config size: 26, history cache size: 24

Does it help?
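The most telling line in this dump is the trapper response: "processed: 1; failed: 143; total: 144". The server parsed all 144 values but accepted only one, which would fit the maintainer's earlier point that the other items only exist once discovery has run. A minimal sketch for pulling those counters out of the response text follows; `parse_sender_info` is a hypothetical helper, and the pattern is taken from the single response line in this log, so other versions may format it differently.

```python
import re

# Pattern for the "info" field of the trapper response, as seen in the log above.
INFO_RE = re.compile(
    r"processed: (\d+); failed: (\d+); total: (\d+); seconds spent: ([\d.]+)"
)


def parse_sender_info(info):
    """Split the response 'info' string into numeric fields."""
    match = INFO_RE.search(info)
    if match is None:
        raise ValueError("unrecognized info string: %r" % info)
    processed, failed, total = (int(g) for g in match.groups()[:3])
    return {
        "processed": processed,
        "failed": failed,
        "total": total,
        "seconds": float(match.group(4)),
    }


# The info string copied from the zbx_send_response() line above.
info = "processed: 1; failed: 143; total: 144; seconds spent: 0.003241"
result = parse_sender_info(info)
print(result)
```

A non-zero "failed" count is a quick signal to check whether the discovered items exist on the host before digging into the sender side.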

nobody43 commented 7 years ago

It's useful, but not what I need. It's sender data, but what's needed is what comes just before it (60 sec by default): the LLD JSON or the reaction to it. (edit the serials)

dizzy2 commented 7 years ago

OK, found something else. If it's not what we are looking for, please give me a hint (a keyword, or something I should look for).

21870:20170901:214233.555 In substitute_key_macros() data:'smartctl.discovery[get,{HOST.HOST}]' 21870:20170901:214233.555 In substitute_simple_macros() data:'{HOST.HOST}' 21870:20170901:214233.555 End substitute_simple_macros() data:'atlas' 21871:20170901:214233.555 zbx_zbx_setproctitle() title:'poller #3 [got 0 values in 0.001205 sec, getting values]' 21870:20170901:214233.555 End of substitute_key_macros():SUCCEED data:'smartctl.discovery[get,atlas]' 21871:20170901:214233.556 In get_values() 21870:20170901:214233.556 In substitute_simple_macros() data:'10050' 21871:20170901:214233.556 In DCconfig_get_poller_items() poller_type:0 21872:20170901:214233.556 Sending [system.cpu.util[,user] ] 21870:20170901:214233.556 In get_value() key:'smartctl.discovery[get,{HOST.HOST}]' 21871:20170901:214233.556 End of DCconfig_get_poller_items():0 21870:20170901:214233.556 In get_value_agent() host:'atlas' addr:'atlas' key:'smartctl.discovery[get,atlas]' conn:'unencrypted' 21872:20170901:214233.556 get value from agent result: '0.372284' 21871:20170901:214233.556 In DCconfig_get_poller_nextcheck() poller_type:0 21872:20170901:214233.556 End of get_value_agent():SUCCEED 21871:20170901:214233.556 End of DCconfig_get_poller_nextcheck():1504302154 21872:20170901:214233.556 End of get_value():SUCCEED 21871:20170901:214233.557 End of get_values():0 21872:20170901:214233.557 In zbx_activate_item_host() hostid:10252 itemid:28233 type:0 21871:20170901:214233.557 __zbx_zbx_setproctitle() title:'poller #3 [got 0 values in 0.001268 sec, idle 1 sec]' 21872:20170901:214233.557 End of zbx_activate_item_host() 21872:20170901:214233.557 In zbx_preprocess_item_value() 21872:20170901:214233.557 End of zbx_preprocess_item_value() 21872:20170901:214233.557 In zbx_ipc_socket_write() 21872:20170901:214233.557 End of zbx_ipc_socket_write():SUCCEED 21872:20170901:214233.557 End of get_values():1 21882:20170901:214233.557 zbx_ipc_service_recv() code:2 size:78 data:49 6e 00 00 00 00 00 00 | 00 00 00 00 00 
00 01 49 | d4 a9 59 0f 4d 33 21 01 | 00 00 00 00 00 00 00 00 | 00 00 00 00 00 00 00 00 | 00 00 00 00 00 00 00 00 | 00 00 00 00 09 00 00 00 | 30 2e 33 37 32 32 38 34 | 00 00 00 00 00 08 00 00 | 00 00 00 00 00 00 21872:20170901:214233.557 zbx_zbx_setproctitle() title:'poller #4 [got 1 values in 0.003338 sec, idle 1 sec]' 21869:20170901:214233.557 zbx_zbx_setproctitle() title:'poller #1 [got 1 values in 0.005335 sec, getting values]' 21882:20170901:214233.557 End of zbx_ipc_service_recv():1 21870:20170901:214233.557 Sending [smartctl.discovery[get,atlas] ] 21869:20170901:214233.558 In get_values() 21882:20170901:214233.558 In preprocessor_add_request() 21869:20170901:214233.558 In DCconfig_get_poller_items() poller_type:0 21882:20170901:214233.558 In preprocessor_sync_configuration() 21869:20170901:214233.558 End of DCconfig_get_poller_items():0 21882:20170901:214233.558 End of preprocessor_sync_configuration() item config size: 26, history cache size: 24 21869:20170901:214233.558 In DCconfig_get_poller_nextcheck() poller_type:0 21882:20170901:214233.558 In preprocessor_enqueue() itemid: 28233 21869:20170901:214233.558 End of DCconfig_get_poller_nextcheck():1504302154 21882:20170901:214233.558 In preprocessor_enqueue_dependent() itemid: 28233 21869:20170901:214233.558 End of get_values():0 21882:20170901:214233.559 End of preprocessor_enqueue_dependent() 21869:20170901:214233.559 zbx_zbx_setproctitle() title:'poller #1 [got 0 values in 0.001175 sec, idle 1 sec]' 21882:20170901:214233.559 End of preprocessor_enqueue() 21882:20170901:214233.559 In preprocessor_assign_tasks() 21882:20170901:214233.559 In preprocessor_get_queued_item() 21882:20170901:214233.559 End of preprocessor_get_queued_item() 21882:20170901:214233.559 End of preprocessor_assign_tasks() 21882:20170901:214233.559 End of preprocessor_add_request() 21882:20170901:214233.559 In zbx_ipc_service_recv() timeout:1 21881:20170901:214233.846 End of zbx_ipc_service_recv():2 21881:20170901:214233.846 In 
am_db_flush_alert_updates() updates:0 21881:20170901:214233.846 End of am_db_flush_alert_updates():SUCCEED 21881:20170901:214233.846 In am_db_queue_alerts() 21881:20170901:214233.846 In am_db_get_alerts() 21881:20170901:214233.846 query [txnlev:0] [select a.alertid,a.mediatypeid,a.sendto,a.subject,a.message,a.status,a.retries,e.source,e.object,e.objectid from alerts a left join events e on a.eventid=e.eventid where alerttype=0 and a.status=3 order by a.alertid] 21881:20170901:214233.847 End of am_db_get_alerts() alerts:0 21881:20170901:214233.847 End of am_db_queue_alerts():SUCCEED 21881:20170901:214233.848 In zbx_ipc_service_recv() timeout:1 21873:20170901:214233.873 zbx_zbx_setproctitle() title:'poller #5 [got 0 values in 0.000247 sec, getting values]' 21873:20170901:214233.874 In get_values() 21873:20170901:214233.874 In DCconfig_get_poller_items() poller_type:0 21873:20170901:214233.874 End of DCconfig_get_poller_items():0 21873:20170901:214233.874 In DCconfig_get_poller_nextcheck() poller_type:0 21873:20170901:214233.874 End of DCconfig_get_poller_nextcheck():1504302154 21873:20170901:214233.874 End of get_values():0 21873:20170901:214233.874 zbx_zbx_setproctitle() title:'poller #5 [got 0 values in 0.000317 sec, idle 1 sec]' 21866:20170901:214233.985 zbx_zbx_setproctitle() title:'proxy poller #1 [exchanged data with 0 proxies in 0.000180 sec, exchanging data]' 21866:20170901:214233.985 In process_proxy() 21866:20170901:214233.986 In DCconfig_get_proxypoller_hosts() 21866:20170901:214233.986 End of DCconfig_get_proxypoller_hosts():0 21866:20170901:214233.986 End of process_proxy() 21866:20170901:214233.986 In DCconfig_get_proxypoller_nextcheck() 21866:20170901:214233.986 End of DCconfig_get_proxypoller_nextcheck():-1 21866:20170901:214233.986 zbx_zbx_setproctitle() title:'proxy poller #1 [exchanged data with 0 proxies in 0.000187 sec, idle 5 sec]' 21880:20170901:214233.989 zbx_zbx_setproctitle() title:'icmp pinger #1 [getting values]' 21880:20170901:214233.990 
In get_pinger_hosts() 21880:20170901:214233.990 In DCconfig_get_poller_items() poller_type:3 21880:20170901:214233.990 End of DCconfig_get_poller_items():0 21880:20170901:214233.990 End of get_pinger_hosts():0 21880:20170901:214233.990 In process_pinger_hosts() 21880:20170901:214233.990 End of process_pinger_hosts() 21880:20170901:214233.990 In DCconfig_get_poller_nextcheck() poller_type:3 21880:20170901:214233.990 End of DCconfig_get_poller_nextcheck():-1 21880:20170901:214233.990 zbx_zbx_setproctitle() title:'icmp pinger #1 [got 0 values in 0.000255 sec, idle 5 sec]' 21874:20170901:214233.994 __zbx_zbx_setproctitle() title:'unreachable poller #1 [got 0 values in 0.000310 sec, getting values]' 21874:20170901:214233.994 In get_values()

nobody43 commented 7 years ago

A guess: specify one disk manually: diskListManual = [ '/dev/sda' ]

Garincho commented 7 years ago

Hello. I have the same problem.

root@zabbix2:# zabbix_get -s 192.168.3.26 -k smartctl.discovery[get,"Zabbix Server"]
ZBX_NOTSUPPORTED: {
    "data": [
        {
            "{#DDRIVESTATUS}": "sda"
        },
        {
            "{#DSERIAL}": "sda"
        },
        {
            "{#DNAME}": "sda"
        },
        {
            "{#DMODEL}": "sda"
        },
        {
            "{#DFIRMWARE}": "sda"
        },
        {
            "{#DCAPACITY}": "sda"
        },
        {
            "{#DSELFTEST}": "sda"
        },
        {
            "{#DSMARTSTATUS}": "sda"
        },
        {
            "{#SMARTNAME}": "Raw_Read_Error_Rate",
            "{#SMARTID}": "1",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Throughput_Performance",
            "{#SMARTID}": "2",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Spin_Up_Time",
            "{#SMARTID}": "3",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Start_Stop_Count",
            "{#SMARTID}": "4",
            "{#DVALUE}": "sda"
        },
        {
            "{#DVALUE5}": "sda"
        },
        {
            "{#SMARTNAME}": "Seek_Error_Rate",
            "{#SMARTID}": "7",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Seek_Time_Performance",
            "{#SMARTID}": "8",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Power_On_Hours",
            "{#SMARTID}": "9",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Spin_Retry_Count",
            "{#SMARTID}": "10",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Power_Cycle_Count",
            "{#SMARTID}": "12",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Power-Off_Retract_Count",
            "{#SMARTID}": "192",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Load_Cycle_Count",
            "{#SMARTID}": "193",
            "{#DVALUE}": "sda"
        },
        {
            "{#SMARTNAME}": "Temperature_Celsius",
            "{#SMA
root@zabbix2:#

root@zabbix2:# cat /etc/zabbix/scripts/smartctl-lld.py
#!/usr/bin/env python3

## Installation instructions: https://github.com/nobodysu/zabbix-smartmontools

mode = 'device'   # 'device' or 'serial' as primary identifier in zabbix item's name

ctlPath = r'smartctl'
#ctlPath = r'C:\Program Files\smartmontools\bin\smartctl.exe'   # if smartctl isn't in PATH
#ctlPath = r'/usr/local/sbin/smartctl'

# path to second send script
senderPyPath = r'/etc/zabbix/scripts/smartctl-send.py'              # Linux
#senderPyPath = r'C:\zabbix-agent\scripts\smartctl-send.py'         # Win
#senderPyPath = r'/usr/local/etc/zabbix/scripts/smartctl-send.py'   # BSD

# path to zabbix agent configuration file
agentConf = r'/etc/zabbix/zabbix_agentd.conf'                       # Linux
#agentConf = r'C:\zabbix_agentd.conf'                               # Win
#agentConf = r'/usr/local/etc/zabbix24/zabbix_agentd.conf'          # BSD

senderPath = r'zabbix_sender'                                       # Linux, BSD
#senderPath = r'C:\zabbix-agent\bin\win32\zabbix_sender.exe'        # Win

timeout = '360'   # how long the script must wait between LLD and sending, increase if data received late (does not affect windows)

# manually provide disk list or RAID configuration if needed
#diskListManual = []
diskListManual = [ '/dev/sda' ]
# like this:
#diskListManual = ['/dev/sda -d sat+megaraid,4', '/dev/sda -d sat+megaraid,5']
# more info: https://www.smartmontools.org/wiki/Supported_RAID-Controllers

## End of configuration ##

root@zabbix2:~# dpkg --list | grep zabbix
ii  zabbix-agent                  1:3.4.1-1+stretch              amd64        Zabbix network monitoring solution - agent
ii  zabbix-frontend-php           1:3.4.1-1+stretch              all          Zabbix network monitoring solution - PHP front-end
ii  zabbix-get                    1:3.4.1-1+stretch              amd64        Zabbix network monitoring solution - get
ii  zabbix-release                3.4-1+stretch                  all          Zabbix official repository configuration
ii  zabbix-server-pgsql           1:3.4.1-1+stretch              amd64        Zabbix network monitoring solution - server (PostgreSQL)

root@zabbix2:~# grep Timeout < /etc/zabbix/zabbix_agentd.conf
### Option: Timeout
#       Spend no more than Timeout seconds on processing
# Timeout=3
Timeout=30

dizzy2 commented 7 years ago

@nobodysu - well, updating diskListManual in the smartctl-lld.py script, as You suggested, didn't help. Please let me know if You need any more information. Thanks

nobody43 commented 7 years ago

Was able to narrow it down. The problem is in Zabbix agent 3.4.x; all other versions work fine. I think it's a bug: 3.4 is brand new, after all. I suggest using the agent from your distro's repo instead of the one from the Zabbix repo. 2.4, 3.0 and 3.2 are tested and working. Will look closer until the bug is confirmed.

UPD: timeout value in smartctl-lld.py must be lower than discovery rule interval
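The constraint is easy to check by hand, but a small sketch makes the units explicit. It assumes the `timeout = '360'` value from the script posted earlier in the thread (a string of seconds) and compares it with the 6-hour default interval and the 1-minute interval used for testing; `timeout_fits_interval` is a hypothetical helper name.

```python
def timeout_fits_interval(timeout, interval_seconds):
    """Return True if the script timeout is lower than the discovery interval."""
    return int(timeout) < interval_seconds


timeout = '360'                    # seconds, as a string, like in smartctl-lld.py
default_interval = 6 * 60 * 60     # the 6-hour default mentioned in the thread
testing_interval = 60              # the 1-minute interval used for testing

print(timeout_fits_interval(timeout, default_interval))
print(timeout_fits_interval(timeout, testing_interval))
```

With these numbers the default interval satisfies the rule while the 1-minute testing interval does not, so shortening the interval for a test may also require lowering the timeout.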

nobody43 commented 7 years ago

@Garincho It's possible you have a different problem: your agent on host "Zabbix Server" reports ZBX_NOTSUPPORTED, which means the host agent does not know anything about the item smartctl.discovery[get,{HOST.HOST}]. Is userparameter_smartctl.conf placed in the zabbix_agentd.d folder? Is this folder included in the main zabbix_agentd.conf? Have you restarted the agent?
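The first two questions can be checked with a short script. This is a sketch under assumptions: `find_userparameter` is a hypothetical helper, it only understands `Include=` lines that name a file, a directory, or a glob pattern, and the `UserParameter` line written in the demo is a placeholder, not the content of the real userparameter_smartctl.conf. The demo builds a throwaway config tree so that it runs anywhere; on a real host, pass the path to zabbix_agentd.conf instead.

```python
import glob
import os
import tempfile


def find_userparameter(agent_conf, key):
    """Return paths of included files that define a UserParameter starting with key."""
    with open(agent_conf) as conf:
        includes = [
            line.split("=", 1)[1].strip()
            for line in conf
            if line.strip().startswith("Include=")
        ]
    hits = []
    for target in includes:
        # A bare directory is treated as "every file inside it".
        pattern = os.path.join(target, "*") if os.path.isdir(target) else target
        for path in sorted(glob.glob(pattern)):
            if not os.path.isfile(path):
                continue
            with open(path) as included:
                if any(
                    line.strip().startswith("UserParameter=" + key)
                    for line in included
                ):
                    hits.append(path)
    return hits


# Demo on a temporary tree that mirrors the layout discussed above.
with tempfile.TemporaryDirectory() as root:
    conf_dir = os.path.join(root, "zabbix_agentd.d")
    os.mkdir(conf_dir)
    with open(os.path.join(conf_dir, "userparameter_smartctl.conf"), "w") as f:
        # Placeholder definition; the real file ships with the repository.
        f.write("UserParameter=smartctl.discovery[*],/path/to/smartctl-lld.py $1 $2\n")
    agent_conf = os.path.join(root, "zabbix_agentd.conf")
    with open(agent_conf, "w") as f:
        f.write("Include=" + os.path.join(conf_dir, "*.conf") + "\n")
    found = find_userparameter(agent_conf, "smartctl.discovery")
    print(found)
```

An empty result on a real host would point to a missing file or a missing `Include=` line; a non-empty one leaves the agent restart as the remaining thing to verify.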

dizzy2 commented 7 years ago

@nobodysu I can confirm that after downgrading zabbix-agent to 3.2.7, LLD works fine and my drives are discovered correctly. Zabbix server is still at version 3.4.1, and this combination seems to work fine. So my particular problem is solved, but I will keep an eye on this issue in case You need further information from me (help with tests once the bug is fixed, for example). Thank You very much for Your help. You are doing great work. R.

nobody43 commented 7 years ago

Nailed it! It's my bug actually :). A typo. :( Zabbix 3.4 introduced error code checking (without error reporting), and my error code was in the wrong place.

Just comment it out, and after that you can use agent 3.4:
213: #sys.exit(1)
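The failure mode described here can be reproduced without Zabbix. The sketch below is an illustration under stated assumptions, not the agent's implementation: `run_check` is a hypothetical stand-in for a caller that checks the exit status, and the child program stands in for an LLD script that prints valid JSON and then exits with a given status.

```python
import subprocess
import sys

# Child program standing in for an LLD script: it prints JSON, then exits
# with the status passed as its first argument.
CHILD = (
    "import sys\n"
    "print('{\"data\": []}')\n"
    "sys.exit(int(sys.argv[1]))\n"
)


def run_check(exit_status):
    """Run the child and report its output the way a strict caller would.

    This mimics the behavior described in the thread (a non-zero exit status
    turns otherwise valid output into an unsupported result); it is not the
    agent's own code.
    """
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, str(exit_status)],
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    output = proc.stdout.strip()
    if proc.returncode != 0:
        return "ZBX_NOTSUPPORTED: " + output
    return output


ok = run_check(0)
bad = run_check(1)
print(ok)
print(bad)
```

The second result has the same shape as the zabbix_get output posted earlier in the thread: the JSON is present, but it arrives behind a ZBX_NOTSUPPORTED prefix because the script did not exit with status 0.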

dizzy2 commented 7 years ago

Hello, sorry for replying so late. Unfortunately there is something wrong with the fix. I made a simple try: upgraded the Zabbix agent to 3.4, downloaded the fixed version of the LLD script, and again no data is coming in. Is there anything I'm missing?

nobody43 commented 7 years ago

Are you completely sure? Right now I'm using server 2.4 and 3.4.1 with agent 3.4.0 and 3.4.1 on a bare-bones install from the current repo, and everything is working fine on Win 7 and Debian 8: LLD and item data. Try removing all old scripts, then download the new ones. Perform 'Unlink and clear' on your host. Re-add the template to the host. Recheck the logs.

Anyone, please respond about your situation with 3.4!

dizzy2 commented 7 years ago

I have carefully reviewed my configuration one more time, and the problem really was on my side (wrong permissions). I am really sorry for my previous post :-( After I fixed the permissions, Your new script started sending data again, and everything seems to be working with Zabbix 3.4 too. Closing the bug as fixed.

Thank You for Your help.

nobody43 commented 7 years ago

Great! It's working then.

Garincho commented 7 years ago

Thank You. It's working.