severalnines / s9s-admin

s9s-admin tools, like s9s_backup, s9s_haproxy, and s9s_change_passwd, have a home here.

ClusterControl Template for Zabbix 2.0 (Zabbix Version 4.4.5) #8

Open Rick-McClatchie opened 4 years ago

Rick-McClatchie commented 4 years ago

Hello everyone,

I was wondering if there is any way to get the ClusterControl template to work with current Zabbix versions. We followed all of the steps described here: https://github.com/severalnines/s9s-admin/tree/master/plugins/zabbix

However, these are the "Problems" I get in Zabbix:

Screenshot 2020-01-30 at 15 50 44

When I take a look at the ClusterControl dashboard, there are no problems:

Screenshot 2020-01-30 at 15 51 56

When I turn on logging for Zabbix I get the following messages in /var/log/zabbix-agent/zabbix_agentd.log:

 13762:20200130:155353.893 EXECUTE_STR() command:'/var/lib/zabbix/clustercontrol/scripts/clustercontrol_stats.sh 5 alarms-critical' len:1 cmd_result:'1'
 13762:20200130:155353.893 Sending back [1]
 13762:20200130:155353.893 __zbx_zbx_setproctitle() title:'listener #3 [waiting for connection]'
 13758:20200130:155353.924 __zbx_zbx_setproctitle() title:'collector [processing data]'
 13758:20200130:155353.924 In update_cpustats()
 13758:20200130:155353.924 End of update_cpustats()
 13758:20200130:155353.924 __zbx_zbx_setproctitle() title:'collector [idle 1 sec]'
 13759:20200130:155354.398 __zbx_zbx_setproctitle() title:'listener #1 [processing request]'
 13759:20200130:155354.398 Requested [clustercontrol.db.alarms-warning]
 13759:20200130:155354.398 In zbx_popen() command:'/var/lib/zabbix/clustercontrol/scripts/clustercontrol_stats.sh 5 alarms-warning'
 13759:20200130:155354.399 End of zbx_popen():7
  4777:20200130:155354.399 zbx_popen(): executing script
 13759:20200130:155354.454 In zbx_waitpid()
 13759:20200130:155354.454 zbx_waitpid() exited, status:0
 13759:20200130:155354.454 End of zbx_waitpid():4777
 13759:20200130:155354.454 EXECUTE_STR() command:'/var/lib/zabbix/clustercontrol/scripts/clustercontrol_stats.sh 5 alarms-warning' len:1 cmd_result:'1'
 13759:20200130:155354.454 Sending back [1]
 13759:20200130:155354.454 __zbx_zbx_setproctitle() title:'listener #1 [waiting for connection]'
 13763:20200130:155354.475 In send_buffer() host:'127.0.0.1' port:10051 entries:0/100
 13763:20200130:155354.475 End of send_buffer():SUCCEED
 13763:20200130:155354.475 __zbx_zbx_setproctitle() title:'active checks #1 [idle 1 sec]'
 13758:20200130:155354.925 __zbx_zbx_setproctitle() title:'collector [processing data]'
 13758:20200130:155354.925 In update_cpustats()
 13758:20200130:155354.925 End of update_cpustats()
 13758:20200130:155354.925 __zbx_zbx_setproctitle() title:'collector [idle 1 sec]'
 13763:20200130:155355.476 In send_buffer() host:'127.0.0.1' port:10051 entries:0/100
 13763:20200130:155355.476 End of send_buffer():SUCCEED
 13763:20200130:155355.476 __zbx_zbx_setproctitle() title:'active checks #1 [idle 1 sec]'
 13760:20200130:155355.492 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
 13760:20200130:155355.492 Requested [clustercontrol.db.status]
 13760:20200130:155355.492 In zbx_popen() command:'/var/lib/zabbix/clustercontrol/scripts/clustercontrol_stats.sh 5 cluster'
 13760:20200130:155355.493 End of zbx_popen():7
  4781:20200130:155355.493 zbx_popen(): executing script
 13760:20200130:155355.561 In zbx_waitpid()
 13760:20200130:155355.561 zbx_waitpid() exited, status:0
 13760:20200130:155355.561 End of zbx_waitpid():4781
 13760:20200130:155355.561 EXECUTE_STR() command:'/var/lib/zabbix/clustercontrol/scripts/clustercontrol_stats.sh 5 cluster' len:1 cmd_result:'1'

Thanks in advance everybody.

Best Rick

uksza commented 4 years ago

That's because the PHP script uses these return codes:
// return 1 if ok, return 0 if critical, return 2 if degraded, return 3 if unknown/problem

but the Zabbix trigger is defined as <expression>{ClusterControl Template:clustercontrol.db.alarms-critical.last(0)}&gt;0</expression>

So when the script returns 1 it means GOOD, but the trigger treats 1 (anything greater than 0) as NOT GOOD and fires.

The Zabbix triggers should be changed to fire on anything other than 1, like this: <expression>{ClusterControl Template:clustercontrol.db.alarms-critical.last(0)}&lt;&gt;1</expression>
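
To make the mismatch concrete, here is a minimal Python sketch that evaluates both trigger conditions against the return codes listed in the PHP comment. It is only an illustration: the function names are made up for this example, and it models the comparison operators from the two expressions rather than anything Zabbix itself runs.

```python
# Return codes as listed in the PHP script comment quoted above.
STATUS = {1: "ok", 0: "critical", 2: "degraded", 3: "unknown/problem"}


def original_trigger(value: int) -> bool:
    """Hypothetical stand-in for the shipped expression: last(0) > 0."""
    return value > 0


def proposed_trigger(value: int) -> bool:
    """Hypothetical stand-in for the suggested expression: last(0) <> 1."""
    return value != 1


def compare() -> dict:
    """Map each return code to (original fires?, proposed fires?)."""
    return {
        code: (original_trigger(code), proposed_trigger(code))
        for code in STATUS
    }


if __name__ == "__main__":
    for code, (old, new) in compare().items():
        print(f"{code} ({STATUS[code]}): original fires={old}, proposed fires={new}")
```

Under these assumptions, the original condition raises a problem for the healthy value 1 and stays silent for the critical value 0, while the proposed condition fires for every code except 1. That matches the agent log above, where the script returns 1 and Zabbix still reports problems.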