Closed ficofer closed 6 years ago
/usr/local/bin/haproxy_discovery.sh /var/run/haproxy/info.sock FRONTEND
{
"data":[
{
"{#FRONTEND_NAME}":"ft_http_web"},
{
"{#FRONTEND_NAME}":"ft_https_web"},
{
"{#FRONTEND_NAME}":"stats"}]}
From what I can see in the logs it should be showing info....
28175:20180319:124446.628 In zbx_popen() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app12 slim'
28177:20180319:124446.628 __zbx_zbx_setproctitle() title:'listener #3 [processing request]'
28177:20180319:124446.628 Requested [haproxy.stats[/var/run/haproxy/info.sock,bk_https_web,HTTP_id-prod-app02,rate_max]]
28177:20180319:124446.628 In zbx_popen() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app02 rate_max'
28177:20180319:124446.628 End of zbx_popen():7
10654:20180319:124446.628 zbx_popen(): executing script
28175:20180319:124446.628 End of zbx_popen():7
10655:20180319:124446.629 zbx_popen(): executing script
28176:20180319:124446.634 __zbx_zbx_setproctitle() title:'listener #2 [processing request]'
28176:20180319:124446.634 Requested [haproxy.stats[/var/run/haproxy/info.sock,bk_https_web,HTTP_id-prod-app06,qcur]]
28176:20180319:124446.634 In zbx_popen() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app06 qcur'
28176:20180319:124446.635 End of zbx_popen():7
10662:20180319:124446.635 zbx_popen(): executing script
28177:20180319:124446.647 In zbx_waitpid()
28177:20180319:124446.647 zbx_waitpid() exited, status:0
28177:20180319:124446.647 End of zbx_waitpid():10654
28177:20180319:124446.647 EXECUTE_STR() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app02 rate_max' len:1 cmd_result:'1'
28177:20180319:124446.647 Sending back [1]
28177:20180319:124446.648 __zbx_zbx_setproctitle() title:'listener #3 [waiting for connection]'
28175:20180319:124446.648 In zbx_waitpid()
28175:20180319:124446.648 zbx_waitpid() exited, status:0
28175:20180319:124446.648 End of zbx_waitpid():10655
28175:20180319:124446.648 EXECUTE_STR() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app12 slim' len:1 cmd_result:'0'
28175:20180319:124446.648 Sending back [0]
28175:20180319:124446.648 __zbx_zbx_setproctitle() title:'listener #1 [waiting for connection]'
28176:20180319:124446.655 In zbx_waitpid()
28176:20180319:124446.655 zbx_waitpid() exited, status:0
28176:20180319:124446.655 End of zbx_waitpid():10662
28176:20180319:124446.655 EXECUTE_STR() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app06 qcur' len:1 cmd_result:'0'
28176:20180319:124446.655 Sending back [0]
28176:20180319:124446.655 __zbx_zbx_setproctitle() title:'listener #2 [waiting for connection]'
28178:20180319:124446.813 In send_buffer() host:'127.0.0.1' port:10051 entries:0/100
28178:20180319:124446.814 End of send_buffer():SUCCEED
28178:20180319:124446.814 __zbx_zbx_setproctitle() title:'active checks #1 [idle 1 sec]'
28174:20180319:124446.826 __zbx_zbx_setproctitle() title:'collector [processing data]'
28174:20180319:124446.826 In update_cpustats()
28174:20180319:124446.826 End of update_cpustats()
28174:20180319:124446.826 __zbx_zbx_setproctitle() title:'collector [idle 1 sec]'
28177:20180319:124447.650 __zbx_zbx_setproctitle() title:'listener #3 [processing request]'
28177:20180319:124447.650 Requested [haproxy.stats[/var/run/haproxy/info.sock,bk_https_web,HTTP_id-prod-app01,smax]]
28177:20180319:124447.650 In zbx_popen() command:'/usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app01 smax'
28175:20180319:124447.650 __zbx_zbx_setproctitle() title:'listener #1 [processing request]'
28175:20180319:124447.650 Requested [net.if.discovery]
28177:20180319:124447.650 End of zbx_popen():7
Some of the stats seem to be retrieving info, but they just show 0... I don't think this is affected by or related to the HAProxy version I'm running, is it?
I have also ruled out a version issue; further investigation shows a problem with this:
# check if requested stat is supported
if [ -z "${_STAT}" ]
then
echo "ERROR: $stat is unsupported"
exit 127
fi
Looks like all, or at least 99%, of my stats are reported as not supported.
# /usr/local/bin/haproxy_stats.sh prod-app10 smax
ERROR: is unsupported
# /usr/local/bin/haproxy_stats.sh prod-app10 status
ERROR: is unsupported
# /usr/local/bin/haproxy_stats.sh prod-app10 downtime
ERROR: is unsupported
#
@ficofer, try checking your cache file /var/tmp/haproxy_stats.cache.
Does it appear to have values?
Also, when running the script manually, consider https://github.com/anapsix/zabbix-haproxy/blob/master/userparameter_haproxy.conf#L87-L89
/usr/local/bin/haproxy_stats.sh ft_http_web prod-app10 smax
@anapsix Thanks for replying.
It does appear to show values.
Look below:
# tail -f /var/tmp/haproxy_stats.cache
bk_https_web,HTTP_id-prod-app07,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,0,0,3296,0,,1,4,7,,0,,2,0,,0,L4OK,,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer4 check passed,,2,3,4,,,,,,tcp,,,,,,,,
bk_https_web,HTTP_id-prod-app08,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,0,0,3296,0,,1,4,8,,0,,2,0,,0,L4OK,,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer4 check passed,,2,3,4,,,,,,tcp,,,,,,,,
bk_https_web,HTTP_id-prod-app09,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,0,0,3296,0,,1,4,9,,0,,2,0,,0,L4OK,,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer4 check passed,,2,3,4,,,,,,tcp,,,,,,,,
bk_https_web,HTTP_id-prod-app10,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,0,0,3296,0,,1,4,10,,0,,2,0,,0,L4OK,,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer4 check passed,,2,3,4,,,,,,tcp,,,,,,,,
bk_https_web,HTTP_id-prod-app11,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,0,0,3296,0,,1,4,11,,0,,2,0,,0,L4OK,,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer4 check passed,,2,3,4,,,,,,tcp,,,,,,,,
bk_https_web,HTTP_id-prod-app12,0,0,0,0,,0,0,0,,0,,0,0,0,0,UP,1,1,0,0,0,3296,0,,1,4,12,,0,,2,0,,0,L4OK,,0,,,,,,,,,,,0,0,,,,,-1,,,0,0,0,0,,,,Layer4 check passed,,2,3,4,,,,,,tcp,,,,,,,,
bk_https_web,BACKEND,0,0,0,1,9500,5,6617,64494,0,0,,0,0,0,0,UP,12,12,0,,0,3296,0,,1,4,0,,4,,1,0,,2,,,,,,,,,,,,,,0,0,0,0,0,0,1381,,,1,1,0,166,,,,,,,,,,,,,,tcp,,,,,,,,
stats,FRONTEND,,,0,2,2000,6,12642,851647,0,0,3,,,,,OPEN,,,,,,,,,1,5,0,,,,0,0,0,1,,,,0,28,0,4,0,0,,0,4,32,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,http,,0,1,6,28,0,0,
stats,BACKEND,0,0,0,0,200,0,12642,851647,0,0,,0,0,0,0,UP,0,0,0,,0,3296,,,1,5,0,,0,,1,0,,0,,,,0,0,0,0,0,0,,,,0,0,0,0,0,0,0,1357,,,0,0,0,223,,,,,,,,,,,,,,http,,,,,,,,
But there are a lot of 0s... maybe that's the reason for NO DATA... I am using HA-Proxy version 1.8.4-1deb90d 2018/02/08
# /usr/local/bin/haproxy_stats.sh /var/run/haproxy/info.sock bk_https_web HTTP_id-prod-app10 smax
DEBUG: SOCAT_BIN => /bin/socat
DEBUG: NC_BIN => /bin/nc
DEBUG: FLOCK_BIN => /bin/flock
DEBUG: FLOCK_WAIT => 15 seconds
DEBUG: CACHE_FILEPATH =>
DEBUG: CACHE_EXPIRATION => minutes
DEBUG: HAPROXY_SOCKET => /var/run/haproxy/info.sock
DEBUG: pxname => bk_https_web
DEBUG: svname => HTTP_id-prod-app10
DEBUG: stat => smax
DEBUG: _STAT => 6:smax:0
DEBUG: _INDEX => 6
DEBUG: _DEFAULT => 0
DEBUG: using default get() method
DEBUG: stat file found, results are at most 5 minutes stale..
0
Well, that takes care of your ERROR: is unsupported
As for having most values at "0", it's coming from HAProxy.
Don't forget that data is cached, based on CACHE_EXPIRATION set in the script.
You should be seeing data at 0, if that's what it is.
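For reference, the debug output above shows `_INDEX => 6` for `smax`, which matches the sixth comma-separated column of the cached lines. Below is a minimal sketch of that kind of lookup, assuming the cache file holds raw `show stat` CSV rows and that an empty column falls back to a default (as `_DEFAULT` seems to suggest). The function name `get_stat_field` is hypothetical and is not taken from haproxy_stats.sh.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of haproxy_stats.sh): print one column of the
# row matching a proxy name and server name in HAProxy stats CSV.
# Usage: get_stat_field CSV_FILE PXNAME SVNAME INDEX [DEFAULT]
get_stat_field() {
  local file=$1 pxname=$2 svname=$3 index=$4 default=${5:-0}
  awk -F, -v px="$pxname" -v sv="$svname" -v idx="$index" -v def="$default" '
    $1 == px && $2 == sv {
      # Assumption: an empty column is reported as the default value.
      print ($idx == "" ? def : $idx)
      found = 1
      exit
    }
    # Signal "no such proxy/server row" through the exit status.
    END { if (!found) exit 1 }
  ' "$file"
}
```

With the cache lines posted earlier, `get_stat_field /var/tmp/haproxy_stats.cache bk_https_web BACKEND 6` would print 1, while the same call for `HTTP_id-prod-app10` would print 0, so the zeros really are what HAProxy reported for those servers.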
However, when you are running the script manually, chances are, you are running it as root user. So file permissions will be messed up. You need to change cache files permissions to allow whatever user zabbix agent is running as to read and write to them. Or just delete the cache files and they will be recreated when zabbix agent runs the script.
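One way to verify this is to test the cache file as the agent's user rather than as root. Here is a minimal sketch, assuming the agent runs as a user named `zabbix` and that `sudo` is available. The helper name `check_cache_access` is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report whether the current user can read and write
# a cache file. A missing file counts as fine when its directory is writable,
# since the script is expected to recreate it.
check_cache_access() {
  local file=$1
  if [ ! -e "$file" ]; then
    if [ -w "$(dirname "$file")" ]; then
      echo "missing, can be created"
      return 0
    fi
    echo "missing, directory not writable"
    return 1
  fi
  if [ -r "$file" ] && [ -w "$file" ]; then
    echo "ok"
    return 0
  fi
  echo "not readable/writable by $(id -un)"
  return 1
}
```

In bash, the function can be run under the agent's account with something like `sudo -u zabbix bash -c "$(declare -f check_cache_access); check_cache_access /var/tmp/haproxy_stats.cache"`. Running it as root proves little, because root passes the read and write tests regardless of the file mode.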
I read that it could be the cause, but I checked it:
-rw-rw-r--. 1 zabbix zabbix 3690 Mar 19 15:38 /var/tmp/haproxy_stats.cache
and it looks like they are correctly owned.
The thing is, shouldn't Zabbix be graphing the history even though the values may now be 0?
I'd expect that as well. Check Latest data, as it's easier to see what Zabbix received from the script.
I see this:
Any idea why there are so many 0s?
It should not be like that, right?
Also... the output of zabbix_agentd -p did not change, though :-1:
haproxy.stats [t|ERROR: is unsupported]
haproxy.stat.qcur [t|ERROR: is unsupported]
haproxy.stat.qmax [t|ERROR: is unsupported]
haproxy.stat.scur [t|ERROR: is unsupported]
haproxy.stat.smax [t|ERROR: is unsupported]
haproxy.stat.slim [t|ERROR: is unsupported]
haproxy.stat.bin [t|ERROR: is unsupported]
haproxy.stat.bout [t|ERROR: is unsupported]
haproxy.stat.dreq [t|ERROR: is unsupported]
haproxy.stat.dresp [t|ERROR: is unsupported]
haproxy.stat.ereq [t|ERROR: is unsupported]
haproxy.stat.econ [t|ERROR: is unsupported]
haproxy.stat.eresp [t|ERROR: is unsupported]
haproxy.stat.wretr [t|ERROR: is unsupported]
haproxy.stat.wredis [t|ERROR: is unsupported]
haproxy.stat.weight [t|ERROR: is unsupported]
haproxy.stat.act [t|ERROR: is unsupported]
haproxy.stat.bck [t|ERROR: is unsupported]
haproxy.stat.chkfail [t|ERROR: is unsupported]
haproxy.stat.chkdown [t|ERROR: is unsupported]
haproxy.stat.lastchg [t|ERROR: is unsupported]
haproxy.stat.downtime [t|ERROR: is unsupported]
haproxy.stat.qlimit [t|ERROR: is unsupported]
haproxy.stat.throttle [t|ERROR: is unsupported]
haproxy.stat.lbtot [t|ERROR: is unsupported]
haproxy.stat.tracked [t|ERROR: is unsupported]
haproxy.stat.type [t|ERROR: is unsupported]
haproxy.stat.rate [t|ERROR: is unsupported]
haproxy.stat.rate_lim [t|ERROR: is unsupported]
haproxy.stat.rate_max [t|ERROR: is unsupported]
haproxy.stat.check_status [t|ERROR: is unsupported]
haproxy.stat.check_code [t|ERROR: is unsupported]
haproxy.stat.check_duration [t|ERROR: is unsupported]
haproxy.stat.req_rate [t|ERROR: is unsupported]
haproxy.stat.req_rate_max [t|ERROR: is unsupported]
haproxy.stat.req_tot [t|ERROR: is unsupported]
haproxy.stat.cli_abrt [t|ERROR: is unsupported]
haproxy.stat.srv_abrt [t|ERROR: is unsupported]
haproxy.stat.comp_in [t|ERROR: is unsupported]
haproxy.stat.comp_out [t|ERROR: is unsupported]
haproxy.stat.comp_byp [t|ERROR: is unsupported]
haproxy.stat.comp_rsp [t|ERROR: is unsupported]
haproxy.stat.lastsess [t|ERROR: is unsupported]
haproxy.stat.qtime [t|ERROR: is unsupported]
haproxy.stat.ctime [t|ERROR: is unsupported]
haproxy.stat.rtime [t|ERROR: is unsupported]
haproxy.stat.status [t|ERROR: is unsupported]
haproxy.stat.pid [t|ERROR: is unsupported]
haproxy.stat.iid [t|ERROR: is unsupported]
haproxy.stat.sid [t|ERROR: is unsupported]
haproxy.stat.hrsp_1xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_2xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_3xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_4xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_5xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_other [t|ERROR: is unsupported]
haproxy.stat.hanafail [t|ERROR: is unsupported]
haproxy.stat.last_chk [t|ERROR: is unsupported]
haproxy.stat.last_agt [t|ERROR: is unsupported]
You must be missing a macro var, or have it set to an empty string. It's rather difficult for me to guess.
Also, you seem to be running with debug constantly enabled, which is messing with the output.
@anapsix I have disabled debug. What is weird to me is that so many stats are not supported...
What would you advise me to check to figure out whether it is the macro var?
Thanks for your guidance!
Only haproxy.list.discovery seems to be supported in the output of zabbix_agentd
Do you understand why items were not supported while you had DEBUG enabled in the script?
@anapsix
Not really, because even with DEBUG on, the script exits with "not supported" before doing anything else.
It exits in this if:
# check if requested stat is supported
if [ -z "${_STAT}" ]
then
echo "ERROR: $stat is unsupported"
exit 127
fi
Looks like all, or at least 99%, of my stats are reported as not supported.
# /usr/local/bin/haproxy_stats.sh prod-app10 smax
ERROR: is unsupported
# /usr/local/bin/haproxy_stats.sh prod-app10 status
ERROR: is unsupported
# /usr/local/bin/haproxy_stats.sh prod-app10 downtime
ERROR: is unsupported
#
Is there any other way to debug that?
It's not like I get output like this:
DEBUG: SOCAT_BIN => /bin/socat
DEBUG: NC_BIN => /bin/nc
DEBUG: FLOCK_BIN => /bin/flock
DEBUG: FLOCK_WAIT => 15 seconds
DEBUG: CACHE_FILEPATH =>
DEBUG: CACHE_EXPIRATION => minutes
DEBUG: HAPROXY_SOCKET => /var/run/haproxy/info.sock
DEBUG: pxname => bk_https_web
DEBUG: svname => HTTP_id-prod-app10
DEBUG: stat => smax
DEBUG: _STAT => 6:smax:0
DEBUG: _INDEX => 6
DEBUG: _DEFAULT => 0
DEBUG: using default get() method
DEBUG: stat file found, results are at most 5 minutes stale..
0
As I've mentioned, /usr/local/bin/haproxy_stats.sh prod-app10 smax
is incorrect syntax; you are missing required arguments. See example here
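The empty name in "ERROR: is unsupported" is consistent with the stat argument never being set: with only two arguments, nothing lands in the position the stat name is read from. Below is a minimal sketch of a pre-flight check, assuming the two call shapes seen in this thread (three arguments: pxname, svname, stat; or four with the socket path first). `validate_stat_args` is a hypothetical name, not something the script provides.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check based on the invocations shown in this thread:
#   haproxy_stats.sh pxname svname stat
#   haproxy_stats.sh socket pxname svname stat
validate_stat_args() {
  if [ "$#" -lt 3 ] || [ "$#" -gt 4 ]; then
    echo "usage: [socket] pxname svname stat" >&2
    return 2
  fi
  local arg
  for arg in "$@"; do
    # An empty argument usually means a Zabbix macro expanded to nothing.
    if [ -z "$arg" ]; then
      echo "empty argument (unset Zabbix macro?)" >&2
      return 2
    fi
  done
  return 0
}
```

Calling `validate_stat_args prod-app10 smax` fails with the usage message, which is easier to interpret than an error that names an empty stat.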
As for the 0, if that's what is coming from HAProxy, that's what it is.
You can check stats directly from HAProxy
echo "show stat" | nc -U /var/run/haproxy/info.sock | cut -d "," -f 1,2,5-11,18,24,27,30,36,50,37,56,57,62 | column -s, -t
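The field numbers passed to `cut` are positions in the CSV header that `show stat` prints as its first line (it begins with `# pxname,svname,`). Here is a minimal sketch for translating a stat name into its column number, assuming that header format. `stat_index` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the 1-based column number of a stat name in a
# "show stat" header line such as "# pxname,svname,qcur,qmax,scur,smax,...".
stat_index() {
  local header="${1#'# '}" name=$2
  printf '%s\n' "$header" | awk -F, -v name="$name" '
    {
      for (i = 1; i <= NF; i++)
        if ($i == name) { print i; found = 1; exit }
    }
    # A non-zero exit status means the name is not in the header.
    END { if (!found) exit 1 }
  '
}
```

For example, `stat_index "$(echo "show stat" | nc -U /var/run/haproxy/info.sock | head -n 1)" smax` should print 6 on a header that starts `# pxname,svname,qcur,qmax,scur,smax`, matching the `_INDEX => 6` in the debug output.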
I followed the README and everything seems to be working fine. Scripts run from the haproxy node return info that makes me think the socket is being opened and read....
I have installed socat and nc, because that was my first blocker... but now in the Zabbix Web UI I see NO DATA, and on the server I see this:
haproxy:x:188:188:haproxy:/var/lib/haproxy:/sbin/nologin
haproxy.list.discovery [t|{
haproxy.stats [t|ERROR: is unsupported]
haproxy.stat.qcur [t|ERROR: is unsupported]
haproxy.stat.qmax [t|ERROR: is unsupported]
haproxy.stat.scur [t|ERROR: is unsupported]
haproxy.stat.smax [t|ERROR: is unsupported]
haproxy.stat.slim [t|ERROR: is unsupported]
haproxy.stat.bin [t|ERROR: is unsupported]
haproxy.stat.bout [t|ERROR: is unsupported]
haproxy.stat.dreq [t|ERROR: is unsupported]
haproxy.stat.dresp [t|ERROR: is unsupported]
haproxy.stat.ereq [t|ERROR: is unsupported]
haproxy.stat.econ [t|ERROR: is unsupported]
haproxy.stat.eresp [t|ERROR: is unsupported]
haproxy.stat.wretr [t|ERROR: is unsupported]
haproxy.stat.wredis [t|ERROR: is unsupported]
haproxy.stat.weight [t|ERROR: is unsupported]
haproxy.stat.act [t|ERROR: is unsupported]
haproxy.stat.bck [t|ERROR: is unsupported]
haproxy.stat.chkfail [t|ERROR: is unsupported]
haproxy.stat.chkdown [t|ERROR: is unsupported]
haproxy.stat.lastchg [t|ERROR: is unsupported]
haproxy.stat.downtime [t|ERROR: is unsupported]
haproxy.stat.qlimit [t|ERROR: is unsupported]
haproxy.stat.throttle [t|ERROR: is unsupported]
haproxy.stat.lbtot [t|ERROR: is unsupported]
haproxy.stat.tracked [t|ERROR: is unsupported]
haproxy.stat.type [t|ERROR: is unsupported]
haproxy.stat.rate [t|ERROR: is unsupported]
haproxy.stat.rate_lim [t|ERROR: is unsupported]
haproxy.stat.rate_max [t|ERROR: is unsupported]
haproxy.stat.check_status [t|ERROR: is unsupported]
haproxy.stat.check_code [t|ERROR: is unsupported]
haproxy.stat.check_duration [t|ERROR: is unsupported]
haproxy.stat.req_rate [t|ERROR: is unsupported]
haproxy.stat.req_rate_max [t|ERROR: is unsupported]
haproxy.stat.req_tot [t|ERROR: is unsupported]
haproxy.stat.cli_abrt [t|ERROR: is unsupported]
haproxy.stat.srv_abrt [t|ERROR: is unsupported]
haproxy.stat.comp_in [t|ERROR: is unsupported]
haproxy.stat.comp_out [t|ERROR: is unsupported]
haproxy.stat.comp_byp [t|ERROR: is unsupported]
haproxy.stat.comp_rsp [t|ERROR: is unsupported]
haproxy.stat.lastsess [t|ERROR: is unsupported]
haproxy.stat.qtime [t|ERROR: is unsupported]
haproxy.stat.ctime [t|ERROR: is unsupported]
haproxy.stat.rtime [t|ERROR: is unsupported]
haproxy.stat.status [t|ERROR: is unsupported]
haproxy.stat.pid [t|ERROR: is unsupported]
haproxy.stat.iid [t|ERROR: is unsupported]
haproxy.stat.sid [t|ERROR: is unsupported]
haproxy.stat.hrsp_1xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_2xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_3xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_4xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_5xx [t|ERROR: is unsupported]
haproxy.stat.hrsp_other [t|ERROR: is unsupported]
haproxy.stat.hanafail [t|ERROR: is unsupported]
haproxy.stat.last_chk [t|ERROR: is unsupported]
haproxy.stat.last_agt [t|ERROR: is unsupported]
Not much info to debug with; any hints? I have ruled out a permission issue, as I double-checked that and the socket is opened with mode 666. I am running zabbix_agentd (daemon) (Zabbix) 3.0.15.
@anapsix really cool things here thanks for your support!