munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org

[BUG?] Could not graph "haproxy_be_sessions" Munin v. 2.0.33-1 #1243

Open lord-syrius opened 5 years ago

lord-syrius commented 5 years ago

Hello there,

I have the following issue when drawing graphs with the "haproxy" plugin. It occurs ONLY with the graph "haproxy_be_sessions"; the other graphs work very well. My logs are below:

2019/10/25 10:57:28 [WARNING] Could not draw graph "/var/cache/munin/www/haproxyng_be_sessions/bk_apache/web_no_cache-month.png":/var/cache/munin/www//haproxyng_be_sessions/bk_apache/web_no_cache-month.png
2019/10/25 10:57:28 [RRD ERROR] Unable to graph /var/cache/munin/www//haproxyng_be_sessions/bk_apache/web_no_cache-year.png : undefined vname cbk_apache_queued
2019/10/25 10:57:28 [RRD ERROR] rrdtool 'graph' '/var/cache/munin/www/PRD/butprdlamp1/haproxyng_be_sessions/bk_apache/web_no_cache-year.png' \

Is that a bug? Do you have a workaround for that?

Thank you!

Best regards,

Flavio

sumpfralle commented 5 years ago

Could you please share the output of munin-run haproxy_ng?

lord-syrius commented 5 years ago

Hello sumpfralle,

Yes, sure! The haproxyng_be_sessions sections below belong to the graphs currently affected by this issue. According to this output the data arrives correctly, but the *.png files are not generated in the respective folders under /var/cache/munin/www/haproxyng_be_sessions/ as usual, which results in a 404 Not Found error.

Please note that this affects all our servers currently in production and on several projects using this Munin version (and with different HAproxy versions), that's why I opened this issue.

munin-run haproxyng

multigraph haproxyng_fe_bandwidth
but_front_in.value 31921784263
but_front_out.value 553192453404

multigraph haproxyng_be_bandwidth
bk_apache_in.value 0
bk_apache_out.value 0
bk_varnish_in.value 31799673503
bk_varnish_out.value 552527502955

multigraph haproxyng_be_bandwidth.bk_apache.web_no_cache
total_in.value 0
total_out.value 0

multigraph haproxyng_be_bandwidth.bk_apache
total_in.value 0
total_out.value 0
web_no_cache_in.value 0
web_no_cache_out.value 0

multigraph haproxyng_be_bandwidth.bk_varnish.web_cache
total_in.value 31799554971
total_out.value 552527502955

multigraph haproxyng_be_bandwidth.bk_varnish
total_in.value 31799673503
total_out.value 552527502955
web_cache_in.value 31799554971
web_cache_out.value 552527502955

multigraph haproxyng_fe_bandwidth.but_front
total_in.value 31921784263
total_out.value 553192453404

multigraph haproxyng_be_timing
bk_apache.value 0
bk_varnish.value 1498

multigraph haproxyng_be_timing.bk_apache.web_no_cache
connect.value 0
queue.value 0
response.value 0
total.value 0

multigraph haproxyng_be_timing.bk_apache
web_no_cache.value 0

multigraph haproxyng_be_timing.bk_varnish.web_cache
connect.value 0
queue.value 0
response.value 2
total.value 1498

multigraph haproxyng_be_timing.bk_varnish
web_cache.value 1498

multigraph haproxyng_fe_sessions.but_front
sessions.value 1804

multigraph haproxyng_fe_sessions
but_front_sessions.value 1804

multigraph haproxyng_be_sessions
bk_apache_queued.value 0
bk_apache_sessions.value 0
bk_varnish_queued.value 0
bk_varnish_sessions.value 2

multigraph haproxyng_be_sessions.bk_apache.web_no_cache
queued.value 0
sessions.value 0

multigraph haproxyng_be_sessions.bk_apache
total_queued.value 0
total_sessions.value 0
web_no_cache_queued.value 0
web_no_cache_sessions.value 0

multigraph haproxyng_be_sessions.bk_varnish.web_cache
queued.value 0
sessions.value 1

multigraph haproxyng_be_sessions.bk_varnish
total_queued.value 0
total_sessions.value 2
web_cache_queued.value 0
web_cache_sessions.value 1

multigraph haproxyng_be_count.bk_apache
backup.value 0
disabled.value 0
down.value 0
up.value 0

multigraph haproxyng_be_count.bk_varnish
backup.value 0
disabled.value 0
down.value 0
up.value 0

multigraph haproxyng_be_count
bk_apache.value 0
bk_varnish.value 0

multigraph haproxyng_fe_responses.but_front
http1xx.value 0
http2xx.value 24648507
http3xx.value 5396818
http4xx.value 4297280
http5xx.value 467
httpxxx.value 399
total.value 34343473

multigraph haproxyng_fe_responses.but_front.http1xx
responses.value 0

multigraph haproxyng_fe_responses.but_front.http2xx
responses.value 24648507

multigraph haproxyng_fe_responses.but_front.http3xx
responses.value 5396818

multigraph haproxyng_fe_responses.but_front.http4xx
responses.value 4297280

multigraph haproxyng_fe_responses.but_front.http5xx
responses.value 467

multigraph haproxyng_fe_responses.but_front.httpxxx
responses.value 399

multigraph haproxyng_fe_responses.but_front.total
responses.value 34343473

multigraph haproxyng_fe_responses
but_front.value 34343473

multigraph haproxyng_be_responses.bk_apache
http1xx.value 0
http2xx.value 0
http3xx.value 0
http4xx.value 0
http5xx.value 0
httpxxx.value 0
total.value 0

multigraph haproxyng_be_responses.bk_apache.http1xx.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.http1xx
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_apache.http2xx.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.http2xx
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_apache.http3xx.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.http3xx
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_apache.http4xx.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.http4xx
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_apache.http5xx.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.http5xx
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_apache.httpxxx.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.httpxxx
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_apache.total.web_no_cache
responses.value 0

multigraph haproxyng_be_responses.bk_apache.total
total.value 0
web_no_cache.value 0

multigraph haproxyng_be_responses.bk_varnish
http1xx.value 0
http2xx.value 24648507
http3xx.value 5109667
http4xx.value 822109
http5xx.value 467
httpxxx.value 399
total.value 30581149

multigraph haproxyng_be_responses.bk_varnish.http1xx.web_cache
responses.value 0

multigraph haproxyng_be_responses.bk_varnish.http1xx
total.value 0
web_cache.value 0

multigraph haproxyng_be_responses.bk_varnish.http2xx.web_cache
responses.value 24648509

multigraph haproxyng_be_responses.bk_varnish.http2xx
total.value 24648507
web_cache.value 24648509

multigraph haproxyng_be_responses.bk_varnish.http3xx.web_cache
responses.value 5109667

multigraph haproxyng_be_responses.bk_varnish.http3xx
total.value 5109667
web_cache.value 5109667

multigraph haproxyng_be_responses.bk_varnish.http4xx.web_cache
responses.value 822108

multigraph haproxyng_be_responses.bk_varnish.http4xx
total.value 822109
web_cache.value 822108

multigraph haproxyng_be_responses.bk_varnish.http5xx.web_cache
responses.value 9

multigraph haproxyng_be_responses.bk_varnish.http5xx
total.value 467
web_cache.value 9

multigraph haproxyng_be_responses.bk_varnish.httpxxx.web_cache
responses.value 0

multigraph haproxyng_be_responses.bk_varnish.httpxxx
total.value 399
web_cache.value 0

multigraph haproxyng_be_responses.bk_varnish.total.web_cache
responses.value 30580293

multigraph haproxyng_be_responses.bk_varnish.total
total.value 30581149
web_cache.value 30580293

multigraph haproxyng_be_responses
bk_apache.value 0
bk_varnish.value 30581149

Thanks for your help!

sumpfralle commented 5 years ago

Thank you for providing the output!

Which munin version do you use at the moment? Did it work with an older version?

sumpfralle commented 5 years ago

@lord-syrius: I just noticed that you mentioned your munin version in the bug title. Since this version (2.0.33) is already quite old (2.5 years; 18 releases have been published since then), I would recommend verifying that the issue still exists in a more recent release.

lord-syrius commented 5 years ago

Hello @sumpfralle,

Thanks for your reply. I reported the issue as experienced on v. 2.0.33, but we see the same behaviour with the same graphs on a newer release such as 2.0.49, so the issue persists. Can you please check on your end?

Thanks!

sumpfralle commented 5 years ago

> Can you please check on your end?

Yes, I can do this, but I lack a host running haproxy. I could work around this if you could send me the following two files:

munin-run haproxyng config >haproxy.config
munin-run haproxyng config >haproxy.values

lord-syrius commented 5 years ago

Yes, sure @sumpfralle! Find attached here the requested output: haproxy.config.txt, haproxy.values.txt

Thanks for your help.

EvilWrangler commented 5 years ago

Observation:
Error says: undefined vname cbk_apache_queued
Output says: bk_apache_queued (no 'c')
Typographical error in the configuration?

sumpfralle commented 5 years ago

> Find attached here the requested output.

@lord-syrius: sorry - my request contained a mistake. Please re-run the following:

munin-run haproxyng >haproxy.values

Thank you!

> Typographical error in the configuration?

@EvilWrangler: indeed the c prefix is probably correct, since munin applies some name-munging before handing the data over to rrd. Thus the problem should be elsewhere.
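To rule the typo theory in or out mechanically, one could pull the reported name out of the log and look for it in the plugin output, with and without the leading c. A minimal sketch: the helper names undefined_vnames and has_field are made up for illustration, and the log line format is assumed to match the messages quoted in this thread.

```shell
#!/bin/sh

# Print every name reported as "undefined vname" in log lines read from stdin.
undefined_vnames() {
    sed -n 's/.*undefined vname \([A-Za-z0-9_]*\).*/\1/p' | sort -u
}

# Succeed if a field (given without the ".value" suffix) appears in plugin output.
# $1 = field name, $2 = file holding the output of "munin-run <plugin>"
has_field() {
    grep -q "^$1\.value " "$2"
}
```

For example, feeding the log lines from the original post to undefined_vnames should print cbk_apache_queued, and `has_field bk_apache_queued haproxy.values` succeeds when the plugin reports that field. Where the log file lives depends on the setup.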

lord-syrius commented 5 years ago

Please find attached what you requested: output haproxyng.txt

Awaiting your news. Thank you @sumpfralle @EvilWrangler!

sumpfralle commented 5 years ago

Thank you for the output file.

I used your files in the following way (as a dummy plugin):

#!/bin/sh

if [ "${1:-}" = "config" ]; then
    cat /root/haproxy.config.txt
else
    cat /root/haproxy.values.txt
fi

The above seems to work for me and produces proper graphs (with flat lines for all values).
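A further check that could be run on the two files is whether every field declared in the config output also shows up in the values output for the same multigraph. A rough sketch, assuming each field declares a `.label` in its config; list_fields is a hypothetical helper, not part of munin.

```shell
#!/bin/sh

# Print "graph field" pairs found in plugin output read from stdin.
# $1 = attribute to look for: "label" for config output, "value" for fetch output
list_fields() {
    awk -v attr="$1" '
        $1 == "multigraph" { graph = $2; next }   # remember the current graph name
        {
            n = split($1, parts, ".")
            # only lines of the form "<field>.<attr> ..." count as a field
            if (n == 2 && parts[2] == attr) print graph, parts[1]
        }
    '
}

# Example (file names as used in this thread):
#   list_fields label </root/haproxy.config.txt | sort >/tmp/config.fields
#   list_fields value </root/haproxy.values.txt | sort >/tmp/values.fields
#   comm -3 /tmp/config.fields /tmp/values.fields   # lines present in only one file
```

An empty comm output would mean both files agree on the set of fields.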

Thus I am confused now. Maybe there is some broken state in your setup?

Maybe try the following:

cp /var/lib/munin/datafile /var/lib/munin/datafile.orig
sed -i /haproxy/d /var/lib/munin/datafile

This should remove all haproxy-related states from munin's storage. Maybe it works afterwards? (in this case I would be interested in the content of the above /var/lib/munin/datafile.orig)
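The same two steps could be wrapped so that they also report how much was removed. This is only a sketch under the assumption that the datafile is a plain line-based text file; prune_datafile is a made-up name, and the function rewrites the file in place with a redirect so that its owner and mode stay as they were.

```shell
#!/bin/sh

# Back up a datafile, drop all lines matching a pattern, and report the line counts.
# $1 = datafile path, $2 = grep pattern (for example: haproxy)
prune_datafile() {
    datafile=$1
    pattern=$2
    cp -p -- "$datafile" "$datafile.orig" || return 1
    # grep -v exits with 1 when it prints nothing (every line matched); that is not an error here
    grep -v -- "$pattern" "$datafile.orig" >"$datafile"
    status=$?
    [ "$status" -le 1 ] || return "$status"
    before=$(( $(wc -l <"$datafile.orig") ))
    after=$(( $(wc -l <"$datafile") ))
    echo "removed $((before - after)) of $before lines"
}
```

Called as `prune_datafile /var/lib/munin/datafile haproxy`, it leaves the original content in /var/lib/munin/datafile.orig, as in the commands above.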

lord-syrius commented 5 years ago

Hi @sumpfralle,

Unfortunately, that didn't work. Graphs are still not shown, while stats are successfully sent to the server (see datafile.orig.txt, attached here).

Thanks once more for your help!

sumpfralle commented 5 years ago

Do you still see the same error messages (see your original post) in your log file?

lord-syrius commented 4 years ago

Hi @sumpfralle,

The issue is still present. Thanks for your help.

sumpfralle commented 4 years ago

@lord-syrius: I am trying to summarize the issue:

Based on the above situation I can only think of rrdtool as a potential source of problems. This is quite unlikely, but I see no other difference between my setup and yours. Which version of rrdtool do you use?

Or maybe you have other ideas?

lord-syrius commented 4 years ago

Hello @sumpfralle, thanks for summarizing the issue, that's exactly the point. We get the same issue with different rrdtool versions (Debian 9 ---> rrdtool 1.3.0, Debian 10 ---> rrdtool 1.7.1-2), so rrdtool has nothing to do with it.

Honestly, I'm unclear about the cause, but I'm not the only one. Other people are currently reporting the same issue, for example:

https://github.com/jonathanio/monitoring-munin-haproxy/issues/5

but nobody has replied so far. It would be very kind of you if you could raise the topic with him ("jonathanio").

As usual, thank you for your time and for your help.

sumpfralle commented 4 years ago

Interesting - so the problem should not be just local ...

In order to ease debugging: is it possible to retrieve the status URL remotely? Maybe you could give me access to such a live status URL? (Send me a private message, if this helps.) Otherwise I would need to set it up on my own.
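For anyone setting this up locally: HAProxy's statistics page can also be fetched as CSV by appending ;csv to the stats URI, and the current queue and session counters sit in the qcur and scur columns (whether the plugin reads exactly these columns is an assumption). A small sketch for eyeballing them; the URL in the comment is a placeholder and stats_sessions is a made-up helper name.

```shell
#!/bin/sh

# Print proxy name, server name, queued requests (qcur) and current sessions (scur)
# from HAProxy stats CSV read from stdin.
stats_sessions() {
    awk -F, '
        /^#/ || NF < 5 { next }     # skip the "# pxname,svname,..." header and blank lines
        { print $1, $2, $3, $5 }    # columns: pxname, svname, qcur, scur
    '
}

# Example with a placeholder URL (adjust host, port and stats path to the local setup):
#   curl -s "http://localhost:8080/haproxy?stats;csv" | stats_sessions
```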

lord-syrius commented 4 years ago

Hello @sumpfralle, I have asked, but unfortunately I cannot give you remote access to the URL.

Are you able to set it up on your end?

Thanks.