Open shallot opened 3 years ago
One more thing. It's particularly weird that this happens, but in the ~munin-async/ directory there's nothing saved about those remote host plugins, even for the good nodes.
Could it be that munin-asyncd is just doing something like `config`, realizing the hostname doesn't match, and skipping? But when `config` calls involve remote host calls, it still gets hung up on that?
Hi,
For a few years now I've been using a set of Munin plugins from the MikroTik wiki, https://auth.mikrotik.com/wiki/Munin_Monitoring
Essentially what these seem to do is define nodes with address 127.0.0.1, but then use the plugin link name to deduce the actual hostname to which to send the requests.
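To make the link-name trick concrete, here is a minimal sketch of how a plugin could pull the target host out of its own link name. The `snmp_<host>_<check>` pattern is an assumption borrowed from Munin's `snmp__*` wildcard convention, and `host_from_link` is a hypothetical helper; the MikroTik wiki plugins may well use a different naming scheme.

```python
import os
import re
import sys

# Assumed naming convention (not taken from the wiki plugins themselves):
#   snmp_<host>_<check>   e.g. snmp_router1.example.com_mikrotik
# Hostnames cannot contain underscores, so the first "_" after the host ends it.
LINK_NAME = re.compile(r"^snmp_(?P<host>[^_]+)_(?P<check>.+)$")


def host_from_link(path):
    """Return (host, check) parsed from the plugin's link name, or None."""
    match = LINK_NAME.match(os.path.basename(path))
    if match is None:
        return None
    return match.group("host"), match.group("check")


if __name__ == "__main__":
    # sys.argv[0] plays the role of $0: the name the plugin was invoked under.
    print(host_from_link(sys.argv[0]))
```

Because the node itself is declared at 127.0.0.1, the hostname recovered this way is the only thing that tells the plugin where to actually send its requests.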
This approach is practically the same as what https://gallery.munin-monitoring.org/plugins/munin-contrib/snmp__mikrotik/ does, parsing the hostname to connect to out of `$0`, the script's own filename.
This by and large works, but there's a significant problem when one of the remote nodes goes offline. First, the code typically starts out using long default timeouts of e.g. 30s. This is practically untenable with a shorter `update_rate` (I used 60s). But even after you reduce that significantly (I used 3s), the problem is still compounded by the fact that there are many separate small plugins generating individual graphs, instead of there being one big `multigraph` plugin that experiences network problems only once per session.
But I somehow made it work, and it was acceptable over a period of many years. Worst case, I had to mark the dead hosts with `update no`, and that would alleviate any issues. I don't think I ever filed this as an issue here because it seemed like just a fact of life.
However, I then introduced munin-asyncd into the picture recently, and now the problem appears to be back, but worse: even when I make munin-update stop connecting to munin-node for the dead nodes, munin-asyncd on localhost keeps trying to do it itself, and chokes.
It's worse than the original method, as the whole thing becomes so lagged that I actually get timeouts on async calls and lose data from localhost. I noticed this through the fact that the localhost munin_stats and munin_update plugins went missing from munin-html output.
I actually had to move away /etc/munin/plugins/mikrotik* links for dead hosts in order to unclog that.
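To put rough numbers on why a few dead-host plugins are enough to clog everything, here is a back-of-the-envelope sketch. It assumes the affected plugins are called one after another and that each one blocks for exactly one full timeout per run; both are simplifications, and the function names are made up for illustration.

```python
def worst_case_stall(plugins_on_dead_hosts, timeout_s):
    """Seconds spent waiting if every affected plugin blocks once for a full timeout."""
    return plugins_on_dead_hosts * timeout_s


def plugins_until_overrun(update_rate_s, timeout_s):
    """How many blocked plugins fit into one update interval before it is used up."""
    return update_rate_s // timeout_s


if __name__ == "__main__":
    update_rate = 60  # seconds, the update_rate from the post
    for timeout in (30, 3):  # long default timeout vs. the reduced one
        print(timeout, plugins_until_overrun(update_rate, timeout))
```

Under these assumptions, two blocked plugins already eat a whole 60s interval with a 30s timeout, and even at 3s it only takes twenty.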
I think the original use case works better because munin-node recognizes the distinction between just `list` and `list remote.host`, but munin-asyncd seems to keep hammering everything defined on localhost, regardless of any exceptions in the master config.
Can something be done about this?
TIA
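For anyone wanting to reproduce the `list` vs `list remote.host` distinction by hand, here is a minimal sketch that talks the plain-text munin-node protocol directly. It assumes munin-node is listening on 127.0.0.1 on the default port 4949 and that the caller is allowed to connect; `list_command` and `list_plugins` are hypothetical helper names, not part of Munin.

```python
import socket


def list_command(node=None):
    """Build the munin-node request line: plain `list`, or `list <node>`."""
    return b"list\n" if node is None else f"list {node}\n".encode("ascii")


def list_plugins(node=None, address="127.0.0.1", port=4949, timeout=3.0):
    """Ask a munin-node which plugins it serves for `node` (default: its own name)."""
    with socket.create_connection((address, port), timeout=timeout) as sock:
        with sock.makefile("rwb") as stream:
            stream.readline()  # greeting banner, e.g. "# munin node at <hostname>"
            stream.write(list_command(node))
            stream.flush()
            reply = stream.readline().decode("ascii", "replace")
            stream.write(b"quit\n")
            stream.flush()
    # The reply is a single line of space-separated plugin names.
    return reply.split()
```

Comparing `list_plugins()` with `list_plugins("remote.host")` shows which plugins the node attributes to localhost and which to the remote host, which is the split munin-asyncd appears not to honour here.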