munin-monitoring / munin

Main repository for munin master / node / plugins
http://munin-monitoring.org

reload for munin-node/munin regarding logging #1571

Open Zugschlus opened 1 year ago

Zugschlus commented 1 year ago

As the Debian maintainer of aide, the Advanced Intrusion Detection Environment, I have stumbled upon munin-node's method of logging, which seems to be simply dumping info into /var/log/munin/munin-node.log.

The munin Debian packages use logrotate to rotate those logs using the copytruncate option so that munin-node can be kept running and still handles the log correctly.
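
To illustrate why copytruncate lets the daemon keep running, here is a minimal Python sketch of the copy-then-truncate idea. It is not logrotate's or munin-node's actual code: the `copytruncate` helper and the file names are hypothetical, it assumes a POSIX system, and it assumes the writer opened its log in append mode.

```python
import os
import shutil
import tempfile


def copytruncate(live_path, rotated_path):
    """Hypothetical helper: copy the live log aside, then empty it in place."""
    shutil.copy2(live_path, rotated_path)
    os.truncate(live_path, 0)


workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "munin-node.log")  # illustrative name only
rotated = live + ".1"

# Stand-in for the daemon: it opens the log once (append mode, an assumption)
# and keeps that handle for its whole lifetime.
writer = open(live, "a")
writer.write("before rotation\n")
writer.flush()

# Rotation happens while the writer still holds its handle.
copytruncate(live, rotated)

# The same handle keeps working: the path and inode are unchanged,
# and append mode places the next write at the new end of the file.
writer.write("after rotation\n")
writer.flush()
writer.close()

with open(rotated) as f:
    rotated_text = f.read()
with open(live) as f:
    live_text = f.read()
```

The rotated copy holds the old line and the live file holds only the new one, without the writer ever reopening anything. The logrotate man page notes that a small amount of log data can be lost in the window between the copy and the truncate.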

Unfortunately, this doesn't work well with aide, since aide has no way to handle such a log properly.

Removing the copytruncate option from logrotate doesn't work, since munin-node will happily continue logging to the old file.
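
The "continues logging to the old file" behaviour follows from how open file handles work on POSIX systems: the handle refers to the file itself, not to its name. A small Python sketch, assuming POSIX rename semantics and using illustrative file names (nothing here is taken from munin-node's code):

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
live = os.path.join(workdir, "munin-node.log")  # illustrative name only
rotated = live + ".1"

# Stand-in for the daemon: opens the log once and keeps the handle.
writer = open(live, "a")
writer.write("before rotation\n")
writer.flush()

# Rotation by rename, i.e. without copytruncate.
os.rename(live, rotated)

# The handle still points at the renamed file, so this line
# lands in the rotated log, not in a fresh one.
writer.write("after rotation\n")
writer.flush()
writer.close()

live_exists = os.path.exists(live)
with open(rotated) as f:
    rotated_text = f.read()
```

Both lines end up in the renamed file and no new log appears at the original path. Only a process that reopens its log by name, for example after a restart, starts writing to a fresh file.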

It is currently necessary to use a postrotate script to restart munin-node. I do not know how much local context is lost with a daily restart.

Would it be possible to either

Greetings Marc

mackuba commented 2 months ago

Hey!

So, I'm moving from Ubuntu 22.04 LTS to 24.04 LTS now. I've installed Munin there, and I've noticed something weird - every day exactly at midnight some of my charts are missing one data point… Journal showed that the munin node is restarted every day, which didn't happen on my old system, but I couldn't figure out who does that.

I've eventually narrowed it down to logrotate, which somehow had a restart command in its munin config which wasn't there before… 🙃

I then tracked down the specific change in Ubuntu's changelogs in their bug tracker, which pointed me to a Debian mailing list, which is how I ended up on your email thread and here 😅

I'm assuming you haven't seen something like this? Any idea why it works like this and what's the best fix? (I mean I'm just going to revert the config change locally, but I wonder what I should suggest to upstream)

[Four screenshots attached: Screen Shot 2024-09-04 at 03 01 46, 03 02 25, 03 02 15, 03 02 00]

niclan commented 2 months ago

@mackuba: the logrotate call is needed because of the way munin-node currently works, as explained by @Zugschlus. So don't remove it.

Not sure why the gap appears. But when munin-node restarts, it runs all the plugins with "config". This is to map out which hostname each plugin answers for. Running config for all of them can take some time and can therefore cause munin-node to be non-responsive for a while.

mackuba commented 2 months ago

@niclan Yeah, but this happens at the exact time when Munin tries to collect data from nodes, since it runs every 5 minutes at :00, so also at 0:00… so if the node is unresponsive at that moment, it misses its data, as I understand.

The old config with copytruncate and no restart command in postrotate works fine for me on the servers running Ubuntu 22.04, so I'm just going to copy that old logrotate config from that version.