occitech / docker

Docker images we use at Occitech

munin : Socket read from munin failed #25

Closed laf1greg closed 3 years ago

laf1greg commented 4 years ago

Hi, thanks for your work. I'm new to munin, though, and can't get it working on my local computer:

sudo docker run \
 --name=munin \
 -p 8080:80 \
 -e THISNODENAME="myservice" \
 -e TZ="UTC" \
 -e CRONDELAY=1 \
 -v /data/munin/db:/var/lib/munin \
 -v /data/munin/logs:/var/log/munin \
 -v /data/munin/cache:/var/cache/munin \
 munin:latest

I get this:

2019/11/27 22:38:12 [INFO] Remaining workers: munin;munin
2019/11/27 22:38:12 [INFO] Reaping Munin::Master::UpdateWorker<munin;munin>.  Exit value/signal: 18/0
2019/11/27 22:38:12 [INFO] No old data available for failed worker munin;munin.  This node will disappear from the html web page hierarchy
2019/11/27 22:38:12 [INFO]: Munin-update finished (10.00 sec)
2019/11/27 22:40:01 [INFO]: Starting munin-update
2019/11/27 22:40:01 [INFO] Process 681 is dead, stealing lock, removing file
2019/11/27 22:40:01 [INFO] starting work in 691 for munin/172.17.0.1:4949.
2019/11/27 22:40:01 [FATAL] Socket read from munin failed.  Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.
2019/11/27 22:40:01 [ERROR] Munin::Master::UpdateWorker<munin;munin> died with '[FATAL] Socket read from munin failed.  Terminating process. at /usr/share/perl5/Munin/Master/UpdateWorker.pm line 254.
'
2019/11/27 22:40:11 [INFO] Remaining workers: munin;munin
2019/11/27 22:40:11 [INFO] Reaping Munin::Master::UpdateWorker<munin;munin>.  Exit value/signal: 18/0
2019/11/27 22:40:11 [INFO] No old data available for failed worker munin;munin.  This node will disappear from the html web page hierarchy
2019/11/27 22:40:11 [INFO]: Munin-update finished (10.00 sec)

No error when I do this :

su -s /bin/bash munin
/usr/share/munin/munin-update --debug --nofork --host localhost --service cpu

By the way, I had to manually create two folders in the volumes to avoid run.sh errors: /var/cache/munin/www and /var/lib/munin/cgi-tmp
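With the bind mounts from the command above, those two container paths map to host directories, so they can be created before the container starts. A minimal sketch, assuming a host-side root held in a hypothetical DATA_ROOT variable (the command above used /data/munin, which would typically need sudo); it does not address ownership inside the container:

```shell
# DATA_ROOT is a hypothetical variable, not something run.sh reads.
# Point it at the host directory used in the -v flags (e.g. /data/munin).
DATA_ROOT=${DATA_ROOT:-$PWD/data/munin}

# -v $DATA_ROOT/db:/var/lib/munin      -> /var/lib/munin/cgi-tmp
# -v $DATA_ROOT/cache:/var/cache/munin -> /var/cache/munin/www
# -v $DATA_ROOT/logs:/var/log/munin
mkdir -p "$DATA_ROOT/db/cgi-tmp" "$DATA_ROOT/cache/www" "$DATA_ROOT/logs"

ls -d "$DATA_ROOT"/*/
```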

faust64 commented 3 years ago

Same here, Munin is failing to scrape metrics from its own munin-node.

The default munin.conf includes a static IP address:

# cat munin.conf 
includedir /etc/munin/munin-conf.d

# local host
[localhost.localdomain]
    # docker gateway IP = host server
    address 172.17.0.1

While the run.sh would sed that hostname, it won't replace the IP. In my case, that IP is the one for my docker host - though it might not be yours. Running locally, I can just expose port 4949, alongside port 80.
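To see what rewriting both the node name and the address looks like, here is a rough sketch that works on a scratch copy of the config shown above rather than on /etc/munin/munin.conf. NODE_ADDRESS is a hypothetical variable that run.sh does not read, and GNU sed is assumed for -i:

```shell
# Scratch copy of the default config shown above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
includedir /etc/munin/munin-conf.d

# local host
[localhost.localdomain]
    # docker gateway IP = host server
    address 172.17.0.1
EOF

THISNODENAME="myservice"
NODE_ADDRESS="127.0.0.1"   # hypothetical: where the master should look for munin-node

# First expression renames the node section, second rewrites its address line.
sed -i -e "s/^\[localhost\.localdomain\]/[$THISNODENAME]/" \
    -e "s/^[[:space:]]*address.*/    address $NODE_ADDRESS/" \
    "$conf"

cat "$conf"
```

Pointing the address at 127.0.0.1 only makes sense when munin-node runs inside the same container as the master.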

Next, you'd have an issue with munin-node denying connections from the munin server. By default, it only allows loopback connections.
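The allow directive in munin-node.conf takes a regular expression matched against the client address. As a sketch of widening it without opening the node to everyone, the following works on a scratch file with abridged contents and assumes the master connects from the default Docker bridge range 172.17.0.0/16, which may not match your setup. GNU sed is assumed for -i; without -i, sed only prints the result and leaves the file untouched.

```shell
# Scratch file standing in for /etc/munin/munin-node.conf (contents abridged).
node_conf=$(mktemp)
cat > "$node_conf" <<'EOF'
host *
port 4949
allow ^127\.0\.0\.1$
allow ^::1$
EOF

# Drop the existing allow rules in place, then append the replacements.
sed -i '/^allow/d' "$node_conf"
cat >> "$node_conf" <<'EOF'
allow ^127\.0\.0\.1$
allow ^172\.17\.\d+\.\d+$
EOF

cat "$node_conf"
```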

And as you pointed out, there's something wrong with the volumes. I could get it to start by removing the /var/lib/munin and /var/cache/munin ones. I'm still seeing errors about /var/log/munin:

# cat /var/lib/munin/dead.letter 
Can't open /var/log/munin/munin-limits.log (Permission denied) at /usr/share/perl5/Log/Log4perl/Appender/File.pm line 151.
Can't open /var/log/munin/munin-html.log (Permission denied) at /usr/share/perl5/Log/Log4perl/Appender/File.pm line 151.

For what it's worth, here's how to patch the run.sh, to reconfigure munin-node, munin-server, fix permission issues and track all logs:

$ git diff
diff --git a/munin/run.sh b/munin/run.sh
index 411f8b5..52b9411 100644
--- a/munin/run.sh
+++ b/munin/run.sh
@@ -13,7 +13,9 @@ sed -i "s/\*\/5/\*\/$CRONDELAY/g" /etc/cron.d/munin

 # configure default node name
 THISNODENAME=${THISNODENAME:="munin"}
-sed -i "s/^\[localhost\.localdomain\]/\[$THISNODENAME\]/g" /etc/munin/munin.conf
+sed -i -e "s/^\[localhost\.localdomain\]/\[$THISNODENAME\]/g" \
+    -e "s/^[ \t]*address.*/    address 127.0.0.1/g" \
+    /etc/munin/munin.conf

 # configure default servername
 THISSERVERNAME=${SERVERNAME:="munin"}
@@ -67,6 +69,9 @@ else
   rm /etc/munin/munin-conf.d/munin_slack.conf
 fi

+sed -i '/^allow/d' /etc/munin/munin-node.conf
+echo 'allow ^.*$' >>/etc/munin/munin-node.conf
+
 # generate node list
 NODES=${NODES:-}
 for NODE in $NODES
@@ -93,11 +98,12 @@ if [ ! -f /var/cache/munin/www/index.html ]; then
   </body>
 </html>
 EOF
-    chown -R munin: /var/cache/munin/www/index.html
 fi
+chown -R munin: /var/cache/munin/www

 # ensure munin fle have right permission

+mkdir -p /var/lib/munin/cgi-tmp
 chown -R munin:munin /var/lib/munin
 chmod -R ugo+rw /var/lib/munin/cgi-tmp

@@ -116,6 +122,9 @@ echo " $NODES"
 /usr/sbin/apache2ctl start

 # display logs
-touch /var/log/munin/munin-update.log
-chown munin:munin /var/log/munin/munin-update.log
-tail -f /var/log/munin/munin-*.log
+for f in graph update limits html
+do
+    touch /var/log/munin/munin-$f.log
+    chown munin:munin /var/log/munin/munin-$f.log
+done
+tail -f /var/log/munin/munin-*.log /var/log/apache2/*.log

Which you would start with:

docker run \
    --name=munin \
    -p 8080:80 \
    -p 4949:4949 \
    -e THISNODENAME="myservice" \
    -e TZ="UTC" \
    -e CRONDELAY=1 \
    -v `pwd`/data/db:/var/lib/munin \
    -v `pwd`/data/cache:/var/cache/munin/www \
    -v `pwd`/data/logs:/var/log/munin \
    munin