Open amcmorris-piksel opened 4 years ago
Hi @amcmorris-piksel
Are you setting up a new ping check or is this the default ping check on the icinga server?
@jjethwa This was a new ping check, code below:
object Host "NAME" {
  address = "FQDN"
  check_command = "hostalive"
}
Nothing complex, so I wonder if I'm doing something silly. The command below works okay from the root account, but when I tried it from the nagios account I got the ^ error.
Plugin Output
/bin/ping -4 -n -U -w 30 -c 5 FQDN
CRITICAL - Could not interpret output from ping command
Does the default icinga2 server hostalive check work?
The URL is http://
That uses the hostalive check_command as well
Yes, unfortunately I'm also getting the error on that, with the following output: :(
/bin/ping -4 -n -U -w 30 -c 5 127.0.0.1
CRITICAL - Could not interpret output from ping command
Unsure what is going on, any idea of next steps?
A bit more info: on the same Docker host I ran a different test:
docker run -p 8080:80 -h icinga2 -t jordan/icinga2:latest
It looks like I'm getting the same output as above, and I'm also getting:
Check execution Reachable | no
Happy to provide or try anything needed.
Thanks for the details, @amcmorris-piksel. I pulled latest but unfortunately don't see the same issue. It looks like the ping check is configured to use /usr/lib/nagios/plugins/check_ping
The full command is:
'/usr/lib/nagios/plugins/check_ping' '-4' '-H' '127.0.0.1' '-c' '200,15%' '-w' '100,5%'
Thanks for that, just tried the below on a fresh image.
root@icinga2:/usr/lib/nagios/plugins# sudo -u nagios /usr/lib/nagios/plugins/check_ping '-4' '-H' '127.0.0.1' '-c' '200,15%' '-w' '100,5%'
/bin/ping -4 -n -U -w 10 -c 5 127.0.0.1
CRITICAL - Could not interpret output from ping command
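To reproduce what Icinga sees, it can help to run the plugin as the nagios user and look at its exit status, since monitoring plugins conventionally report state that way (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A minimal sketch follows; `plugin_state` and `run_plugin_as` are hypothetical helper names, not part of Icinga or this image, and the sketch assumes `sudo` is available in the container:

```shell
#!/bin/sh
# Map a plugin exit status to its conventional state name.
# Anything outside 0-2 is reported here as UNKNOWN.
plugin_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Hypothetical helper: run a plugin as another user, then print its
# output followed by the state derived from the exit status.
run_plugin_as() {
    user=$1
    shift
    output=$(sudo -u "$user" "$@" 2>&1)
    status=$?
    printf '%s\n' "$output"
    printf 'state: %s (exit %s)\n' "$(plugin_state "$status")" "$status"
}

# Example (not executed here):
# run_plugin_as nagios /usr/lib/nagios/plugins/check_ping -4 -H 127.0.0.1 -c '200,15%' -w '100,5%'
```

If the same command returns OK as root but CRITICAL as nagios, the problem is more likely about privileges than about the check configuration.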
I think this is an issue with the Docker host from some searching around: https://github.com/jjethwa/icinga2/issues/52
Just not sure what the equivalent will be to get this working in Ubuntu 16.04
Ah, I had forgotten about that issue. Try adding the --privileged flag to the docker run command and see if that works
Thanks, I wish that worked. I tried:
docker run --rm --privileged --cap-add=ALL -p 8080:80 -h icinga2 -t jordan/icinga2:latest
But got:
[2020-06-17 14:29:50 +0000] warning/PluginNotificationTask: Notification command for object 'icinga2' (PID: 2297, arguments: '/etc/icinga2/scripts/mail-host-notification.sh' '-4' '127.0.0.1' '-6' '::1' '-b' '' '-c' '' '-d' '2020-06-17 14:29:50 +0000' '-l' 'icinga2' '-n' 'icinga2' '-o' '/bin/ping -4 -n -U -w 30 -c 5 127.0.0.1 CRITICAL - Could not interpret output from ping command' '-r' 'root@localhost' '-s' 'DOWN' '-t' 'PROBLEM' '-v' 'false') terminated with exit code 36, output: /etc/icinga2/scripts/mail-host-notification.sh: 148: [: false: unexpected operator mail: cannot send message: Process exited with a non-zero status
Just does not like this version of docker it looks like. :(
So bizarre. Maybe you can try running it on one of the container Linux distros like Flatcar?
Going to move the PoC to AWS rather than use our on premises Docker Hosts, thanks for the help.
I had this happen to me as well with CentOS. One of the symptoms was that the ping processes were not being terminated properly and ended up as zombie processes. This would go on until eventually there were no resources available.
I never solved it but hope this information helps.
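A quick way to check for the zombie symptom described above is to count defunct ping processes. This is only a small sketch; `count_zombies` is a hypothetical helper that reads two-column `ps` output (state, then command name) on stdin:

```shell
#!/bin/sh
# Count processes in state Z (zombie) whose command name matches $1.
# Expects lines of the form "<stat> <comm> ..." on stdin.
count_zombies() {
    awk -v name="$1" '$1 ~ /^Z/ && $2 == name { n++ } END { print n + 0 }'
}

# Example (not executed here):
# ps -e -o stat= -o comm= | count_zombies ping
```

A number that keeps growing between check intervals would match the behaviour described in this comment.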
Thanks for the tip, @adamparker
Would you be able to test out adding a timeout to your ping config to see if that gets rid of the zombies?
We're having the same issue on Ubuntu 20.04 with no internet access. We have the exact same setup in a Vagrant box, which works (even without internet access).
It seems to be a rights issue (still not sure why it works on some machines and not on others):
root@icinga2:/# usermod nagios --shell /bin/bash
root@icinga2:/# su - nagios
nagios@icinga2:~$ /bin/ping 127.0.0.1
ping: socket: Operation not permitted
nagios@icinga2:~$ logout
root@icinga2:/# /bin/ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.034 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.027 ms
^C
--- 127.0.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 28ms
rtt min/avg/max/mdev = 0.027/0.030/0.034/0.006 ms
Looking online for a solution gave me the following:
chmod u+s /bin/ping
But this doesn't seem to work:
root@icinga2:/# chmod u+s /bin/ping
root@icinga2:/# su - nagios
nagios@icinga2:~$ /bin/ping 127.0.0.1
ping: socket: Operation not permitted
Someone suggested changing the language of the system, but it's already set to nothing.
Looking at the rights on both the server and in the vagrant:
vagrant: 543757 -rwsr-sr-x 1 root root 69368 Jan 13 2020 ping
server: 8357416 -rwsr-sr-x 1 root root 69368 Jan 13 2020 ping
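Since the setuid bit looks identical on both machines, a few other things that commonly decide whether an unprivileged user can open a ping socket may be worth comparing: file capabilities on the binary, a `nosuid` mount option, and the `net.ipv4.ping_group_range` sysctl. The following is a rough, read-only sketch and not a confirmed diagnosis for this setup. All function names are hypothetical; it assumes GNU `stat`, and skips `getcap` and `findmnt` when they are not installed:

```shell
#!/bin/sh
# Return 0 if a symbolic mode string (e.g. from `stat -c %A`) has the setuid bit.
mode_has_setuid() {
    case "$1" in
        ???[sS]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 if GID $1 lies inside the "low high" range given in $2
# (the format of /proc/sys/net/ipv4/ping_group_range).
gid_in_ping_range() {
    gid=$1
    set -- $2    # split "low high" on whitespace into $1 and $2
    [ "$gid" -ge "$1" ] && [ "$gid" -le "$2" ]
}

# Print what is known about ping permissions; changes nothing.
report_ping_permissions() {
    ping_bin=${1:-/bin/ping}
    user=${2:-nagios}

    mode=$(stat -c %A "$ping_bin")
    if mode_has_setuid "$mode"; then
        echo "setuid bit: set ($mode)"
    else
        echo "setuid bit: not set ($mode)"
    fi

    if command -v getcap >/dev/null 2>&1; then
        echo "file capabilities: $(getcap "$ping_bin")"
    fi

    # A nosuid mount makes the kernel ignore the setuid bit.
    if command -v findmnt >/dev/null 2>&1; then
        echo "mount options: $(findmnt -n -o OPTIONS -T "$ping_bin")"
    fi

    range=$(cat /proc/sys/net/ipv4/ping_group_range)
    if gid_in_ping_range "$(id -g "$user")" "$range"; then
        echo "ping_group_range ($range): covers primary group of $user"
    else
        echo "ping_group_range ($range): does not cover primary group of $user"
    fi
}

# Example (not executed here):
# report_ping_permissions /bin/ping nagios
```

Running it inside the container on both the working and the failing machine and diffing the output might narrow down where they differ.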
I've also looked into the docker versions:
Vagrant:
Client: Docker Engine - Community
Version: 20.10.2
API version: 1.41
Go version: go1.13.15
Git commit: 2291f61
Built: Mon Dec 28 16:17:43 2020
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.2
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: 8891c58
Built: Mon Dec 28 16:15:19 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.19.0
GitCommit: de40ad0
and the server:
Client: Docker Engine - Community
Version: 20.10.1
API version: 1.41
Go version: go1.13.15
Git commit: 831ebea
Built: Tue Dec 15 04:34:58 2020
OS/Arch: linux/amd64
Context: default
Experimental: true
Server: Docker Engine - Community
Engine:
Version: 20.10.1
API version: 1.41 (minimum version 1.12)
Go version: go1.13.15
Git commit: f001486
Built: Tue Dec 15 04:32:52 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.4.3
GitCommit: 269548fa27e0089a8b8278fc4fc781d7f65a939b
runc:
Version: 1.0.0-rc92
GitCommit: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Hi @Thixx
Thanks for all the details, I have not been able to track this down myself. I believe that it is coming down to how the host is handling the socket request. So far I have not run into the issue when using Flatcar as it's the main distro I use for docker containers.
Hi,
I switched check_ping with check_icmp which has resolved the issue for me.
Check_ping also gave me trouble with Zombie processes which is described here https://community.icinga.com/t/defunct-zombie-ping-processes-when-using-check-ping-on/7012
That's great news, thanks for the update @adamparker 😃
I wish that would work for me, but most of the check commands (check_icmp included) can't be used because of the same issue. Also, @jjethwa, I can't just switch to another OS; I'm kind of stuck with Ubuntu for now. I'm still looking into it.
Thanks for the update, @Thixx. I haven't had time to research more, but I still feel that we need to focus on the host. It could be a tweak to the Docker daemon or an OS security setting.
Yeah, I think you're right! I've seen related issues on SUSE and CentOS that were solved down the road. I've found out that SELinux isn't the problem and that I can't add capabilities to the container... or at least it looks like it 'forgets' them.
This issue is older, but still open.
Just installed Icinga2 in an Ubuntu 20.04 LTS LXC (Proxmox) and ran into the same issue.
I finally found out that check_ping calls /bin/ping and that the user nagios used by Icinga2 could not execute the ping command.
nagios@monitor:/usr/lib/nagios/plugins$ /bin/ping 127.0.0.1
/bin/ping: socket: Operation not permitted
I found a suggestion in a different thread to execute
setcap cap_net_raw+p /bin/ping
and after this command, the problem was solved.
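To confirm that the capability was actually applied, and to avoid re-running `setcap` needlessly, something like the following sketch could be used. `has_cap_net_raw` and `ensure_ping_cap` are hypothetical helper names. The match is deliberately loose because the `getcap` output format differs between libcap versions, and `setcap` needs to be run as root:

```shell
#!/bin/sh
# Return 0 if a line of `getcap` output mentions cap_net_raw.
has_cap_net_raw() {
    case "$1" in
        *cap_net_raw*) return 0 ;;
        *) return 1 ;;
    esac
}

# Grant cap_net_raw to the ping binary only if it is missing,
# then print the resulting capabilities for verification.
ensure_ping_cap() {
    ping_bin=${1:-/bin/ping}
    if has_cap_net_raw "$(getcap "$ping_bin" 2>/dev/null)"; then
        echo "$ping_bin already has cap_net_raw"
    else
        setcap cap_net_raw+p "$ping_bin" && getcap "$ping_bin"
    fi
}

# Example (not executed here, run as root):
# ensure_ping_cap /bin/ping
```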
Hi @AlphaDE
Thanks so much for the details! Adding it to the Dockerfile 😄
FYI, I've run into a similar problem. (I don't use your Dockerfile.)
Many distros have removed both the s-bit and the capabilities from the ping executable, sometimes relying on other methods to grant users access.
Also, container systems (docker, podman, etc.) play a role by removing capabilities from the container as a whole.
Here's what I had to do in my Dockerfile:
RUN setcap 'cap_net_raw+ep' /usr/bin/ping
and run the container with
podman run --network slirp4netns:allow_host_loopback=true --cap-add=cap_net_raw ...
(from standard user, not root)
Hope it helps.
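Because the container runtime can drop capabilities for the whole container, a file capability on ping only helps if `CAP_NET_RAW` (capability number 13 on Linux) is still in the container's bounding set. A small sketch for checking that from inside the container, reading `CapBnd` from `/proc/self/status`; the function names are hypothetical:

```shell
#!/bin/sh
# Return 0 if bit 13 (CAP_NET_RAW) is set in a hex capability mask,
# given without the 0x prefix as it appears in /proc/<pid>/status.
capmask_has_net_raw() {
    [ $(( (0x$1 >> 13) & 1 )) -eq 1 ]
}

# Check the bounding set of the current process.
bounding_set_has_net_raw() {
    mask=$(awk '/^CapBnd:/ { print $2 }' /proc/self/status)
    capmask_has_net_raw "$mask"
}

# Example (not executed here):
# if bounding_set_has_net_raw; then
#     echo "CAP_NET_RAW is available to this container"
# else
#     echo "CAP_NET_RAW was dropped; setcap on ping will not be enough"
# fi
```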
Thanks for the tip @TheMule71 😃
I'm setting up a new installation and having issues with ping.
I am getting the following message in the console: CRITICAL - Could not interpret output from ping command
When I run it from the command line as root it works, but if I try as nagios I get this error: ping: socket: Operation not permitted
Has anyone else seen this before? I am fairly new to Icinga, so I'm still finding my feet with it.
A.