Open thatsk opened 5 years ago
Not sure what is wrong, @timdaman.
Any help would be appreciated. I am running python3 inside a Docker container, and the container is treated as a binary that performs the health check.
Hi @thatsk
The common cause of the error message "NRPE: Unable to read output" is a wrong plugin path in your nrpe.cfg, but it could also be an interpreter issue.
I for example had to change the shebang line at the very top of the plugin from
#!/usr/bin/env python3
to
#!/usr/bin/python3.6
for the plugin to work.
You could also run ls -lah /usr/bin/* | grep python to see which Python binaries are installed and amend the shebang in the check_docker file accordingly.
Have a look at this KB entry from nagios: https://support.nagios.com/kb/article/nrpe-nrpe-unable-to-read-output-620.html
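One rough way to test the interpreter theory above is to read the plugin's shebang line and see whether that interpreter actually resolves on the monitored host. The sketch below does this in Python. The helper name shebang_interpreter is hypothetical (it is not part of check_docker or NRPE), and it only handles the two shebang shapes mentioned in this thread: a direct path and /usr/bin/env NAME.

```python
import os
import shlex
import shutil


def shebang_interpreter(script_path):
    """Return (interpreter, resolved_path_or_None) for a script's shebang.

    Hypothetical helper for illustration only. Returns (None, None) when
    the file does not start with '#!'.
    """
    with open(script_path, "rb") as fh:
        first_line = fh.readline().decode("utf-8", errors="replace").strip()
    if not first_line.startswith("#!"):
        return None, None
    parts = shlex.split(first_line[2:])
    if not parts:
        return None, None
    # '#!/usr/bin/env python3': look the name up on PATH, as env would.
    if os.path.basename(parts[0]) == "env" and len(parts) > 1:
        name = parts[1]
        return name, shutil.which(name)
    # '#!/usr/bin/python3.6': the path itself must exist and be executable.
    path = parts[0]
    usable = os.path.isfile(path) and os.access(path, os.X_OK)
    return path, (path if usable else None)
```

A second element of None means the interpreter could not be found. Because the env form depends on PATH, the result is most meaningful when the check runs as the same user, and with the same environment, that NRPE uses.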
Sorry, I was busy with... life. @HigH-HawK, thanks for the observation. That sounds plausible.
Looking at the command output above I assume the one below is the NRPE command installed on the host being monitored.
clientnode:#command[check_docker]=sudo docker run --rm -v /var/run/docker.sock:/var/run checkdocker --cpu $ARG1$:$ARG2$
I am guessing you installed check_docker in an image called checkdocker and that you set the entrypoint to check_docker.
I recommend manually running your Docker image with no arguments to confirm it works:
sudo docker run --rm -v /var/run/docker.sock:/var/run checkdocker
If you get the help text, then you are likely looking at a configuration issue in NRPE. If you get some other output (or none at all), then I would look at your entrypoint and confirm it looks good.
Please feel free to report back what you see in those tests and I will try to help. Also, sending me the Dockerfile for your checkdocker image would help me recreate your environment.
Hey @timdaman
I didn't have any issues, just tried helping the other user :)
Hello @timdaman, I'm trying to integrate Nagios and Docker via your check_docker script and I'm seeing this error: "NRPE: Unable to read output". If I run check_docker from the command line it works fine, but it seems to report the wrong return code, and that's why Nagios can't handle it properly. From my syslog:
Dec 2 14:46:53 plat_doc nrpe[20573]: Host 192.168.1.133 is asking for command 'check_docker' to be run...
Dec 2 14:46:53 plat_doc nrpe[20573]: Running command: /usr/local/bin/check_docker --connection /var/run/docker.sock --health
Dec 2 14:46:53 plat_doc nrpe[20573]: Command completed with return code 1 and output:
Dec 2 14:46:53 plat_doc nrpe[20573]: Return Code: 3, Output: NRPE: Unable to read output
Hi @Aeris126
Could you please check the file permissions of the check_docker file? Since you said the command works when you run it on the machine itself, I would imagine that the nrpe or nagios user has no permission to run the command when it is requested remotely.
@HigH-HawK
-rwxr-xr-x 1 root root 228 Nov 29 17:14 /usr/local/bin/check_docker
It is weird to me that one line says return code 1 and another says 3.
The permissions look OK. As for the return codes, the first one is the return code of the check_docker command itself, and the second one is the follow-up from NRPE / Nagios because the command already returned 1.
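To make the two return codes concrete: Nagios plugins conventionally exit with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN), and the syslog excerpt above shows the plugin exiting with 1 while the "output:" field is empty. The sketch below runs a command and mimics, under that reading, how an empty output ends up reported as UNKNOWN with "NRPE: Unable to read output". It is an approximation for debugging, not NRPE's actual code, and run_like_nrpe is a hypothetical name.

```python
import subprocess

# Conventional Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_like_nrpe(argv, timeout=60):
    """Run a plugin and return (plugin_rc, reported_rc, reported_output, stderr).

    Assumption for illustration: when the plugin prints nothing on stdout,
    the result is reported as UNKNOWN (3) with a fixed message, whatever
    the plugin's own exit code was.
    """
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        timeout=timeout,
    )
    output = proc.stdout.strip()
    stderr = proc.stderr.strip()
    if not output:
        # Nothing on stdout: the plugin's own code (e.g. 1) is masked.
        return proc.returncode, 3, "NRPE: Unable to read output", stderr
    return proc.returncode, proc.returncode, output, stderr
```

Calling it with the same argument list NRPE logs under "Running command", as the user NRPE runs as, should show whether the plugin printed anything on stdout. The returned stderr text is often where the underlying error message ends up.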
I was getting the same error. In nrpe.cfg I gave the full path to check_docker:
command[check_docker]=/usr/local/bin/check_docker --containers --status running
It worked for me.
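A quick way to audit a config for the same problem is to scan it for command definitions whose executable is not given as an absolute path. The sketch below is one possible approach in Python. It assumes the command[name]=command line syntax shown in this thread, and the function name commands_without_absolute_path is hypothetical. It only inspects the first word of each command line, so a wrapper such as sudo is flagged too.

```python
import re

# Matches lines such as: command[check_docker]=/usr/local/bin/check_docker --health
COMMAND_RE = re.compile(r"^\s*command\[(?P<name>[^\]]+)\]\s*=\s*(?P<cmd>.+?)\s*$")


def commands_without_absolute_path(cfg_text):
    """Return [(command_name, first_word)] for commands not starting with '/'."""
    flagged = []
    for line in cfg_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip commented-out definitions
        match = COMMAND_RE.match(line)
        if not match:
            continue
        first_word = match.group("cmd").split()[0]
        if not first_word.startswith("/"):
            flagged.append((match.group("name"), first_word))
    return flagged
```

A flagged entry is not necessarily broken, but the NRPE daemon's PATH may differ from an interactive shell's, so a bare check_docker that works at the prompt may not be found when NRPE runs it.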