NagiosEnterprises / nrpe

NRPE Agent
GNU General Public License v2.0

NRPE: Unable to read output #238

Open rosri1992 opened 4 years ago

rosri1992 commented 4 years ago

Hello Team,

I'm trying to set up monitoring of a Kubernetes cluster by following this repository: https://github.com/colebrooke/kubernetes-nagios

I came across a weird issue. The Kubernetes cluster is running on a remote RHEL server. When I run the script directly on the remote server, it works:

/usr/local/nagios/libexec/check_pods.sh -k -n -w 500 -C 800
OK - pods are all OK, found 2 in state.

Running the same command through the check_nrpe plugin on the remote server itself fails:

/usr/local/nagios/libexec/check_nrpe -H localhost -c check_pod_cjoc
NRPE: Unable to read output

I have added a command definition to nrpe.cfg and restarted the NRPE agent on the remote server.

When I invoke this script from the Nagios server, I get the "NRPE: Unable to read output" error.

From the Nagios server:

/usr/local/nagios/libexec/check_nrpe -H -c check_pod_cjoc
NRPE: Unable to read output

I have tested two versions of the NRPE agent, 3.2.1 and 4.0.3, and get the same error message with both. I have not tried other versions.

Note: the nagios user has admin (sudo) rights to run these scripts on the remote server.

Nagios v4.4.5 is running on a RHEL server.

Let me know if you need more information. Can you guys please look at it. @ericloyd @sawolf

Stay home, stay safe.

Thanks, Srikanth

rosri1992 commented 4 years ago

I found this in the NRPE log file:

[1589291681] Host X.X.X.X is asking for command 'check_pod_cjoc' to be run...
[1589291681] Running command: /usr/local/nagios/libexec/check_pods.sh -k config -n cloudbees-core -w 500 -C 800
[1589291681] WARNING: my_system() seteuid(0): Operation not permitted
[1589291681] Command completed with return code 3 and output:
[1589291681] Return Code: 3, Output: NRPE: Unable to read output
[1589291681] Connection from X.X.X.X closed.

Let me know if you need more information

rosri1992 commented 4 years ago

The issue I mentioned above occurs only when the check is run from the Nagios server.

The command works fine on the remote server and returns the expected output from the Kubernetes cluster.

When the Nagios server runs this script on the remote server through the check_nrpe plugin, it fails with the "NRPE: Unable to read output" error.

Note: the nagios user on the remote server has sudo permissions.

@ericloyd @sawolf Can you guys please help me here
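
One way to narrow down a check that works in a login shell but fails through check_nrpe is to rerun it with a stripped-down environment, since a daemon usually does not inherit the variables an interactive session has. The sketch below is only an illustration: run_with_clean_env is a hypothetical helper, the PATH and HOME values are assumptions rather than what the NRPE daemon actually sets, and the sh -c one-liner stands in for the real plugin.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an almost empty environment and
# report its exit code, stdout and stderr separately.
run_with_clean_env() {
    out_file=$(mktemp)
    err_file=$(mktemp)
    rc=0
    # env -i drops every inherited variable. PATH and HOME below are
    # assumptions, not the values the NRPE daemon really uses.
    env -i PATH=/usr/bin:/bin HOME=/ "$@" >"$out_file" 2>"$err_file" || rc=$?
    echo "exit code: $rc"
    echo "stdout: $(cat "$out_file")"
    echo "stderr: $(cat "$err_file")"
    rm -f "$out_file" "$err_file"
    return "$rc"
}

# Stand-in for the plugin: it reports a variable that exists only in the
# calling shell. The exported value is dropped by env -i, so the stdout
# line reads KUBECONFIG=unset.
KUBECONFIG=/tmp/example-kubeconfig
export KUBECONFIG
run_with_clean_env sh -c 'echo "KUBECONFIG=${KUBECONFIG:-unset}"'
```

On the remote server, the same helper could wrap the real plugin call, run as the nagios user, to see whether the script still prints a status line when nothing from the login environment is available.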

sawolf commented 4 years ago

The "Unable to read output" message usually occurs because the plugin didn't return any output after running. I would recommend spending some time debugging the plugin that you're using to figure out when/why it's not returning anything. I don't suspect an issue with the NRPE code at this point.

If you need further assistance, I would recommend using the Community Support Forums, or one of the customer forums if you're paying for support.
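
The "no output" case described above can be reproduced without NRPE at all. The sketch below assumes that the caller only reads the plugin's standard output, which is how this error is commonly explained; fake_plugin is a hypothetical stand-in for a plugin that fails early and reports its error on stderr only.

```shell
#!/bin/sh
# Hypothetical stand-in for a plugin that fails before printing a status
# line: the error goes to stderr only and the exit code is UNKNOWN (3).
fake_plugin() {
    echo "error: cannot read kubeconfig" >&2
    return 3
}

# Capture stdout only, roughly what a caller reading stdout would see.
rc=0
stdout_only=$(fake_plugin 2>/dev/null) || rc=$?
echo "exit code: $rc, stdout: '${stdout_only}'"

# Merge stderr into stdout so the underlying error becomes visible.
merged=$(fake_plugin 2>&1) || true
echo "merged output: '${merged}'"
```

Appending 2>&1 to the plugin command while debugging is a common way to surface the real error message in place of "Unable to read output".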

rosri1992 commented 4 years ago

Hi,

@sawolf Could you please check the findings below?

Until yesterday, I was passing the parameters directly in the command definition in nrpe.cfg on the remote server:

command[check_pod_cjoc]=/usr/local/nagios/libexec/check_pods.sh -k /.kube/config -n cloudbees-core

When I ran the check from the Nagios server, I got the "NRPE: Unable to read output" error:

[nagios@nagioserver nagios]$ /usr/local/nagios/libexec/check_nrpe -H Remote-Server -c check_pod_cjoc
NRPE: Unable to read output

Today I tried something different: passing the parameters as arguments in the nrpe.cfg file on the remote server:

command[check_pod]=/usr/local/nagios/libexec/check_test.sh -k $ARG1$ -n $ARG2$

I restarted the NRPE service on the remote server.

When I run the check from the Nagios server, I now get a different error:

[nagios@nagios-host ~]$ /usr/local/nagios/libexec/check_nrpe -H remote-host -c check_pod -a /home/ec2-user/.kube/config cloudbees-core
NRPE: Command 'check_pod!/home/ec2-user/.kube/config!cloudbees-core' not defined

Note: check_test.sh and check_pods.sh are the same script, copied from the repository https://github.com/colebrooke/kubernetes-nagios, where it is named check_kube_pods.sh.

I have tried every workaround I could think of to get that command working, but I don't know what is wrong. I couldn't figure out why Nagios is unable to capture the output whenever check_pods.sh runs on the remote server.

Here is the expected output of the script when I execute it locally on the remote server:

[nagios@remote-server ~]$ /usr/local/nagios/libexec/check_test.sh -k /h/.kube/config -n cloudbees-core -v
OK - pods are all OK, found 3 in ready state.
OK: Pod: ny-master-0 PodScheduled: True
OK: Pod: ny-master-0 ContainersReady: True
OK: Pod: ny-master-0 Ready: True
OK: Pod: ny-master-0 Initialized: True
OK: Pod: heist-master-2-0 PodScheduled: True
OK: Pod: heist-master-2-0 ContainersReady: True
OK: Pod: heist-master-2-0 Ready: True
OK: Pod: heist-master-2-0 Initialized: True
OK: Pod: cjoc-0 PodScheduled: True
OK: Pod: cjoc-0 ContainersReady: True
OK: Pod: cjoc-0 Ready: True
OK: Pod: cjoc-0 Initialized: True
[nagios@remote-server ~]$

Let me know if anyone needs more information.

Thanks, Srikanth

rosri1992 commented 4 years ago

Hi,

Thanks for getting back to me so quickly. @sawolf I have already raised a support request in the Nagios community forums.

Please check it at this link: https://support.nagios.com/forum/viewtopic.php?f=7&t=58573

Let me know if you need more information.

Thanks, Srikanth

rosri1992 commented 4 years ago

Hi

@sawolf I have also raised this issue with the plugin developer. Here is the issue I created in that repository:

https://github.com/colebrooke/kubernetes-nagios/issues/10

Let me know if you need more information

Thanks, Srikanth

sawolf commented 4 years ago

Sorry for the lack of response, this got buried in my other notifications.

When you set dont_blame_nrpe to 1, did you also remember to recompile with ./configure --enable-command-args? I ask because it looks like your last command wasn't broken up properly.
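
The two requirements mentioned above are separate: the daemon has to be built with --enable-command-args, and nrpe.cfg has to set dont_blame_nrpe to 1. Only the second can be checked from the config file. The sketch below shows one way to do that; args_enabled_in_cfg is a hypothetical helper, and the demo runs against a throwaway file rather than a real nrpe.cfg.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given config file contains an
# uncommented dont_blame_nrpe=1 line (optional whitespace is tolerated).
args_enabled_in_cfg() {
    grep -Eq '^[[:space:]]*dont_blame_nrpe[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Demo against a temporary file so nothing on the system is touched.
cfg=$(mktemp)
printf '%s\n' '# sample config' 'dont_blame_nrpe=1' > "$cfg"
if args_enabled_in_cfg "$cfg"; then
    echo "command arguments allowed by config"
else
    echo "command arguments disabled in config"
fi
rm -f "$cfg"
```

A passing check here says nothing about how the running binary was compiled, so a daemon built without --enable-command-args would still not handle arguments.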

rosri1992 commented 4 years ago

Hi @sawolf

I have recompiled NRPE with ./configure --with-nrpe-user=nagios --with-nrpe-group=nagios --enable-command-args and restarted NRPE on the remote server. When I trigger a check for a command that is defined in nrpe.cfg with arguments, I still get the "not defined" error from the Nagios server.

Command defined in the nrpe.cfg file on the remote server:

command[check_pod_test]=/usr/local/nagios/libexec/check_test.sh -k $ARG1$ -n $ARG2$

And when we trigger the command from the Nagios host:

/usr/local/nagios/libexec/check_nrpe -H remote-host -c check_pod_test -a -k /home/user/.kube/config -n cloudbees-core
NRPE: Command 'check_pod_test!-k!/home/ec2-user/.kube/config!-n!cloudbees-core' not defined

Let me know if you need more information.
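
Separately from the "not defined" error, it may help to spell out how the values after -a map onto the command definition. Once command arguments are working, the first value replaces $ARG1$, the second replaces $ARG2$, and so on, so repeating -k and -n on the check_nrpe side would duplicate flags that the definition already contains. The sketch below imitates that substitution on the '!'-joined request string seen in the error message; expand_nrpe_command is a hypothetical helper written for illustration, not NRPE's own code.

```shell
#!/bin/sh
# Hypothetical helper: substitute $ARG1$, $ARG2$, ... in a command template
# with the '!'-separated values that follow the command name in a request.
expand_nrpe_command() {
    awk -v template="$1" -v request="$2" 'BEGIN {
        n = split(request, part, "!")      # part[1] is the command name
        for (i = 2; i <= n; i++) {
            macro = "$ARG" (i - 1) "$"
            out = ""
            rest = template
            # Replace every occurrence of this macro, scanning left to right.
            while ((pos = index(rest, macro)) > 0) {
                out = out substr(rest, 1, pos - 1) part[i]
                rest = substr(rest, pos + length(macro))
            }
            template = out rest
        }
        print template
    }'
}

definition='/usr/local/nagios/libexec/check_test.sh -k $ARG1$ -n $ARG2$'

# Flags repeated after -a: $ARG1$ becomes "-k" and $ARG2$ becomes the path.
expand_nrpe_command "$definition" 'check_pod_test!-k!/home/ec2-user/.kube/config!-n!cloudbees-core'

# Values only: each macro receives the value the definition expects.
expand_nrpe_command "$definition" 'check_pod_test!/home/ec2-user/.kube/config!cloudbees-core'
```

Under this reading, the invocation that matches the definition passes only the two values after -a, for example -a /home/ec2-user/.kube/config cloudbees-core.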