NagiosEnterprises / nrpe

NRPE Agent
GNU General Public License v2.0

Possible issue with nrpeV3 support -- random failures in nrpe? #160

Closed dlacroix17 closed 7 years ago

dlacroix17 commented 7 years ago

I am testing on RHEL6u4 -- with nrpe running under xinetd

I've built nrpe and check_nrpe based off one commit past 3.2.0 ... Git commit: 005e20ffec489bb56871911463d41ca5a465583d

I have one check, with no parameters at all, running in a loop on one box, and I see the following in my logs:

Aug 30 19:04:57 dev-xxxx nrpe[4967]: Error: Request contained command arguments!
Aug 30 19:04:57 dev-xxxx nrpe[4967]: Client request from was invalid, bailing out...
Aug 30 19:04:57 dev-xxxx check_nrpe: Remote 127.0.0.1 does not support Version 3 Packets

nrpe.cfg is this:

server_port=5665
include_dir=/usr/local/nrpe/etc/nrpe.d
command_timeout=60
nrpe_user=nagios
nrpe_group=nagios
debug=0
dont_blame_nrpe=0
allow_weak_random_seed=1
ssl_version=TLSv1.2+
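The configuration above sets dont_blame_nrpe=0, which tells the daemon not to accept command arguments from clients. That makes the "Request contained command arguments!" message surprising for a check invoked without any. A minimal Python sketch for double-checking that setting in a config file follows. The parse_nrpe_cfg helper is hypothetical and is not NRPE's own parser: it reads only plain key=value lines, does not follow include_dir, and keeps just the last value of a repeated key.

```python
def parse_nrpe_cfg(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments.

    Hypothetical helper for illustration; NRPE's real parser does more
    (for example, it follows include and include_dir directives).
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# The settings quoted in the thread.
cfg_text = """\
server_port=5665
include_dir=/usr/local/nrpe/etc/nrpe.d
command_timeout=60
nrpe_user=nagios
nrpe_group=nagios
debug=0
dont_blame_nrpe=0
allow_weak_random_seed=1
ssl_version=TLSv1.2+
"""

settings = parse_nrpe_cfg(cfg_text)

# dont_blame_nrpe=1 would allow client-supplied arguments; anything else
# is treated here as "arguments disabled".
args_allowed = settings.get("dont_blame_nrpe", "0") == "1"
```

With the file from the thread, args_allowed comes out false, so any request the daemon interprets as carrying arguments would be rejected.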

I will try the latest from maint and see whether it has the same issue. I can also try the same test on a more current OS.

hedenface commented 7 years ago

What happens if you change the command sequence up a bit? Do you get any other meaningful output?

dlacroix17 commented 7 years ago

I haven't seen any more meaningful output ... I set the debug=1 in nrpe.cfg and only got additional messages about SSL and seteuid ...

Aug 30 18:59:03 dev-xxxx nrpe[10341]: INFO: SSL/TLS initialized. All network traffic will be encrypted.
Aug 30 18:59:03 dev-xxxx nrpe[10342]: WARNING: my_system() seteuid(0): Operation not permitted
Aug 30 18:59:03 dev-xxxx nrpe[10341]: Error: Request contained command arguments!
Aug 30 18:59:03 dev-xxxx nrpe[10341]: Client request from was invalid, bailing out...
Aug 30 18:59:03 dev-xxxx check_nrpe: Remote 127.0.0.1 does not support Version 3 Packets

dlacroix17 commented 7 years ago

This is the command I'm running in a loop:

[deploy@dev-xxxx ~]$ /usr/local/nrpe/libexec/check_nrpe -H 127.0.0.1 -c log_testing_long_messages
No matches found
[deploy@dev-xxxx ~]$

dlacroix17 commented 7 years ago

I just confirmed on a RHEL6u8 box ... same issue.

I did notice however that it still seems to produce the correct output for all iterations of the loop.

$ tail -F /var/log/messages &
$ while (/usr/local/nrpe/libexec/check_nrpe -H 127.0.0.1 -c log_testing_long_messages); do echo . ; done
[skip ahead many iterations]
No matches found
.
No matches found
.
No matches found
.
Aug 30 20:08:45 host-192-168-99-32 nrpe[18867]: Error: Request contained command arguments!
Aug 30 20:08:45 host-192-168-99-32 nrpe[18867]: Client request from was invalid, bailing out...
Aug 30 20:08:45 host-192-168-99-32 check_nrpe: Remote 127.0.0.1 does not support Version 3 Packets
No matches found
.
No matches found
.
No matches found
.
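Because the check output stays correct while the errors appear only in syslog, one way to quantify the problem is to count the error events in a captured log. The Python sketch below groups nrpe and check_nrpe messages by timestamp and keeps only the timestamps where the "command arguments" error was logged. The failure_events function and the regular expression are illustrative assumptions about the classic syslog line layout seen in this thread, not part of NRPE.

```python
import re
from collections import defaultdict

# Classic syslog layout: "Aug 30 20:08:45 host prog[pid]: message".
# The [pid] part is optional because check_nrpe logs without one here.
SYSLOG_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<prog>[^\s\[:]+)(?:\[(?P<pid>\d+)\])?:\s(?P<msg>.*)$"
)


def failure_events(lines):
    """Return {timestamp: [(prog, msg), ...]} for every timestamp at which
    nrpe logged the 'Request contained command arguments' error."""
    by_ts = defaultdict(list)
    for line in lines:
        m = SYSLOG_RE.match(line.strip())
        if m and m.group("prog") in ("nrpe", "check_nrpe"):
            by_ts[m.group("ts")].append((m.group("prog"), m.group("msg")))
    return {
        ts: msgs
        for ts, msgs in by_ts.items()
        if any("Request contained command arguments" in msg for _, msg in msgs)
    }


# Mixed terminal output as it appears in the thread: check results, the
# "." printed by the loop, and syslog lines from the background tail.
sample = [
    "No matches found",
    ".",
    "Aug 30 20:08:45 host-192-168-99-32 nrpe[18867]: Error: Request contained command arguments!",
    "Aug 30 20:08:45 host-192-168-99-32 nrpe[18867]: Client request from was invalid, bailing out...",
    "Aug 30 20:08:45 host-192-168-99-32 check_nrpe: Remote 127.0.0.1 does not support Version 3 Packets",
    "No matches found",
    ".",
]

events = failure_events(sample)
```

Grouping by timestamp is a coarse choice: it ties the client-side "does not support Version 3 Packets" message to the server-side error only because both land in the same second, which is good enough for a single looping check but would not be on a busy host.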

dlacroix17 commented 7 years ago

Ok ... same behavior on RHEL6u8 ... But I can't reproduce it on CentOS 7u3 ... Will try switching to running as a daemon ...

dlacroix17 commented 7 years ago

on 6u8 ... running as a daemon (from command line) seems to not show this.

running as a daemon on 6u4 also seems to make this go away.

So, it seems xinetd hands off something to nrpe -- which causes these messages to be logged.

This happens to be in Openstack ... doubt that's a factor here though.

hedenface commented 7 years ago

Did you configure with --enable-command-args? It looks like the error messages are a bit misleading. But it's working?

Also the seteuid stuff should be fixed for inetd in one of the latest commits on maint. I'll be pushing 3.2.1 today - try that - I think it'll fix at least one of these issues.

Otherwise, I'm going to close the issue - if you still have the same problems after 3.2.1 please re-open it.

dlacroix17 commented 7 years ago

Further datapoint ...

If I run my loop with check_nrpe with -2 in the argument list, the issue doesn't happen.
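To compare the two modes side by side, the loop can be scripted so that the only difference between runs is the -2 flag. This is a rough Python sketch: the binary path, host, and command name are the ones quoted in this thread, while build_command and run_many are hypothetical helpers.

```python
import subprocess

# Path taken from the thread; adjust for your installation.
CHECK_NRPE = "/usr/local/nrpe/libexec/check_nrpe"


def build_command(host, command, force_v2=False):
    """Build the check_nrpe argument list, optionally adding -2
    (the flag reported above to make the log errors disappear)."""
    argv = [CHECK_NRPE, "-H", host, "-c", command]
    if force_v2:
        argv.insert(1, "-2")
    return argv


def run_many(argv, iterations=100):
    """Run the check repeatedly and return a list of (exit code, stdout)."""
    results = []
    for _ in range(iterations):
        proc = subprocess.run(argv, capture_output=True, text=True)
        results.append((proc.returncode, proc.stdout.strip()))
    return results


# Example usage (requires check_nrpe and a reachable nrpe service):
#   v3 = run_many(build_command("127.0.0.1", "log_testing_long_messages"))
#   v2 = run_many(build_command("127.0.0.1", "log_testing_long_messages",
#                               force_v2=True))
```

Since the check reportedly returns the correct output even when the errors are logged, the exit codes alone probably will not distinguish the two runs; the useful comparison is the syslog on the server during each run.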

Skywalker-11 commented 6 years ago

I got the same error with WARNING: my_system() seteuid(0): Operation not permitted for nagios-nrpe-server 3.2.1-1ubuntu1 (Ubuntu 18.04). It happened when a command used only the name of the check script (assuming it was in $PATH). Using the absolute path to the check script solved this for me. Not sure if it is related though. The command didn't take any arguments and none were supplied.
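A bare script name resolves only if the daemon's own PATH contains its directory, and that PATH may differ from an interactive shell's. The Python sketch below shows one way to check how a command line would resolve against a given search path. The resolve function and the example paths are hypothetical, and this does not claim to reproduce how NRPE itself launches commands.

```python
import os
import shlex
import shutil


def resolve(command_line, search_path):
    """Return the executable a command line would resolve to.

    An absolute executable path is returned unchanged, without checking
    that the file exists. A bare name is looked up in search_path, a
    PATH-style string; None is returned if it is not found there.
    """
    exe = shlex.split(command_line)[0]
    if os.path.isabs(exe):
        return exe
    return shutil.which(exe, path=search_path)


# Hypothetical command definitions for illustration.
absolute = resolve("/usr/lib/nagios/plugins/check_example -w 5", "")
bare = resolve("check_example_that_is_not_installed", "")
```

Here absolute is returned as written, while bare is None because the empty search path contains no directories. Passing the PATH the service actually runs with would show whether a bare name can be found at all.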