hariwe closed this issue 4 years ago
Thanks for reporting this.
I haven't been able to reproduce this issue so far. Running under gdb with detach-on-fork off, follow-fork-mode child, and breakpoints set for SSL_free and abort, I do encounter SSL_free but never abort.
Does the issue occur if you run systemd's command (/opt/nagios/plugins/bin/nrpe -c /opt/nagios/plugins/etc/nrpe.cfg -f) in your terminal?
Do you get any log messages when trying to run commands?
~What distribution are you running?~ I'm running CentOS 7 on this machine, so I wouldn't expect any different behavior there.
Hi,
I cannot reproduce this when running nrpe directly on the command line (as user nagios), so it seems related to systemd. My service file:
[Service]
Type=simple
Restart=on-abort
ExecStart=/opt/nagios/plugins/bin/nrpe -c /opt/nagios/plugins/etc/nrpe.cfg -f
ExecReload=/bin/kill -HUP $MAINPID
User=nagios
Group=nagios
I was finally able to reproduce the issue in Valgrind using the latest master branch.
The Valgrind output is as follows:
==8227== Invalid write of size 1
==8227==    at 0x4C2D0F3: strcpy (vg_replace_strmem.c:513)
==8227==    by 0x40758F: handle_connection (nrpe.c:1927)
==8227==    by 0x40668A: wait_for_connections (nrpe.c:1441)
==8227==    by 0x4047FC: run_src (nrpe.c:642)
==8227==    by 0x403CF5: main (nrpe.c:224)
==8227==  Address 0x75cf438 is 0 bytes after a block of size 88 alloc'd
==8227==    at 0x4C2BF79: calloc (vg_replace_malloc.c:762)
==8227==    by 0x4074FC: handle_connection (nrpe.c:1919)
==8227==    by 0x40668A: wait_for_connections (nrpe.c:1441)
==8227==    by 0x4047FC: run_src (nrpe.c:642)
==8227==    by 0x403CF5: main (nrpe.c:224)
This patch fixes the issue:
--- nrpe-4.0.0/src/nrpe.c 2020-01-15 17:01:48.000000000 +0100
+++ nrpe-4.0.0.patched/src/nrpe.c 2020-02-27 13:59:56.562148344 +0100
@@ -1912,9 +1912,9 @@
} else {
- pkt_size = (sizeof(v3_packet) - NRPE_V4_PACKET_SIZE_OFFSET) + strlen(send_buff);
+ pkt_size = (sizeof(v3_packet) - NRPE_V4_PACKET_SIZE_OFFSET) + strlen(send_buff) + 1;
if (packet_ver == NRPE_PACKET_VERSION_3) {
- pkt_size = (sizeof(v3_packet) - NRPE_V3_PACKET_SIZE_OFFSET) + strlen(send_buff);
+ pkt_size = (sizeof(v3_packet) - NRPE_V3_PACKET_SIZE_OFFSET) + strlen(send_buff) + 1;
}
v3_send_packet = calloc(1, pkt_size);
send_pkt = (char *)v3_send_packet;
@@ -1923,7 +1923,7 @@
v3_send_packet->packet_type = htons(RESPONSE_PACKET);
v3_send_packet->result_code = htons(result);
v3_send_packet->alignment = 0;
- v3_send_packet->buffer_length = htonl(strlen(send_buff));
+ v3_send_packet->buffer_length = htonl(strlen(send_buff) + 1);
strcpy(&v3_send_packet->buffer[0], send_buff);
/* calculate the crc 32 value of the packet */
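To illustrate why the extra byte matters, here is a minimal standalone sketch of the same pattern. It is not the nrpe code: demo_packet, demo_packet_size and demo_packet_new are hypothetical names made up for this example. strlen() does not count the terminating NUL, so an allocation sized with strlen() alone is one byte too small for the copied string, which matches the "Invalid write of size 1 ... 0 bytes after a block" report from Valgrind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet with a variable-length text payload. */
typedef struct {
    size_t buffer_length; /* payload bytes, including the trailing NUL */
    char   buffer[];      /* C99 flexible array member holding the text */
} demo_packet;

/* Bytes needed for the header plus the text plus its terminating NUL. */
size_t demo_packet_size(const char *text)
{
    assert(text != NULL);
    /* strlen() excludes the NUL, hence the + 1. */
    return offsetof(demo_packet, buffer) + strlen(text) + 1;
}

/* Allocate a zeroed packet and copy the text into it, NUL included.
 * Returns NULL if the allocation fails; the caller frees the result. */
demo_packet *demo_packet_new(const char *text)
{
    size_t payload = strlen(text) + 1;
    demo_packet *pkt = calloc(1, demo_packet_size(text));

    if (pkt == NULL)
        return NULL;

    pkt->buffer_length = payload;
    memcpy(pkt->buffer, text, payload); /* copies the NUL as well */
    return pkt;
}
```

Dropping the + 1 in demo_packet_size while still copying the NUL would reproduce the one-byte overrun in this sketch.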
I'm really surprised that the patch shown here would fix the original issue. That said, I do think it's a good change regardless. I can verify the valgrind issue affects my development machine as well.
Thanks! The previous pull request #228 made it better: it no longer happened every time, but still occasionally. This patch now fixes it completely for me.
Affects Version: 4.0.0 with latest pull request #225
OS: RHEL 7
I'm running nrpe with systemd. Every time a check is executed, the spawned nrpe process gets terminated by signal 6 (SIGABRT). However, check_nrpe gets a result and works fine.
GDB Core Output: