NagiosEnterprises / ncpa

Nagios Cross-Platform Agent

NCPA 3.0 service fails on Linux with IPv6 disabled #1056

Closed shodgson11 closed 7 months ago

shodgson11 commented 7 months ago

If IPv6 is disabled, the NCPA 3.0 service won't start. On start, the service status shows this error: OSError: [Errno 97] Address family not supported by protocol.

If you edit the config file and specify ip=0.0.0.0, then the service will start.

This bug seems to have been fixed years ago, but has resurfaced.

ne-bbahn commented 7 months ago

Thanks for identifying the cause of this problem. I'll try to get it fixed for 3.0.1.

ne-bbahn commented 7 months ago

I am having trouble reproducing this issue. I've disabled IPv6 on CentOS 9, RHEL 8 and Ubuntu 22 and installed NCPA 3.0.0, and it works without any issues on each of them. I've reimplemented the fix that seems to have been lost in the rewrite to NCPA 3 anyway.

pittagurneyi commented 7 months ago

I haven't tested it, but for me with the default config, if I don't set 0.0.0.0, then it only binds to IPv6 and not IPv4. In theory, according to the documentation, it should bind to both but doesn't.

If IPv6 is disabled system-wide via sysctl, that would mean it only tries to bind to IPv6, fails, and exits. So something in the logic seems to be wrong?

pittagurneyi commented 7 months ago

I just checked: the system where I had to set it to '0.0.0.0' also has IPv6 disabled system-wide.

From what I could find it seems that if IPv6 were enabled, it would create a dual-stack socket in Linux, which would accept connections from IPv4 addresses as well. But as IPv6 is disabled, no dual-stack socket based on AF_INET6 is created. So it fails and no connections are accepted.
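To make the dual-stack behaviour concrete, here is a minimal sketch using only the Python standard library (3.8+), not NCPA or gevent code. The function name try_dual_stack_listener is made up for illustration. It returns a listening AF_INET6 socket that also accepts IPv4 connections, or None when the platform cannot provide one.

```python
import socket


def try_dual_stack_listener(port=0):
    """Return a listening dual-stack socket, or None if unavailable.

    Hypothetical helper for illustration; port 0 lets the OS pick a free port.
    """
    # has_dualstack_ipv6() is False when an AF_INET6 socket cannot be
    # created or IPV6_V6ONLY cannot be cleared on it.
    if not socket.has_dualstack_ipv6():
        return None
    try:
        # One AF_INET6 socket bound to '::' with IPV6_V6ONLY disabled,
        # so IPv4 clients are accepted as IPv4-mapped addresses.
        return socket.create_server(
            ("::", port), family=socket.AF_INET6, dualstack_ipv6=True
        )
    except OSError:
        # Socket creation worked, but binding to '::' did not.
        return None


if __name__ == "__main__":
    listener = try_dual_stack_listener()
    if listener is None:
        print("no dual-stack socket available on this host")
    else:
        print("dual-stack listener on", listener.getsockname())
        listener.close()
```

On a host where this returns None, a server that only ever asks for an AF_INET6 socket has nothing to fall back on, which would match the failure described here.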

If I traced it correctly, then the problem is here: https://www.gevent.org/_modules/gevent/baseserver.html#BaseServer

def _parse_address(address):
    if isinstance(address, tuple):
        if not address[0] or ':' in address[0]:
            return _socket.AF_INET6, address
        return _socket.AF_INET, address

If the host part of the address is empty or contains ':', the function picks the AF_INET6 socket family, and NCPA sets '::' by default. So it doesn't create an IPv4 socket plus an IPv6 socket; it creates a single dual-stack socket of type AF_INET6, which isn't possible on IPv6-disabled systems.
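The effect of the tuple branch quoted above can be checked in isolation. This is a standalone restatement of that one condition, not gevent's actual function; family_for_host is a name invented for this sketch.

```python
import socket


def family_for_host(host):
    """Mirror the tuple branch quoted above (illustrative copy only)."""
    # An empty host or any host containing ':' selects IPv6.
    if not host or ":" in host:
        return socket.AF_INET6
    return socket.AF_INET


if __name__ == "__main__":
    for host in ("::", "", "0.0.0.0", "127.0.0.1"):
        print(repr(host), family_for_host(host).name)
```

With this rule, '::' and an empty host both map to AF_INET6, while '0.0.0.0' maps to AF_INET, which is why setting ip=0.0.0.0 sidesteps the IPv6 socket entirely.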

The only solution I see, if one were to automate it:

In NCPA check whether IPv6 is disabled system-wide and change the default address value from '::' to '0.0.0.0'.

I don't know whether a kernel can be built so that net/ipv6 isn't available at all. If it can, then a missing /proc/sys/net/ipv6 would have to be treated as IPv6 not being compiled into the kernel.

Something like this?

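import os

# Assume IPv6 is available unless the kernel says otherwise, so the
# variable is always defined even if neither branch below matches.
ipv6_disabled = 0
if not os.path.exists('/proc/sys/net/ipv6'):
    # No IPv6 sysctl tree at all: treat the kernel as lacking IPv6.
    ipv6_disabled = 1
elif os.path.exists('/proc/sys/net/ipv6/conf/all/disable_ipv6'):
    with open('/proc/sys/net/ipv6/conf/all/disable_ipv6') as f:
        ipv6_disabled = int(f.read())
if ipv6_disabled == 1:
    # Fall back to the IPv4 wildcard address.
    address = '0.0.0.0'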
Should the user have ipv6 disabled via sysctl on a per-interface basis, then they'll just have to set the ip manually.

shodgson11 commented 7 months ago

> Should the user have ipv6 disabled via sysctl on a per-interface basis, then they'll just have to set the ip manually.

Certainly my Ubuntu 20.04 system has IPv6 baked-in, but my interface does not have IPv6 info configured. I guess that means the fix here does not apply to my system. Failing to launch the service and requiring user intervention seems a bit excessive in the case where an interface doesn't have an IPv6 address.
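One way to avoid depending on any particular sysctl setting would be to probe at startup: try to bind an IPv6 socket to '::' and fall back to '0.0.0.0' only if that fails. The sketch below assumes this approach and uses only the standard library; default_listen_address is a hypothetical name, and this is not necessarily how NCPA 3.0.1 handles it.

```python
import socket


def default_listen_address():
    """Pick '::' when an IPv6 wildcard bind works, else '0.0.0.0'.

    Hypothetical helper; probes the running system instead of reading /proc.
    """
    if not socket.has_ipv6:
        # Python itself was built without IPv6 support.
        return "0.0.0.0"
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as probe:
            # Port 0 asks the OS for any free port; the socket is
            # closed again as soon as the with-block exits.
            probe.bind(("::", 0))
    except OSError:
        # Covers EAFNOSUPPORT (errno 97) as well as other bind failures.
        return "0.0.0.0"
    return "::"


if __name__ == "__main__":
    print("default listen address:", default_listen_address())
```

Because the probe tests what the kernel actually allows, it does not need to distinguish between system-wide and per-interface settings, and an explicit ip value in the config file would still take precedence.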

ne-bbahn commented 7 months ago

This will be fixed in NCPA v3.0.1 so I have closed this issue.