CumulusNetworks / ifupdown2

GNU General Public License v2.0
158 stars 72 forks source link

Issue with use of Netlink interface cache containing incomplete information after creating new interface #310

Open blu-antony opened 1 month ago

blu-antony commented 1 month ago

ifupdown2 version: 3.3.0

Taking the following interface script as an example:

allow-usbeth0 usbeth0                     
iface usbeth0 inet manual
        ipv6-addrgen off                                                                                 
        mtu       1574   
        gro-offload off

allow-usbeth0 eth0.14                                                                                    
iface eth0.14 inet static
        ipv6-addrgen on   
        mtu       1574
        address   10.0.216.5/24
        vlan-raw-device usbeth0
        vlan-id 14
        post-up /sbin/ip route add 0.0.0.0/0 via 10.0.216.1 dev $IFACE
        pre-down /sbin/ip route del 0.0.0.0/0 via 10.0.216.1 dev $IFACE

where our systems are configured to disable IPv6 link local addressing by default (in systctl.conf):

# Enable IPv6 networking on all interfaces by default but disable LLA
# except on the loopback interface.

net.ipv6.conf.all.addr_gen_mode=1
net.ipv6.conf.default.addr_gen_mode=1
net.ipv6.conf.lo.addr_gen_mode=0

net.ipv6.conf.all.disable_ipv6=0
net.ipv6.conf.default.disable_ipv6=0
net.ipv6.conf.lo.disable_ipv6=0

we are occasionally seeing that the VLAN interface does not have IPv6 LLA enabled even though ipv6-addrgen on is set.

Note that to reproduce, the following commands are run repeatedly until the condition arises (since the condition does not always occur):

ifdown --allow usbeth0 ; ifup -v --allow usbeth0

After looking into this, the problem is caused by the Netlink RTM_NEWLINK event for the VLAN interface arriving after the cache has been interrogated by get_link_ipv6_addrgen_mode to determine the IPv6 mode, which defaults to 0 (since the cache is missing the information and a KeyError occurs):

    def get_link_ipv6_addrgen_mode(self, ifname):
        try:
            with self._cache_lock:
                return self._link_cache[ifname].attributes[Link.IFLA_AF_SPEC].value[socket.AF_INET6][Link.IFLA_INET6_ADDR_GEN_MODE]
        except (KeyError, AttributeError):
            # default to 0 (eui64)
            return 0
        except TypeError as e:
            return self.__handle_type_error(inspect.currentframe().f_code.co_name, ifname, str(e), return_value=0)

This default is incorrect in our setup. This is just one instance wher the contents of the cache may not reflect reality.

The underlying cause is the implementation choice to not wait for the RTM_NEWLINK event message when creating an interface and instead only waiting for the acknowledge, thereby creating this ambiguity (and race) in the cached link state information; from the comment in tx_nlpacket_get_response_with_error_and_cache_on_ack (in nlcache.py):

        # When creating a new link via netlink, we don't always wait for the kernel
        # NEWLINK notification to be cached to continue. If our request is ACKed by
        # the OS we assume that the link was successfully created. Since we aren't
        # waiting for the kernel notification to continue we need to manually fill
        # our cache with the packet we just TX'ed. Once the NEWLINK notification
        # is received it will simply override the previous entry.
        # We need to keep track of those manually cached packets. We set a private
        # flag on the objects via the attribute priv_flags

Note that I have tried replacing all uses of the tx_nlpacket_get_response_with_error_and_cache_on_ack method with the (currently unused) tx_nlpacket_get_response_with_error_and_wait_for_cache method and this looks like to have solved the cache coherency issue for us but I am not sure if this is a robust solution (with or without extending the 1 second timeout).

Is there better way to robustly solve this issue ?