svinota / pyroute2

Python Netlink and PF_ROUTE library — network configuration and monitoring
https://pyroute2.org/
Other
959 stars 248 forks source link

Performance of `IPRoute.get_addr` #1080

Open tomkcook opened 1 year ago

tomkcook commented 1 year ago

We've been looking into the performance of this code:

import pyroute2
import cProfile
ipr = pyroute2.IPRoute()
cProfile.run("ipr.get_addr(ifindex=29)")

Substitute the ifindex for an interface with a few addresses on it on your system. In my case:

$ ip address list qca-ap11
29: qca-ap11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default 
    link/ether 00:19:70:c3:aa:3c brd ff:ff:ff:ff:ff:ff
    inet 10.101.55.21/24 scope global qca-ap11
       valid_lft forever preferred_lft forever
    inet6 fe80::219:70ff:fec3:aa3c/64 scope link 
       valid_lft forever preferred_lft forever

Among the surprising (to me at least) results of the profiling are that nlmsg_base.__init__ is called 188 times, list.append is called 285 times and isinstance is called 286. get_addr takes 34ms overall to run.

I'm running pyroute2 0.5.3 on Python 3.6, arm64v8 and Linux 4.14. I'm aware this is rather old, so if this has been addressed in newer versions of pyroute2 then please do trout me and move on. I don't see any significant performance difference in this code between 0.5.3 and 0.7.5 on x86 but it's not straightforward to upgrade the package version on my target system. On x86, if anything, it's gotten slightly worse between those two versions.

ffourcot commented 1 year ago

Hello,

get_addr will run a python filtering on all configured addresses, so it can be slow. Especially if you have a lot of addresses on the system.

If you want to use kernel side filtering, you can do something like this :

In [2]: ipr = pyroute2.IPRoute()

In [3]: ipr.get_addr(index=1)[0].get_attr("IFA_ADDRESS")
Out[3]: '127.0.0.1'

In [4]: %timeit ipr.get_addr(index=1)[0].get_attr("IFA_ADDRESS")
276 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [15]: ipr = pyroute2.IPRoute(strict_check=True)

In [16]: %timeit ipr.addr("dump", index=1, family=socket.AF_INET)
977 µs ± 76.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [17]: ipr.addr("dump", index=1, family=socket.AF_INET)[0].get_attr("IFA_ADDRESS")
Out[17]: '127.0.0.1'

WARNING: strict_check is mandatory here. Without it, the kernel will ignore given filters.

svinota commented 1 year ago

Additionally, the latest versions support custom parsers, so you can avoid using the generic stuff that recursively decode NLA messages, and write a simple routine that takes only the data you need, as it's done here:

https://github.com/svinota/pyroute2/blob/e96f3ad950e69d2d65ea21d5c030c04ee31aed83/pyroute2/iproute/linux.py#L755-L760

https://github.com/svinota/pyroute2/blob/e96f3ad950e69d2d65ea21d5c030c04ee31aed83/pyroute2/iproute/parsers.py#L34-L46

tomkcook commented 1 year ago

Thanks, that's really helpful. Unfortunately the device we're targeting has a 4.14 kernel that doesn't support kernel-side filtering, but the custom parser is really useful.

Using this, I've written this version of get_addr():

import ipaddress
import pathlib
import socket
import struct
from typing import List, Union

import pyroute2
from pyroute2 import netlink
from pyroute2.netlink import rtnl
from pyroute2.netlink.rtnl import rtmsg

IFA_ADDRESS = 1
IFA_LOCAL = 2

def ifc_address_parser(family: int, index: int) -> 'netlink.nlmsg':
    """
    Generate a netlink parser that returns IP addresses in the given family and
    for the interface with the given index.
    """
    @profile
    def parser(data, offset, length):
        length, type, flags, sequence_number =  struct.unpack_from('IHHI', data, offset)
        header=dict(
            length=length, type=type, flags=flags, sequence_number=sequence_number, error=None
        )

        if type == netlink.NLMSG_DONE:
            msg = netlink.nlmsg()
            msg['header'] = header
            msg.length = length
            return msg
        if type != rtnl.RTM_NEWADDR:
            return None

        ifc_index = struct.unpack_from("I", data, offset + 20)[0]
        if ifc_index != index:
            return None

        ifa_family, ifa_prefixlen = struct.unpack_from("BB", data, offset + 16)
        if family and ifa_family != family:
            return None

        cursor = offset + 24 # offset + sizeof(struct header) + sizeof(struct ifaddrmsg)

        while cursor < offset + length:
            nla_length, nla_type = struct.unpack_from('HH', data, cursor)
            if nla_type in [IFA_ADDRESS, IFA_LOCAL]:
                addr = data[cursor+4:cursor+nla_length]
                msg = netlink.nlmsg()
                msg['header'] = header
                msg.length = length
                if ifa_family == socket.AF_INET:
                    ipaddr = (int(b) for b in addr)
                    ipaddr = '.'.join(str(x) for x in ipaddr)
                    msg['ADDRESS'] = ipaddr
                    msg['PREFIXLEN'] = ifa_prefixlen
                else:
                    ipaddr = ipaddress.IPv6Interface((addr, ifa_prefixlen))
                    msg["ADDRESS"] = ipaddr
                return msg

            nla_length = (nla_length + 3) & ~3
            cursor += nla_length
        return None
    return parser

def get_addr(ipr: pyroute2.IPRoute, family: int, index: int) -> List[Union[ipaddress.IPv4Address, ipaddress.IPv6Address]]:
    msg = rtmsg.rtmsg()
    if family:
        msg['family'] = family
    msg['index'] = 10
    addresses = ipr.nlm_request(
        msg,
        msg_type=rtnl.RTM_GETADDR,
        msg_flags=netlink.NLM_F_DUMP | netlink.NLM_F_REQUEST,
        parser=ifc_address_parser(family, index)
    )

    return [address["ADDRESS"] for address in addresses]

def interface_index(interface):
    index_file = pathlib.Path("/sys/class/net") / interface / "ifindex"
    if not index_file.exists():
        return None
    return int(index_file.read_text(encoding="utf-8"))

ipr = pyroute2.IPRoute()
get_addr(ipr, socket.AF_INET, interface_index("wlo1"))

(delete the @profile if you're not using line_profiler). Performance here is a bit platform-dependent, but the speed-up compared to IPRoute.get_addr() is about 60% on x86/Python3.10 and >80% on aarch64/Python3.6 (the target platform here).

Unfortunately I can't use line_profiler on the target platform (or not easily, at any rate) but on x86, it's now spending more than 1/4 of its time in the constructor of netlink.nlmsg(). I haven't had a chance to look into why that's an expensive operation, though my guess would be that dictionary constructors are part of it.

Another thing that would be really useful in this area is if IPDB.register_callback() had an option not to parse messages but to return the raw message buffer, so that similar optimised parsing routines could be used. Would you be open to a PR that does this? (I know it seems an edge case; we have a wifi driver that sends channel utilisation statistics in RTM_NEWLINK messages, one per second per radio and it gets expensive - yes, it'd be nice to fix the driver but they're qualcomm and we're not).

svinota commented 1 year ago

Thanks for the testcase, @tomkcook ! Maybe I can run performance measurements on a platform like Raspberry PI, I believe it should be comparable.

But I have to tell that the library was designed to be a universal netlink solution, thus there are not so many RTNL-specific optimizations :) yet, so little by little we fix that.

Regarding IPDB -- could you pls tell me the reasons why NDB is not an option? Whatever NDB is missing, I think that adding this functionality will be for me a much more easy task than fixing IPDB, which is broken by design. IPDB is excluded now from the CI pipeline, and I planned to drop it one day.

tomkcook commented 1 year ago

I don't recall the thinking that went into using IPDB here, though it was admittedly probably mostly me that wrote the code. We're working on a fairly constrained embedded platform and I think NDB looked too big and complex. We only use IPDB to listen for events, to detect when interfaces go up and down.

And just to give you a warning, I left a system running with the above version of get_addr() over the weekend and this morning it was locked hard and had to be power cycled. I haven't looked into it, but it seems likely it's leaking memory somewhere.

svinota commented 1 year ago

Thanks for the info!

svinota commented 1 year ago

@tomkcook could I please ask you to spend a time and write down the requirements to a possible IPDB replacement in your case? Like what are you using right now, multi/single threaded, constraints etc?

If a simple drop-in replacement could work for you, I would rather provide it, so an occasional drop of IPDB will not affect your project.

PS: regarding memory leaks -- tracing the code.

tomkcook commented 1 year ago

Having spent today thinking about it, I've decided to drop IPDB and roll my own netlink socket code to listen for interface up/down events and parse out the bare minimum I need to identify the interface and its state. I don't think our use case is general enough to put a requirement on something like pyroute2.