svinota / pyroute2

Python Netlink and PF_ROUTE library — network configuration and monitoring
https://pyroute2.org/

fdb dump returns empty answer on 4.x kernels #686

Open koganfel opened 4 years ago

koganfel commented 4 years ago

Hi! First thing, thanks a lot for the awesome piece of software! I enjoy using it a lot.

Unfortunately, I've hit a snag. The call to fdb('dump') returns an empty answer, like this:

```
$ python3
Python 3.6.8 (default, Aug 7 2019, 17:28:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pyroute2 import IPRoute
>>> ip = IPRoute()
tcmsg: [Errno 2] No such file or directory: '/proc/net/psched'
the tc subsystem functionality is limited
>>> ip.fdb('dump')
[]

$ bridge fdb show
01:00:5e:00:00:01 dev ens5 self permanent
00:00:00:00:00:00 dev mcon0 dst 10.145.0.11 self permanent
e6:be:9a:b2:9b:4d dev mcon0 dst 10.145.0.11 self
```

The Python code works fine on a set of standard CentOS 7 3.10 kernels but breaks on 4.x (I tested with 4.4.x and 4.14.x varieties). It looks like it is related to https://elixir.bootlin.com/linux/v5.5/source/net/core/rtnetlink.c#L4167 (read the comment). Unfortunately, I have no time to fix it myself right now, but I decided to notify you and the community; maybe someone else will.

Thanks again!

svinota commented 4 years ago

thanks! to be fixed asap

koganfel commented 4 years ago

Thank you! Please ignore the /proc/net/psched warning - we run a somewhat abridged kernel, this feature is not included and it is fine.

svinota commented 4 years ago

Tested on 5.x — works. Trying 4.x

svinota commented 4 years ago

$ cat e0.py
from pyroute2 import IPRoute

with IPRoute() as ipr:
    for r in ipr.fdb('dump'):
        print(r.get_attr('NDA_LLADDR'))

$ python e0.py 
01:00:5e:00:00:01
33:33:00:00:00:01
33:33:ff:25:90:08
33:33:ff:00:00:01
33:33:00:00:00:01
01:00:5e:00:00:01
33:33:ff:9f:a3:66

$ uname -a
Linux test03 4.19.53-mainline-rev1 #1 SMP Wed Jun 19 23:37:38 UTC 2019 aarch64 GNU/Linux

svinota commented 4 years ago

Downloading Fedora 25 to get the kernel 4.8

svinota commented 4 years ago

Meanwhile 4.9:

$ cat e0.py 
from pyroute2 import IPRoute

with IPRoute() as ipr:
    for r in ipr.fdb('dump'):
        print(r.get_attr('NDA_LLADDR'))

$ python e0.py 
01:00:5e:00:00:01
01:80:c2:00:00:21
33:33:00:00:00:01
33:33:ff:82:db:5f
02:d4:07:82:db:5f
33:33:00:00:00:01
33:33:00:00:00:01

$ uname -a
Linux test09 4.9.7-sunxi #1 SMP Thu Feb 2 01:52:06 CET 2017 armv7l GNU/Linux

koganfel commented 4 years ago

Interesting. What pyroute2 version are you testing with?

svinota commented 4 years ago

The master head:

$ git describe
0.5.9-43-g2039c0c5

But the IPRoute.fdb() code is quite old.

What about your version?

koganfel commented 4 years ago

tag 0.5.9

koganfel commented 4 years ago

OK, I'll try with the master head anyway.

koganfel commented 4 years ago

will let you know shortly

svinota commented 4 years ago

[root@localhost pyroute2]# cat e0.py
from pyroute2 import IPRoute

with IPRoute() as ipr:
    for r in ipr.fdb('dump'):
        print(r.get_attr('NDA_LLADDR'))

[root@localhost pyroute2]# python e0.py 
01:00:5e:00:00:01
33:33:00:00:00:01
33:33:ff:8f:d5:1b

[root@localhost pyroute2]# uname -a
Linux localhost.localdomain 4.8.6-300.fc25.x86_64 #1 SMP Tue Nov 1 12:36:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

[root@localhost pyroute2]# cat /etc/redhat-release 
Fedora release 25 (Twenty Five)

svinota commented 4 years ago

Do you have a custom kernel?

koganfel commented 4 years ago

Yes, based on 4.14.39.

svinota commented 4 years ago

Is there any chance to get a VM with your kernel to run the test?

koganfel commented 4 years ago

Unfortunately, no, this is unlikely.

We dug in a bit, and it looks like the problem is a kernel regression introduced by commit 5e6d243587990a588143b9da3974833649595587 (around release 3.17) and fixed by commit 5f999abba33f6788f52cdadae3432f5e731c09bc (around 4.19). We have incorporated the fix into our custom kernels, and that resolved the problem.

The bug is around the rtnl_fdb_dump() function in net/core/rtnetlink.c. It looks like the newer kernel expects an ifinfmsg header rather than an ndmsg header in an fdb dump request.
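To make the header mismatch concrete, here is a minimal sketch of the two fixed-size headers, following the layouts of struct ndmsg (linux/neighbour.h) and struct ifinfomsg (linux/rtnetlink.h). It uses only the standard struct module; the format strings and the helper names pack_ndmsg / pack_ifinfomsg are illustrative assumptions, not pyroute2 code. It only shows that the two headers differ in size and field layout, so a receiver that expects one and is handed the other will misread the request.

```python
import struct

# struct ndmsg (linux/neighbour.h):
#   u8 family, u8 pad1, u16 pad2, s32 ifindex, u16 state, u8 flags, u8 type
NDMSG_FMT = "=BBHiHBB"

# struct ifinfomsg (linux/rtnetlink.h):
#   u8 family, u8 pad, u16 type, s32 index, u32 flags, u32 change
IFINFOMSG_FMT = "=BBHiII"

AF_BRIDGE = 7  # address family used for bridge fdb requests


def pack_ndmsg(family=AF_BRIDGE, ifindex=0, state=0, flags=0, ndm_type=0):
    """Illustrative helper: serialize an ndmsg header."""
    return struct.pack(NDMSG_FMT, family, 0, 0, ifindex, state, flags, ndm_type)


def pack_ifinfomsg(family=AF_BRIDGE, ifi_type=0, index=0, flags=0, change=0):
    """Illustrative helper: serialize an ifinfomsg header."""
    return struct.pack(IFINFOMSG_FMT, family, 0, ifi_type, index, flags, change)


ndmsg_len = len(pack_ndmsg())
ifinfomsg_len = len(pack_ifinfomsg())
print("ndmsg:", ndmsg_len, "bytes; ifinfomsg:", ifinfomsg_len, "bytes")
```

Both headers start with the family byte and carry the interface index at the same offset, but the total lengths differ, which is presumably what the kernel-side length check in rtnl_fdb_dump() keys on.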

We made the following change to iproute/linux.py, which makes it work with an unpatched kernel as well and illustrates the problem:

*** /usr/lib/python3.6/site-packages/pyroute2/iproute/linux.py  2020-03-11 11:07:54.000000000 -0400
--- /tmp/linux.py   2020-03-16 12:33:27.424000000 -0400
***************
*** 842,847 ****
--- 842,850 ----
          # nud -> state
          if 'nud' in kwarg:
              kwarg['state'] = kwarg.pop('nud')
+         if command == 'dump':
+             return self.fdb_dump(**kwarg)
+ 
          if (command in ('add', 'del', 'append')) and \
                  not (kwarg.get('state', 0) & ndmsg.states['noarp']):
              # state must contain noarp in add / del / append
***************
*** 862,867 ****
--- 865,906 ----
      #
      # General low-level configuration methods
      #
+     def fdb_dump(self, **kwarg):
+         '''
+         Specialized neighbours dump operation, same as `ip neigh` or `bridge fdb`
+ 
+         **dump**
+ 
+         Dump all the records in the NDB.
+         '''
+ 
+         if 'match' not in kwarg:
+             match = kwarg
+         else:
+             match = kwarg.pop('match', None)
+ 
+         flags = NLM_F_REQUEST | NLM_F_DUMP
+         command = RTM_GETNEIGH
+ 
+         msg = ifinfmsg()
+         # ifinfmsg fields
+         #
+         # ifi_family
+         # ifi_type
+         # ifi_index
+         # ifi_flags
+         # ifi_change
+         #
+         for field in msg.fields:
+             msg[field[0]] = kwarg.pop(field[0], 0)
+         msg['family'] = msg['family'] or AF_BRIDGE
+ 
+         ret = self.nlm_request(msg, msg_type=command, msg_flags=flags)
+         if match is not None:
+             return self._match(match, ret)
+         else:
+             return ret
+ 
      def neigh(self, command, **kwarg):
          '''
          Neighbours operations, same as `ip neigh` or `bridge fdb`
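
For anyone who wants to check a kernel without patching pyroute2, the request that the patch above produces can be approximated at the wire level. This is a rough sketch using only the standard library, assuming the usual netlink constants (RTM_GETNEIGH = 30, NLM_F_DUMP = NLM_F_ROOT | NLM_F_MATCH, AF_BRIDGE = 7). The function names build_fdb_dump_request and send_request are hypothetical and not part of pyroute2.

```python
import socket
import struct

RTM_GETNEIGH = 30
NLM_F_REQUEST = 0x001
NLM_F_DUMP = 0x300  # NLM_F_ROOT | NLM_F_MATCH
AF_BRIDGE = 7

NLMSGHDR_FMT = "=IHHII"    # nlmsg_len, nlmsg_type, nlmsg_flags, nlmsg_seq, nlmsg_pid
IFINFOMSG_FMT = "=BBHiII"  # family, pad, type, index, flags, change


def build_fdb_dump_request(seq=1, ifindex=0):
    """Build an RTM_GETNEIGH dump request carrying an ifinfomsg header."""
    body = struct.pack(IFINFOMSG_FMT, AF_BRIDGE, 0, 0, ifindex, 0, 0)
    total = struct.calcsize(NLMSGHDR_FMT) + len(body)
    header = struct.pack(
        NLMSGHDR_FMT, total, RTM_GETNEIGH, NLM_F_REQUEST | NLM_F_DUMP, seq, 0
    )
    return header + body


def send_request(payload, bufsize=65536):
    """Hypothetical helper (Linux only): send the request to the kernel and
    return the first reply datagram as raw bytes, without parsing it."""
    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE) as s:
        s.bind((0, 0))
        s.send(payload)
        return s.recv(bufsize)


request = build_fdb_dump_request()
print(len(request), "byte request")
```

Calling send_request(request) on the affected host and checking whether the reply is just an NLMSG_DONE message or contains RTM_NEWNEIGH records should show whether the kernel accepts this header. Parsing the reply is left out here, since pyroute2 already does that.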
svinota commented 4 years ago

Thanks a lot! Testing