Open koganfel opened 4 years ago
thanks! to be fixed asap
Thank you! Please ignore the /proc/net/psched warning - we run a somewhat abridged kernel, this feature is not included and it is fine.
Tested on 5.x — works. Trying 4.x
$ cat e0.py
from pyroute2 import IPRoute
with IPRoute() as ipr:
    for r in ipr.fdb('dump'):
        print(r.get_attr('NDA_LLADDR'))
$ python e0.py
01:00:5e:00:00:01
33:33:00:00:00:01
33:33:ff:25:90:08
33:33:ff:00:00:01
33:33:00:00:00:01
01:00:5e:00:00:01
33:33:ff:9f:a3:66
$ uname -a
Linux test03 4.19.53-mainline-rev1 #1 SMP Wed Jun 19 23:37:38 UTC 2019 aarch64 GNU/Linux
Downloading Fedora 25 to get the kernel 4.8
Meanwhile 4.9:
$ cat e0.py
from pyroute2 import IPRoute
with IPRoute() as ipr:
    for r in ipr.fdb('dump'):
        print(r.get_attr('NDA_LLADDR'))
$ python e0.py
01:00:5e:00:00:01
01:80:c2:00:00:21
33:33:00:00:00:01
33:33:ff:82:db:5f
02:d4:07:82:db:5f
33:33:00:00:00:01
33:33:00:00:00:01
$ uname -a
Linux test09 4.9.7-sunxi #1 SMP Thu Feb 2 01:52:06 CET 2017 armv7l GNU/Linux
Interesting. What pyroute2 version are you testing with?
The master head:
$ git describe
0.5.9-43-g2039c0c5
But the IPRoute.fdb() code is quite old.
What about your version?
tag 0.5.9
OK, I'll try with the master head anyway.
will let you know shortly
[root@localhost pyroute2]# cat e0.py
from pyroute2 import IPRoute
with IPRoute() as ipr:
    for r in ipr.fdb('dump'):
        print(r.get_attr('NDA_LLADDR'))
[root@localhost pyroute2]# python e0.py
01:00:5e:00:00:01
33:33:00:00:00:01
33:33:ff:8f:d5:1b
[root@localhost pyroute2]# uname -a
Linux localhost.localdomain 4.8.6-300.fc25.x86_64 #1 SMP Tue Nov 1 12:36:38 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost pyroute2]# cat /etc/redhat-release
Fedora release 25 (Twenty Five)
Do you have a custom kernel?
Yes, based on 4.14.39.
Is there any chance to get a VM with your kernel to run the test?
Unfortunately, no, this is unlikely.
We dug in a bit and it looks like the problem is a regression introduced by commit 5e6d243587990a588143b9da3974833649595587 (around release 3.17) and fixed by commit 5f999abba33f6788f52cdadae3432f5e731c09bc (around 4.19). We have incorporated the fix into our custom kernels and that fixed the problem.
The bug is around the rtnl_fdb_dump() function in net/core/rtnetlink.c. It looks like the new kernel expects an ifinfmsg header rather than an ndmsg header in an fdb dump request.
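To make the header difference concrete, here is a minimal sketch that packs the same RTM_GETNEIGH dump request with each header type, using only the standard library. The field layouts and numeric constants are assumptions copied from my reading of the kernel uapi headers (the kernel struct is named ifinfomsg; pyroute2 calls its class ifinfmsg), and fdb_dump_request() is a hypothetical helper, not part of pyroute2.

```python
import struct

# Assumed layouts, taken from the kernel uapi headers:
# struct nlmsghdr:  u32 len, u16 type, u16 flags, u32 seq, u32 pid
NLMSGHDR = struct.Struct("=IHHII")
# struct ndmsg:     u8 family, u8 pad1, u16 pad2, s32 ifindex,
#                   u16 state, u8 flags, u8 type
NDMSG = struct.Struct("=BBHiHBB")
# struct ifinfomsg: u8 family, u8 pad, u16 type, s32 index,
#                   u32 flags, u32 change
IFINFOMSG = struct.Struct("=BxHiII")

# Assumed constant values from the kernel headers
RTM_GETNEIGH = 30
NLM_F_REQUEST = 0x01
NLM_F_DUMP = 0x300
AF_BRIDGE = 7


def fdb_dump_request(payload, seq=1):
    """Hypothetical helper: prepend a netlink header to a family header."""
    length = NLMSGHDR.size + len(payload)
    flags = NLM_F_REQUEST | NLM_F_DUMP
    return NLMSGHDR.pack(length, RTM_GETNEIGH, flags, seq, 0) + payload


# Same request type and flags, two different family headers
ndmsg_request = fdb_dump_request(NDMSG.pack(AF_BRIDGE, 0, 0, 0, 0, 0, 0))
ifinfomsg_request = fdb_dump_request(IFINFOMSG.pack(AF_BRIDGE, 0, 0, 0, 0))

print(NDMSG.size, IFINFOMSG.size)                 # 12 16
print(len(ndmsg_request), len(ifinfomsg_request))  # 28 32
```

The two requests differ only in the family header, so a kernel that parses the payload as the 16-byte header receives a message that is 4 bytes too short when userspace sends the 12-byte one.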
We made the following change to iproute/linux.py. It makes the dump work with an unpatched kernel as well, and it illustrates the problem:
*** /usr/lib/python3.6/site-packages/pyroute2/iproute/linux.py 2020-03-11 11:07:54.000000000 -0400
--- /tmp/linux.py 2020-03-16 12:33:27.424000000 -0400
***************
*** 842,847 ****
--- 842,850 ----
          # nud -> state
          if 'nud' in kwarg:
              kwarg['state'] = kwarg.pop('nud')
+         if command == 'dump':
+             return self.fdb_dump(**kwarg)
+
          if (command in ('add', 'del', 'append')) and \
                  not (kwarg.get('state', 0) & ndmsg.states['noarp']):
              # state must contain noarp in add / del / append
***************
*** 862,867 ****
--- 865,906 ----
      #
      # General low-level configuration methods
      #
+     def fdb_dump(self, **kwarg):
+         '''
+         Specialized neighbours dump operation, same as `ip neigh` or `bridge fdb`
+
+         **dump**
+
+         Dump all the records in the NDB.
+         '''
+
+         if 'match' not in kwarg:
+             match = kwarg
+         else:
+             match = kwarg.pop('match', None)
+
+         flags = NLM_F_REQUEST | NLM_F_DUMP
+         command = RTM_GETNEIGH
+
+         msg = ifinfmsg()
+         # ifinfmsg fields
+         #
+         # ifi_family
+         # ifi_type
+         # ifi_index
+         # ifi_flags
+         # ifi_change
+         #
+         for field in msg.fields:
+             msg[field[0]] = kwarg.pop(field[0], 0)
+         msg['family'] = msg['family'] or AF_BRIDGE
+
+         ret = self.nlm_request(msg, msg_type=command, msg_flags=flags)
+         if match is not None:
+             return self._match(match, ret)
+         else:
+             return ret
+
      def neigh(self, command, **kwarg):
          '''
          Neighbours operations, same as `ip neigh` or `bridge fdb`
Thanks a lot! Testing
Hi! First thing, thanks a lot for the awesome piece of software! I enjoy using it a lot.
Unfortunately, I've hit a snag. The call to fdb('dump') returns an empty answer, like this:
$ python3
Python 3.6.8 (default, Aug 7 2019, 17:28:10)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] on linux
Type "help", "copyright", "credits" or "license" for more information.
$ bridge fdb show
01:00:5e:00:00:01 dev ens5 self permanent
00:00:00:00:00:00 dev mcon0 dst 10.145.0.11 self permanent
e6:be:9a:b2:9b:4d dev mcon0 dst 10.145.0.11 self
The Python code works fine on a set of standard CentOS 7 3.10 kernels but breaks on 4.x (I tested with 4.4.x and 4.14.x varieties). It looks like it is related to https://elixir.bootlin.com/linux/v5.5/source/net/core/rtnetlink.c#L4167 (read the comment). Unfortunately I have no time to fix it myself now, but I decided to notify you and the community; maybe someone else will.
Thanks again!