svinota / pyroute2

Python Netlink and PF_ROUTE library — network configuration and monitoring
https://pyroute2.org/

pyroute2.NDB() only hangs on debian 11 with huge number routes #992

Open fqucuo opened 2 years ago

fqucuo commented 2 years ago

OS: Debian 11, upgraded to latest
uname -a: Linux debian 5.10.0-16-amd64 #1 SMP Debian 5.10.127-2 (2022-07-23) x86_64 GNU/Linux
Python: 3.9.2
pyroute2: 0.7.2
Docker containers: 60+
Routes: 900+ records (ip route show table all)

Steps to reproduce and exceptions

root@debian:~# python3
Python 3.9.2 (default, Feb 28 2021, 17:03:44) 
[GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyroute2
>>> ndb = pyroute2.NDB()  # <- Hangs
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.9/dist-packages/pyroute2/ndb/main.py", line 504, in __init__
    self._dbm_ready.wait()
  File "/usr/lib/python3.9/threading.py", line 574, in wait
    signaled = self._cond.wait(timeout)
  File "/usr/lib/python3.9/threading.py", line 312, in wait
    waiter.acquire()
KeyboardInterrupt

Yes, the host has a very large number of routes, since every host runs 60+ Docker containers, but the hang only occurs on Debian 11.

fqucuo commented 2 years ago

ndb = pyroute2.NDB(log='debug')

[image: screenshot of the debug log output]

svinota commented 2 years ago

900+ is not a large number; I test on 50k+ routes, so it must work.

If it doesn't, it's clearly a bug, and thanks for the report.

As far as I can see from the logs, the error is related to the netns monitoring. I'll investigate and fix it tonight.

svinota commented 2 years ago

@fqucuo a fix is under way: https://github.com/svinota/pyroute2/pull/994

I cannot reproduce the error yet, so it will take some time to fix everything properly.

Meanwhile, you can switch off the netns manager source with:

from pyroute2 import NDB

ndb = NDB(log='debug', sources=[{'target': 'localhost'}])

svinota commented 2 years ago

A fix that helps to mitigate the issue is merged into master, but it doesn't fix the root cause. Work on that will continue.
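
Until the root cause is fixed, a script can at least avoid blocking forever on the `self._dbm_ready.wait()` call shown in the traceback. The following is only a sketch, not part of pyroute2: `call_with_timeout` and `open_ndb` are hypothetical helper names, the 30 second default is arbitrary, and it assumes that NDB can be constructed off the main thread, which has not been verified here. The `sources` value is the workaround quoted earlier in this thread.

```python
import threading


def call_with_timeout(factory, timeout):
    """Run factory() in a daemon thread and give up after `timeout` seconds.

    Hypothetical helper, not a pyroute2 API.
    """
    result = {}

    def runner():
        try:
            result['value'] = factory()
        except Exception as exc:
            # Hand the failure back to the calling thread.
            result['error'] = exc

    worker = threading.Thread(target=runner, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        # The worker is still blocked; it stays behind as a daemon thread.
        raise TimeoutError(f'no result after {timeout} seconds')
    if 'error' in result:
        raise result['error']
    return result['value']


def open_ndb(timeout=30):
    # Imported lazily so call_with_timeout stays usable without pyroute2.
    from pyroute2 import NDB

    # Workaround from this thread: only the local system as a source,
    # which switches off the netns manager source.
    return call_with_timeout(
        lambda: NDB(sources=[{'target': 'localhost'}]), timeout
    )
```

With this, `ndb = open_ndb()` would either return an NDB instance or raise `TimeoutError` instead of hanging. On timeout the stuck thread is abandoned rather than cleaned up, so this only suits scripts that exit or retry soon afterwards.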

fqucuo commented 2 years ago

@svinota got it, thanks.

bodik commented 4 months ago

Same here.

The workaround seems to work.