ffnord / mesh-announce

Discussion at #mesh-announce:irc.hackint.org and (separately) at
https://matrix.to/#/!MjLIHcALOcENXZWQlH:irc.hackint.org/$1547640760901FmKaD:matrix.eclabs.de

UDP Memleak when running multiple parallel instances #52

Open margau opened 4 years ago

margau commented 4 years ago

Hello, our workaround for #49 is running multiple parallel instances of mesh-announce, one per domain:

python3 /opt/mesh-announce/respondd.py -d /opt/mesh-announce/providers -i dom2-br -i dom2-tp -b dom2-bat

(The other domains follow the same scheme.)

It basically works, but unfortunately, after a few hours we get a memory leak that blocks all UDP communication on the system (including fastd): https://prometheus.ffm.freifunk.net/graph?g0.range_input=1h&g0.end_input=2020-01-09%2008%3A44&g0.expr=(node_sockstat_UDP_mem_bytes%7Binstance%3D~%22gw05.*%22%7D%20%2F%201024%20%2F%201024)%20%3E%20100&g0.tab=0

This doesn't happen if we have only one mesh-announce instance running.

As my knowledge of Python and the ThreadingUDPServer is very limited, I am currently not sure how to debug this problem.

Python Version is Python 3.7.3 (default, Apr 3 2019, 05:39:12) on Stretch 5.4.0-1-amd64 #1 SMP Debian 5.4.6-1 (2019-12-27)

Does anyone have any ideas? Thanks! margau

tackin commented 4 years ago

Same problem for me. I am running 5 services of this fork only on our GW-erai ( https://github.com/freifunktrier/mesh-announce ), and the map ( http://maps.freifunk-trier.de/#/de/map#!/de/map/2661965025dc ) shows all kinds of stupid things with every respond-update.

I also tried all our net-segments with one service but failed as well.

AiyionPrime commented 4 years ago

@margau which commit are you currently running? There is apparently a nasty upstream issue with the Python UDP server in Python 3.7. It got hotfixed on master; I just want to be sure you're not affected by that.

Please share the result of this command in your mesh-announce directory: grep -rni server.daemon_threads
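
For readers unfamiliar with the attribute that grep looks for: `daemon_threads` is a setting on the standard library's `socketserver.ThreadingUDPServer`. The following is a minimal sketch of where it lives, not the mesh-announce code; `EchoHandler`, `make_server` and `query` are hypothetical names, and the echo behaviour is only there to make the example runnable.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: echoes each datagram back to its sender."""

    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, sock = self.request
        sock.sendto(data, self.client_address)


def make_server(host="127.0.0.1", port=0):
    """Create a threaded UDP server; port 0 lets the OS pick a free port."""
    server = socketserver.ThreadingUDPServer((host, port), EchoHandler)
    # The attribute the grep searches for: per-request handler threads
    # are started as daemon threads.
    server.daemon_threads = True
    return server


def query(address, payload=b"ping", timeout=2.0):
    """Send one datagram to the server and return its reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        client.sendto(payload, address)
        reply, _ = client.recvfrom(4096)
        return reply


if __name__ == "__main__":
    server = make_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(query(server.server_address))
    server.shutdown()
    server.server_close()
```

Each incoming datagram is handled in its own thread, so the setting applies to every request the server processes.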

margau commented 4 years ago

Currently running on 8deeec1a5dd1ab878fed9b0dc6202de8a76f9f2f

respondd.py:104: server.daemon_threads = True

Our workaround was to increase udp_mem in sysctl.
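
To see how close a gateway is to the `udp_mem` limit, the kernel's counters can be read directly. This is a rough sketch assuming a Linux system: it parses the `UDP:` line of `/proc/net/sockstat` and the three values (min, pressure, max) in `/proc/sys/net/ipv4/udp_mem`, all of which the kernel reports in pages. The function names are made up for illustration.

```python
from pathlib import Path


def parse_udp_mem_pages(sockstat_text):
    """Return the 'mem' value (in pages) from the 'UDP:' line of /proc/net/sockstat."""
    for line in sockstat_text.splitlines():
        if line.startswith("UDP:"):
            fields = line.split()[1:]  # e.g. ['inuse', '3', 'mem', '2']
            stats = dict(zip(fields[0::2], fields[1::2]))
            return int(stats["mem"])
    raise ValueError("no 'UDP:' line found")


def parse_udp_mem_limits(udp_mem_text):
    """Return (min, pressure, max) in pages from /proc/sys/net/ipv4/udp_mem."""
    low, pressure, high = (int(value) for value in udp_mem_text.split())
    return low, pressure, high


if __name__ == "__main__":
    used = parse_udp_mem_pages(Path("/proc/net/sockstat").read_text())
    low, pressure, high = parse_udp_mem_limits(
        Path("/proc/sys/net/ipv4/udp_mem").read_text()
    )
    print(f"UDP memory: {used} pages in use, max {high} pages ({used / high:.1%})")
```

Running this periodically on a gateway with several instances would show whether the page count keeps climbing toward the maximum, which is the situation raising `udp_mem` only postpones.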