freifunk-berlin / firmware

DEPRECATED: Build system for Berlin firmware. Please use the pinned falter-repos instead.
https://berlin.freifunk.net
GNU General Public License v3.0

OLSR crashes but leaves an unkillable zombie process #540

Closed pmelange closed 5 years ago

pmelange commented 6 years ago

I have had the following happen to me three times since installing the firmware on an ERX-SFP (Hedy 1.0.0).

OLSR crashes and there is nothing I can do to restart it. If I run the neigh.sh script, it hangs. If I run /etc/init.d/olsrd restart, it hangs. If I kill -9 the PID, the process stays in the ps list.
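
A process that survives kill -9 is either a true zombie (state Z, already dead and waiting to be reaped by its parent) or stuck in uninterruptible sleep inside the kernel (state D); the thread does not record which one applied here. The sketch below shows one way to tell them apart by reading the state field of /proc/<pid>/stat on Linux. The helper names and the sample line are hypothetical, and Python is used only for illustration since it is usually not installed on the router itself.

```python
from pathlib import Path

# Meaning of the single-letter state codes reported in /proc/<pid>/stat.
STATE_NAMES = {
    "R": "running",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep (blocked in the kernel)",
    "Z": "zombie (exited, waiting to be reaped by its parent)",
    "T": "stopped",
}


def parse_stat_line(stat_line):
    """Return (pid, command name, state letter) from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the LAST closing parenthesis.
    head, _, tail = stat_line.rpartition(")")
    pid_text, _, comm = head.partition("(")
    state = tail.split()[0]
    return int(pid_text), comm, state


def describe_process(pid):
    """Read /proc/<pid>/stat (Linux only) and describe the process state."""
    line = Path(f"/proc/{pid}/stat").read_text()
    _, comm, state = parse_stat_line(line)
    return f"{comm} (pid {pid}): {STATE_NAMES.get(state, 'state ' + state)}"


if __name__ == "__main__":
    # Made-up sample line for illustration, not captured from the affected router.
    sample = "10264 (olsrd) D 1 10264 10264 0 -1 4194560 0 0 0 0"
    print(parse_stat_line(sample))
```

On the device itself the same field can be read with cat /proc/<pid>/stat. In either state a signal cannot remove the process: a zombie has nothing left to kill, and a process in state D does not act on signals until the blocking kernel call returns.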

The following kernel message is shown every 5 to 10 seconds:

kern.emerg kernel: [1054204.339802] unregister_netdevice: waiting for tnl_010b1f0a to become free. Usage count = 1
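
This message is printed while the kernel waits for the remaining references to a network device to be released before it can finish removing the device. To see which interfaces are affected and how often the message repeats, a saved log (for example the output of logread) can be scanned with a small script. This is a minimal sketch; the function names are hypothetical and the regular expression is written only against the message format quoted above.

```python
import re
from collections import Counter

# Matches the kernel message quoted in this issue.
PATTERN = re.compile(
    r"unregister_netdevice: waiting for (?P<iface>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)


def parse_unregister_message(line):
    """Return (interface name, usage count) or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group("iface"), int(match.group("count"))


def count_by_interface(lines):
    """Count how many matching messages were logged per interface."""
    counts = Counter()
    for line in lines:
        parsed = parse_unregister_message(line)
        if parsed is not None:
            counts[parsed[0]] += 1
    return counts


if __name__ == "__main__":
    line = (
        "kern.emerg kernel: [1054204.339802] unregister_netdevice: "
        "waiting for tnl_010b1f0a to become free. Usage count = 1"
    )
    print(parse_unregister_message(line))
```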

And when the OLSR watchdog runs, I get:


cron.info crond[1532]: USER root pid 26254 cmd /usr/sbin/ff_olsr_watchdog
daemon.err olsrd: /etc/init.d/olsrd: ERROR: there is already an instance of olsrd running (pid: '10264'), not starting.

Since killing the process doesn't work, the only alternative is to reboot the device.

pmelange commented 6 years ago

There is a clear change in memory usage too.

[Image: screenshot_2018-04-02_00-52-13]

SvenRoederer commented 6 years ago

As mentioned by the olsrd maintainers, we should upgrade to olsr 0.9.6.1+.

bobster-galore commented 6 years ago

... and u can't kill a zombie since it's already dead! U might kill everything above and around it, to let the zombie rest in peace.

pmelange commented 6 years ago

This has happened to me 4 more times since I wrote this issue. I am going to replace the ERX-SFP with a WDR4900 to see if the problem persists.

@bobster-galore silly

pmelange commented 5 years ago

This was only a problem with the ERX-SFP

SvenRoederer commented 5 years ago

what caused this on your ERX?

pmelange commented 5 years ago

Unknown. The ERX is no longer being used with OpenWrt. All the info I had I put in the original comment.