servalproject / serval-dna

The Serval Project's core daemon that implements Distributed Numbering Architecture (DNA), MDP, VoMP, Rhizome, MeshMS, etc.
http://servalproject.org

servald becomes nonresponsive for up to several minutes due to long-running alarm/callback functions #46

Closed gardners closed 11 years ago

gardners commented 11 years ago

Noticed on the very slow WR703N OpenWRT boxes. Difficult to debug there, because I haven't figured out how to build servald with full debugging, so that there is a properly useful backtrace.

gardners commented 11 years ago

The main culprit is an alarm that does not have a function name associated with it. If it did, it would be much easier to track down.
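To illustrate why the missing name matters, here is a minimal sketch of timing each alarm callback and reporting the slow ones. The `struct alarm`, `alarm_label`, `run_alarm_timed` and `example_callback` names are hypothetical and are not taken from serval-dna; the point is only that an unnamed alarm can be reported as nothing more useful than "(unnamed)".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical alarm record; NOT serval-dna's real structure. */
struct alarm {
    void (*function)(struct alarm *);
    const char *name;   /* NULL when nobody labelled the alarm */
};

/* Monotonic clock in milliseconds, so wall-clock changes do not skew timings. */
static long long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Label used in log lines; falls back to a placeholder for unnamed alarms. */
const char *alarm_label(const struct alarm *a) {
    return (a && a->name) ? a->name : "(unnamed)";
}

/* Run the callback, return the elapsed milliseconds, and warn on stderr
 * if the callback ran for longer than limit_ms. */
long long run_alarm_timed(struct alarm *a, long long limit_ms) {
    assert(a != NULL && a->function != NULL);
    long long start = now_ms();
    a->function(a);
    long long elapsed = now_ms() - start;
    if (elapsed > limit_ms)
        fprintf(stderr, "WARN: alarm %s took %lldms (limit %lldms)\n",
                alarm_label(a), elapsed, limit_ms);
    return elapsed;
}

/* Trivial callback, only here so the sketch can be exercised. */
void example_callback(struct alarm *a) {
    (void)a;
}
```

With a name attached, the warning points straight at the offending callback; without one, the log only says that something was slow.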

lakeman commented 11 years ago

Add a warning in _schedule, which already has a whence to log the caller location. https://github.com/servalproject/serval-dna/blob/development/fdqueue.c#L87
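A rough sketch of that suggestion, under stated assumptions: the struct layouts, the `WHENCE` macro and the `schedule_sketch` function below are hypothetical stand-ins, not the real definitions in fdqueue.c. The idea is that the scheduling call already receives the caller's source location, so it can say where an unnamed alarm was scheduled from.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the caller-location record ("whence"). */
struct sourceloc {
    const char *file;
    unsigned line;
    const char *function;
};

/* Captures the location of whoever expands the macro. */
#define WHENCE ((struct sourceloc){ __FILE__, __LINE__, __func__ })

/* Hypothetical alarm record; the field names are assumptions. */
struct sched_ent_sketch {
    void (*function)(struct sched_ent_sketch *);
    const char *name;   /* NULL if no function name was associated */
};

/* Warn when an alarm is scheduled without a name, logging the caller's
 * location. Returns 1 if a warning was emitted, 0 otherwise.
 * The actual queueing of the alarm is omitted from this sketch. */
int schedule_sketch(struct sourceloc whence, struct sched_ent_sketch *alarm) {
    assert(alarm != NULL);
    if (alarm->name == NULL) {
        fprintf(stderr, "WARN: %s:%u:%s(): alarm scheduled without a name\n",
                whence.file, whence.line, whence.function);
        return 1;
    }
    return 0;
}

/* Convenience wrapper so callers do not pass the location by hand. */
#define schedule(alarm) schedule_sketch(WHENCE, (alarm))
```

Because the warning fires at scheduling time rather than when the callback finally runs slowly, it identifies the call site even on a build without usable backtraces.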

gardners commented 11 years ago

A sterling idea. It would also be wonderful if I could compile servald with full debugging on OpenWRT, but this seems non-trivial. I did some digging around trying to find out how to do this, and discovered https://forum.openwrt.org/viewtopic.php?id=40147 which confused me for a moment, because the package that they were trying to compile with debugging was ... servald ;) Anyway, off to add the warning...

gardners commented 11 years ago

Issue #49 is perhaps the primary cause of this slowness, and is currently being attacked. The work on that issue may also speed up writing to and reading from blobs, which is currently the most significant source of slowness apart from the creation of blobs.

gardners commented 11 years ago

Seems to be largely solved now, insofar as the OpenWRT WR703Ns don't hang anymore, even when writing/reading lots.

gardners commented 11 years ago

Closing as quick testing reveals that servald remains sufficiently responsive now.