Open gh0st42 opened 8 years ago
mdp trace is a weird beast, designed to give some network diagnostic information even when the routing protocol is under active development and is completely broken. It doesn't work like you might expect ICMP to behave. It was written a long time ago and hasn't been touched since.
The packet grows as it travels. Each node searches the packet to see if it is already listed there. If it is, an attempt is made to forward the packet back to the hop listed prior to itself. Otherwise the packet is forwarded onwards to the next hop in the routing table, unless this node doesn't know which way to send it. This way, if a loop is discovered in the network, or part of the network graph only works one way, the packet might still make it back to the source.
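To make the forwarding rule above concrete, here is a toy simulation of it. This is only a sketch of the description, not the serval-dna code: the `routes` table layout, the `trace` and `chain_routes` names, and the short node names are all made up for illustration. It also assumes that the destination, or a node with no route, turns the packet around by sending it to the hop listed just before itself, which the description doesn't spell out.

```python
def chain_routes(names):
    """Hypothetical helper: routing tables for a simple chain of nodes.

    routes[node][destination] gives the next hop towards that destination.
    """
    routes = {}
    for i, node in enumerate(names):
        routes[node] = {}
        for j, dest in enumerate(names):
            if j > i:
                routes[node][dest] = names[i + 1]
            elif j < i:
                routes[node][dest] = names[i - 1]
    return routes


def trace(routes, source, destination, max_steps=1000):
    """Simulate one trace packet.

    Returns (packet, visited): the identifiers recorded in the packet and
    every node that handled it, in order.
    """
    packet = []    # grows as the packet travels
    visited = []
    node = source
    for _ in range(max_steps):
        visited.append(node)
        if node in packet:
            # Already listed: send back to the hop listed prior to ourselves.
            idx = packet.index(node)
            if idx == 0:
                return packet, visited  # back at the source, trace is done
            node = packet[idx - 1]
            continue
        packet.append(node)
        next_hop = None
        if node != destination:
            next_hop = routes.get(node, {}).get(destination)
        if next_hop is None:
            # Assumption: destination reached or no known route, turn around.
            if len(packet) == 1:
                return packet, visited  # the source itself has no route
            node = packet[-2]
        else:
            node = next_hop
    raise RuntimeError("trace did not terminate")


if __name__ == "__main__":
    routes = chain_routes(["n1", "n2", "n3", "n4"])
    packet, visited = trace(routes, "n1", "n4")
    print("recorded:", packet)
    print("visited: ", visited)
```

With a routing loop (say n2 and n3 each pointing at the other for some destination), the packet arrives at n2 a second time, n2 finds itself in the list and sends the packet back to n1, so the source still gets an answer.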
We could massively increase the number of hops by making a couple of simple changes around here: https://github.com/servalproject/serval-dna/blob/development/overlay_mdp_services.c#L344

1. When we run out of buffer space: ignore the failure, roll back, and make sure the packet is sent back to the previous hop instead of onwards. That should double the maximum hop count, but you won't see the full backwards path.
2. Allow for sending SID abbreviations. Perhaps only when the packet is on the return path. Otherwise the protocol would be less useful for diagnosis.
3. Skip unknown SIDs. We have to be careful here though. Being unable to decode our own SID could lead to a packet storm. We might need the capability to distinguish between "no possible match", "ambiguous match", and "ambiguous but could be me".
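As a rough illustration of the last point, the sketch below classifies an abbreviated SID against the SIDs a node knows about, including its own. Everything here is hypothetical: the `Match` enum, the `classify` function and the short hex strings are invented for the example (real SIDs are much longer), and serval-dna may well structure this differently. A "unique match" case is added alongside the three outcomes named above.

```python
from enum import Enum


class Match(Enum):
    NONE = "no possible match"
    UNIQUE = "unique match"
    AMBIGUOUS = "ambiguous match"
    AMBIGUOUS_SELF = "ambiguous but could be me"


def classify(prefix, my_sid, known_sids):
    """Classify a hex SID abbreviation (hypothetical helper).

    prefix      abbreviated SID read from the packet, as a hex string
    my_sid      this node's own full SID, as a hex string
    known_sids  full SIDs of the other nodes this node has heard of
    """
    prefix = prefix.lower()
    candidates = {sid.lower() for sid in known_sids
                  if sid.lower().startswith(prefix)}
    could_be_me = my_sid.lower().startswith(prefix)
    if could_be_me:
        candidates.add(my_sid.lower())
    if not candidates:
        return Match.NONE
    if len(candidates) == 1:
        return Match.UNIQUE
    return Match.AMBIGUOUS_SELF if could_be_me else Match.AMBIGUOUS


if __name__ == "__main__":
    me = "a1d4"
    known = {"a1b2", "a1c3", "ff00", "ff11"}
    for abbreviation in ("00", "a1b", "ff", "a1"):
        print(abbreviation, "->", classify(abbreviation, me, known).value)
```

The point of keeping "ambiguous but could be me" separate is that a node in that state should not simply skip the entry and forward onwards, since that is the situation that could turn into a packet storm.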
Changing this is a low priority, but patches & test cases are welcome.
Okay, thanks for the quick response. That explains the behaviour. I'm still not quite sure what the best fix would be, but I agree on a low priority, even though it makes debugging complex networks a bit more complicated. Since we are evaluating quite a few different network setups and have discovered a few more bugs, we might as well formalize these tests and provide test cases/network setups for them once we're done with our evaluation.
I did some network tests with serval and discovered some odd behaviour. Details about the setup can be found in the following blog article: http://otg-living.blogspot.de/2016/01/hop-hop-hop-hop-stop.html
Basically I have 18 nodes running serval in one long chain, n1 ... n18. I can use servald mdp ping and reach every single one of the 18 nodes from n1. The problem starts when I try to trace the path.
When I try to reach n17 it all just works (16 hops):