Open jnohlgard opened 8 years ago
I don't know where the proper place for such a check would be in the source, any pointers?
Mhhh… I'm not sure that is correct behavior. Isn't that what the hop limit is supposed to be for (though yes, it is power draining)? @emmanuelsearch @OlegHahm @tcschmidt what do you say?
Now BR will forward packets to leaf via router, and since router does not have leaf in its FIB it will pass the received packet along its default route, which is BR.
What you describe is a classic routing-loop which a routing-protocoll takes care of and prevents/resolves it. If this situation happens, the routing-protocol will eventually resolve it, if not then the protocol would be somehow broken. I'm with @authmillenon, we should not manually tweak something to prevent routing-loops, this should happen by design.
Some preliminary remarks:
@gebart : you want to prevent routing loops by excluding the previous last hop (link-local) from forwarding: this is a bit dangerous (e.g., during routing convergence) and won't really help, as routing loops can occur via several hops and addresses may be global unicast.
For the purpose of a lost leaf, we have ICMP "destination unreachable/no route to host". This would fire in case of a lost route.
If there is a valid default route, though, topology needs maintenance. A routing protocol (that operates with default routes) needs to update its information state about connectivity. In case a link is lost, this should trigger an update. In the case of RPL, there is the additional information about upward/downward, which should not mix.
However, there are routing protocols (like AODV) that do not procure for default routes. If default routes are implemented by other means, there may be conflicts ... some need for additional thinking here.
@tcschmidt I observed this happening between two RIOT nodes on the wireless 6lowpan network (using a sniffer) and there were never any ICMPv6 no route to host or any other errors reported, only that the packet bounced 64 times before the hoplimit killed it. Is this a bug in the RPL implementation then?
Error reporting isn't realized in GNRC yet. However, with https://github.com/RIOT-OS/RIOT/pull/3544 this should be a breeze by just replacing every gnrc_pktbuf_release()
[edit: in IPv6] with the appropriate error send function.
It sure is an inconsistent routing state ... and RPL knows about default routes. I would put RPL under suspect, but details under further investigations. There is of course always a convergence time required and inconsistencies may be "correct" during convergence.
On 12.03.2016 14:17, Joakim Nohlgård wrote:
@tcschmidt https://github.com/tcschmidt I observed this happening between two RIOT nodes on the wireless 6lowpan network (using a sniffer) and there were never any ICMPv6 no route to host or any other errors reported, only that the packet bounced 64 times before the hoplimit killed it. Is this a bug in the RPL implementation then?
— Reply to this email directly or view it on GitHub https://github.com/RIOT-OS/RIOT/issues/5051#issuecomment-195739963.
@gebart I checked back with the RPL spec. In general, RPL is not loop-free and does not provide rapid convergence. However, there are simple loop avoidance mechanisms (accounting for tree traversal directions), see https://tools.ietf.org/html/rfc6550#section-11.2 I'm not sure whether loop avoidance is implemented - @cgundogan ?
The problem here is that the 6lowpan RFC is very clear on what is supposed to happen if there's no neighbor cache entry: forward the packet to any router available. For a 6lowpan node, that makes sense. For a multihomed node (e.g., the 6lbr) it doesn't. The RFC says "send the packet to any available router". Sending it back to router the packet came from is clearly wrong. IMHO this is a case where the RFC is unclear. For multihomed nodes, a 6lo node should send it to any router if it has only the 6lowpan network, a 6lbr should either try to send it directly to the target's l2 address or drop the packet with "no route to host".
I've implemented that in [1].
[1] https://github.com/kaspar030/RIOT/tree/introduce_new_ipv6_forwarding
There is actually a mention of the exact behaviour in RFC6550 (RPL) page 100
Note that the chosen successor MUST NOT be the neighbor that was the predecessor of the packet (split horizon), except in the case where it is intended for the packet to change from an Upward to a Downward direction, as determined by the routing table of the node making the change, such as switching from DIO routes to DAO routes as the destination is neared in order to continue traveling toward the destination.
Split horizon: https://en.wikipedia.org/wiki/Split_horizon_route_advertisement
That is actually a bit strange, because split horizon is typically a method for route propagation (and the paragraph refers to DIO/DAO changes). Normally, this behaviour is part of the routing daemon and not part of the regular packet forwarding (in particular, as regular packets don't see the last hop on the IP-level). Also, the Section 11.1 is called "Suggestions for Packet Forwarding" - I guess, this requires more digging into the details.
On 12.03.2016 16:01, Joakim Nohlgård wrote:
There is actually a mention of the exact behaviour in RFC6550 (RPL) page 100 <tools.ietf.org/html/rfc6550#page-100>
Note that the chosen successor MUST NOT be the neighbor that was the predecessor of the packet (split horizon), except in the case where it is intended for the packet to change from an Upward to a Downward direction, as determined by the routing table of the node making the change, such as switching from DIO routes to DAO routes as the destination is neared in order to continue traveling toward the destination.
Split horizon: https://en.wikipedia.org/wiki/Split_horizon_route_advertisement
— Reply to this email directly or view it on GitHub https://github.com/RIOT-OS/RIOT/issues/5051#issuecomment-195753547.
I don't see the relation here: the problem of @gebart seems neither related to 6lbr (only to multi-hop routing), nor to neighbor caches. It is a L3 forwarding decision (use default in the absence of a more specific) - at inconsistent routing tables.
On 12.03.2016 15:41, Kaspar Schleiser wrote:
The problem here is that the 6lowpan RFC is very clear on what is supposed to happen if there's no neighbor cache entry: forward the packet to /any/ router available. For a 6lowpan node, that makes sense. For a multihomed node (e.g., the 6lbr) it doesn't. The RFC says "send the packet to any available router. Sending it back to router the packet came from is clearly wrong. IMHO this is a case where the RFC is unclear. For multihomed nodes, a 6lo node should send it to /any/ router if it has only the 6lowpan network, a 6lbr should either try to send it directly to the target's l2 address or drop the packet with "no route to host".
I've implemented that in [1].
[1] https://github.com/kaspar030/RIOT/tree/introduce_new_ipv6_forwarding
— Reply to this email directly or view it on GitHub https://github.com/RIOT-OS/RIOT/issues/5051#issuecomment-195752319.
won't be fixed for this release -> postponed
won't be fixed for this release -> postponed
This is not part of a milestone anymore but I'd ask @miri64 if this issue was somehow addressed by recent networking PRs.
This might have been fixed with the recent changes to RPL. @cgundogan is it ?
does someone know if this is still an issue?
@cgundogan ?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.
Ping?
Network scenario: BR -> router -> leaf
BR knows that leaf is behind router, however for some reason the leaf node goes missing and router drops it from its fib (e.g. by router reboot, or clock drift between BR and router so that router drops the route before BR does).
Now BR will forward packets to leaf via router, and since router does not have leaf in its FIB it will pass the received packet along its default route, which is BR. Now the packet starts bouncing until TTL is reached. This will lead to excessive power drain and occupying wireless bandwidth for no good reason.
IMO there should be a check somewhere in the routing decisions that drops a packet instead of forwarding if the routing decision is to forward it to the same link layer address from where it was received.