Forwarding a packet back to its link layer source should not be allowed

jnohlgard commented 8 years ago

Network scenario: BR -> router -> leaf

BR knows that leaf is behind router, however for some reason the leaf node goes missing and router drops it from its fib (e.g. by router reboot, or clock drift between BR and router so that router drops the route before BR does).

Now BR will forward packets to leaf via router, and since router does not have leaf in its FIB it will pass the received packet along its default route, which is BR. Now the packet starts bouncing until TTL is reached. This will lead to excessive power drain and occupying wireless bandwidth for no good reason.

IMO there should be a check somewhere in the routing decisions that drops a packet instead of forwarding if the routing decision is to forward it to the same link layer address from where it was received.

jnohlgard commented 8 years ago

I don't know where the proper place for such a check would be in the source, any pointers?

miri64 commented 8 years ago

Mhhh… I'm not sure that is correct behavior. Isn't that what the hop limit is supposed to be for (though yes, it is power draining)? @emmanuelsearch @OlegHahm @tcschmidt what do you say?

BytesGalore commented 8 years ago

Now BR will forward packets to leaf via router, and since router does not have leaf in its FIB it will pass the received packet along its default route, which is BR.

What you describe is a classic routing-loop which a routing-protocoll takes care of and prevents/resolves it. If this situation happens, the routing-protocol will eventually resolve it, if not then the protocol would be somehow broken. I'm with @authmillenon, we should not manually tweak something to prevent routing-loops, this should happen by design.

tcschmidt commented 8 years ago

Some preliminary remarks:

@gebart : you want to prevent routing loops by excluding the previous last hop (link-local) from forwarding: this is a bit dangerous (e.g., during routing convergence) and won't really help, as routing loops can occur via several hops and addresses may be global unicast.

For the purpose of a lost leaf, we have ICMP "destination unreachable/no route to host". This would fire in case of a lost route.

If there is a valid default route, though, topology needs maintenance. A routing protocol (that operates with default routes) needs to update its information state about connectivity. In case a link is lost, this should trigger an update. In the case of RPL, there is the additional information about upward/downward, which should not mix.

However, there are routing protocols (like AODV) that do not procure for default routes. If default routes are implemented by other means, there may be conflicts ... some need for additional thinking here.

jnohlgard commented 8 years ago

@tcschmidt I observed this happening between two RIOT nodes on the wireless 6lowpan network (using a sniffer) and there were never any ICMPv6 no route to host or any other errors reported, only that the packet bounced 64 times before the hoplimit killed it. Is this a bug in the RPL implementation then?

miri64 commented 8 years ago

Error reporting isn't realized in GNRC yet. However, with https://github.com/RIOT-OS/RIOT/pull/3544 this should be a breeze by just replacing every gnrc_pktbuf_release() [edit: in IPv6] with the appropriate error send function.

tcschmidt commented 8 years ago

It sure is an inconsistent routing state ... and RPL knows about default routes. I would put RPL under suspect, but details under further investigations. There is of course always a convergence time required and inconsistencies may be "correct" during convergence.

On 12.03.2016 14:17, Joakim Nohlgård wrote:

@tcschmidt https://github.com/tcschmidt I observed this happening between two RIOT nodes on the wireless 6lowpan network (using a sniffer) and there were never any ICMPv6 no route to host or any other errors reported, only that the packet bounced 64 times before the hoplimit killed it. Is this a bug in the RPL implementation then?

— Reply to this email directly or view it on GitHub https://github.com/RIOT-OS/RIOT/issues/5051#issuecomment-195739963.

tcschmidt commented 8 years ago

@gebart I checked back with the RPL spec. In general, RPL is not loop-free and does not provide rapid convergence. However, there are simple loop avoidance mechanisms (accounting for tree traversal directions), see https://tools.ietf.org/html/rfc6550#section-11.2 I'm not sure whether loop avoidance is implemented - @cgundogan ?

kaspar030 commented 8 years ago

The problem here is that the 6lowpan RFC is very clear on what is supposed to happen if there's no neighbor cache entry: forward the packet to any router available. For a 6lowpan node, that makes sense. For a multihomed node (e.g., the 6lbr) it doesn't. The RFC says "send the packet to any available router". Sending it back to router the packet came from is clearly wrong. IMHO this is a case where the RFC is unclear. For multihomed nodes, a 6lo node should send it to any router if it has only the 6lowpan network, a 6lbr should either try to send it directly to the target's l2 address or drop the packet with "no route to host".

I've implemented that in [1].

[1] https://github.com/kaspar030/RIOT/tree/introduce_new_ipv6_forwarding

jnohlgard commented 8 years ago

There is actually a mention of the exact behaviour in RFC6550 (RPL) page 100

Note that the chosen successor MUST NOT be the neighbor that was the predecessor of the packet (split horizon), except in the case where it is intended for the packet to change from an Upward to a Downward direction, as determined by the routing table of the node making the change, such as switching from DIO routes to DAO routes as the destination is neared in order to continue traveling toward the destination.

Split horizon: https://en.wikipedia.org/wiki/Split_horizon_route_advertisement

tcschmidt commented 8 years ago

That is actually a bit strange, because split horizon is typically a method for route propagation (and the paragraph refers to DIO/DAO changes). Normally, this behaviour is part of the routing daemon and not part of the regular packet forwarding (in particular, as regular packets don't see the last hop on the IP-level). Also, the Section 11.1 is called "Suggestions for Packet Forwarding" - I guess, this requires more digging into the details.

On 12.03.2016 16:01, Joakim Nohlgård wrote:

There is actually a mention of the exact behaviour in RFC6550 (RPL) page 100 <tools.ietf.org/html/rfc6550#page-100>
Note that the chosen successor MUST NOT be the neighbor that was the
predecessor of the packet (split horizon), except in the case where
it is intended for the packet to change from an Upward to a Downward
direction, as determined by the routing table of the node making the
change, such as switching from DIO routes to DAO routes as the
destination is neared in order to continue traveling toward the
destination.
Split horizon: https://en.wikipedia.org/wiki/Split_horizon_route_advertisement

— Reply to this email directly or view it on GitHub https://github.com/RIOT-OS/RIOT/issues/5051#issuecomment-195753547.

tcschmidt commented 8 years ago

I don't see the relation here: the problem of @gebart seems neither related to 6lbr (only to multi-hop routing), nor to neighbor caches. It is a L3 forwarding decision (use default in the absence of a more specific) - at inconsistent routing tables.

On 12.03.2016 15:41, Kaspar Schleiser wrote:

The problem here is that the 6lowpan RFC is very clear on what is supposed to happen if there's no neighbor cache entry: forward the packet to /any/ router available. For a 6lowpan node, that makes sense. For a multihomed node (e.g., the 6lbr) it doesn't. The RFC says "send the packet to any available router. Sending it back to router the packet came from is clearly wrong. IMHO this is a case where the RFC is unclear. For multihomed nodes, a 6lo node should send it to /any/ router if it has only the 6lowpan network, a 6lbr should either try to send it directly to the target's l2 address or drop the packet with "no route to host".

I've implemented that in [1].

[1] https://github.com/kaspar030/RIOT/tree/introduce_new_ipv6_forwarding

— Reply to this email directly or view it on GitHub https://github.com/RIOT-OS/RIOT/issues/5051#issuecomment-195752319.

haukepetersen commented 8 years ago

won't be fixed for this release -> postponed

kaspar030 commented 8 years ago

won't be fixed for this release -> postponed

kYc0o commented 6 years ago

This is not part of a milestone anymore but I'd ask @miri64 if this issue was somehow addressed by recent networking PRs.

miri64 commented 6 years ago

This might have been fixed with the recent changes to RPL. @cgundogan is it ?

jia200x commented 5 years ago

does someone know if this is still an issue?

miri64 commented 5 years ago

@cgundogan ?

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions.

maribu commented 2 years ago

Ping?

RIOT-OS / RIOT

Forwarding a packet back to its link layer source should not be allowed #5051