weaveworks / weave

Simple, resilient multi-host containers networking and more.
https://www.weave.works
Apache License 2.0
6.61k stars 668 forks source link

don't lie about source of injected icmp 3.4 #3

Open rade opened 10 years ago

rade commented 10 years ago

In order to fix issue #1, we are now "lying" about the origin of icmp 3.4 packets. This is somewhat distasteful and could also cause real issues since now a sender may get multiple icmp 3.4 purporting to come from the same ip, successively lowering the mtu, when in fact they were issued from the same hop.

As @msackman commented in issue #1, we could equip weave with an ip in each sub-net, and then pick the ip based on the the sub-net of the dst. The obvious downside here is that we need to add an ip whenever a sub-net is added, so sub-net addition suddenly becomes more involved.

So we really want to come up with a different solution. One starting point might be to investigate further why the packets got dropped in the first place. After all, their dst ip was in the correct sub-net, and the src ip was in the weave sub-net.

dpw commented 9 years ago

My opinion on this is that, given weave's current approach, the current behaviour is the least worst option (with the exception of broadcast frames as discussed in #419). The lie about the source of injected frag-needed ICMP packets comes about because by fragmenting packets, weave makes a lie of its claim to be an L2 network. Introducing intermediate addresses would only compound that lie. It would be better to investigate approaches that make weave a more honest L2 network, or involve the explicit incorporation of L3 features.

dpw commented 9 years ago

One starting point might be to investigate further why the packets got dropped in the first place. After all, their dst ip was in the correct sub-net, and the src ip was in the weave sub-net.

Reverse path filtering, I'd guess.

rade commented 9 years ago

The lie about the source of injected frag-needed ICMP packets comes about because by fragmenting packets, weave makes a lie of its claim to be an L2 network.

That's not quite accurate; the injection of "frag-needed" happens only when weave is not doing its own fragmentation, i.e. when "do not fragment" is set and the packet is too large.

dpw commented 9 years ago

The lie about the source of injected frag-needed ICMP packets comes about because by fragmenting packets, weave makes a lie of its claim to be an L2 network.

That's not quite accurate; the injection of "frag-needed" happens only when weave is not doing its own fragmentation, i.e. when "do not fragment" is set and the packet is too large.

They might not occur together for a particular packet, but the one arises because of the other: If weave did not have its IP fragmentation feature, it would also never need to inject frag-needed packets.

rade commented 9 years ago

If weave did not have its IP fragmentation feature, it would also never need to inject frag-needed packets.

That is incorrect.

Weave only fragments packets itself when a) the packet wasn't sent with "do not fragment", and b) the packet is too large, and c) weave has determined that the network stack does not perform fragmentation correctly.

By contrast, "frag needed" is injected when "do not fragment" is set and the packet is too large.

If weave could always rely on the network stack to fragment too-large non-df packets, then weave would never need to fragment packets. Yet it would still need to inject "frag needed".