Closed bemasc closed 1 year ago
Sorry, but RFC 7383 has discussed this fragmentation problem in IKEv2. PMTUD should start after IKE_SA_INIT, as Section 2.5.2 says, not after CHILD_SA_INIT, which I could not find in the IKEv2 protocol | RFC 7296.
RFC 7383 says that in most cases only the IKE_AUTH phase needs the PMTU probe. And shall we distinguish IPv4 and IPv6 for the MTU? The 1280 B limit is the minimum MTU for IPv6. For IPv4, the minimum MTU is 576 B, as described in Section 2.5.1 | RFC 7383.
I also found an active draft for IKEv2 MTU detection.
Well, in the data plane we don't modify the AH/ESP header if we replace RISAV-AH with the standard AH (#24), so we can use PMTUD directly or the traditional IPsec configurations. In other words, I think RISAV is still a standard IPsec in the data plane.
And here is what Cisco recommends for the MTU configuration of GRE + IPsec with IPv4 fragmentation, which I think reflects best current practice: pmtud-ipfrag.
And shall we distinguish IPv4 and IPv6 for the MTU?
Yes, I was just using IPv6 numbers for simplicity.
RFC 7383 has discussed this fragmentation problem in IKEv2
Interesting, thanks. We should definitely consider RFC 7383 Section 2.5.2 when writing about PMTU discovery. However, I think the requirements here are different, because I am proposing to use the resulting MTU estimate on the data plane. For example, RFC 7383 recommends using very approximate PMTUD, because only a small amount of data is transferred, but in this case the discovered MTU will apply to a large amount of data.
I think RISAV is still a standard IPsec in the data plane.
Yes, this MTU logic doesn't alter the ordinary operation of the data plane (although the proposed ICMP rewriting for AH is novel).
And here is what Cisco recommends for the MTU
Thanks, that reminds me about TCP MSS. I've edited the proposal to include MSS clamping.
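As a rough illustration of MSS clamping, the sketch below computes the largest TCP MSS that still fits a given path MTU once the base IP and TCP headers and any per-packet encapsulation overhead are subtracted. The function name and the `overhead` parameter are illustrative assumptions, not part of the proposal, and the header sizes are the option-free base sizes.

```python
IPV4_HEADER = 20  # base IPv4 header, no options
IPV6_HEADER = 40  # fixed IPv6 header, no extension headers
TCP_HEADER = 20   # base TCP header, no options


def clamped_mss(advertised_mss: int, path_mtu: int,
                ipv6: bool = True, overhead: int = 0) -> int:
    """Return the MSS to write into a forwarded SYN (hypothetical helper).

    `overhead` is the number of bytes the gateway adds to each packet,
    an assumption supplied by the caller.
    """
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    max_mss = path_mtu - overhead - ip_header - TCP_HEADER
    if max_mss <= 0:
        raise ValueError("path MTU too small for any TCP payload")
    # Never raise the MSS the endpoint asked for; only lower it.
    return min(advertised_mss, max_mss)
```

For example, with a 1280-byte path MTU over IPv6 and 24 bytes of added header, an advertised MSS of 1440 would be clamped to 1196.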
I mentioned earlier that Packet Too Big (PTB) ICMP messages are easily forged by an off-path attacker. During normal PMTUD, PTB forgery is prevented by the entropy of the original packet, which must be echoed in the response. If the PTB message doesn't match a packet that was recently sent, it is ignored. The ASBR cannot apply this defense, because it has no memory of the packet that was sent (which may not have even passed through this ASBR, if ECMP clustering or multicore implementation is in use).
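To make the ordinary endpoint defense concrete, here is a minimal sketch of how a sender might check that the packet echoed in a PTB message matches something it recently sent. The class name, the buffer size, and the 48-byte comparison prefix are all assumptions chosen for illustration; this is the state that an ASBR, as described above, does not have.

```python
from collections import deque


class RecentPackets:
    """Remembers a prefix of recently sent packets (illustrative only)."""

    def __init__(self, capacity: int = 1024, prefix_len: int = 48,
                 min_len: int = 28) -> None:
        self._prefixes = deque(maxlen=capacity)  # oldest entries fall off
        self._prefix_len = prefix_len
        self._min_len = min_len  # refuse to match on too few echoed bytes

    def record(self, packet: bytes) -> None:
        self._prefixes.append(packet[: self._prefix_len])

    def matches(self, echoed: bytes) -> bool:
        """True if the echoed bytes agree with some recently sent packet."""
        if len(echoed) < self._min_len:
            return False
        key = echoed[: self._prefix_len]
        return any(p[: len(key)] == key for p in self._prefixes)
```

A PTB whose echoed bytes fail `matches` would simply be ignored, which is what makes blind off-path forgery hard for a stateful sender.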
Maybe a better solution is to say: if an ASBR receives a PTB response indicating a PMTU that is less than the current MTU estimate, it performs its own explicit PMTUD. This is not vulnerable to off-path attacks. However, it is somewhat unusual: we want the "inter-AS" PMTU, without regard to the packet handling inside the target AS. Therefore, the measurement proceeds as follows:
This PMTU value is the one used to update the current PMTU estimate, etc.
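The control flow suggested here could look roughly like the sketch below: a PTB that claims a smaller MTU is treated only as a hint, and the estimate changes only on the result of the ASBR's own measurement. The measurement procedure itself is not modeled; `probe` is a hypothetical callable standing in for it, and the choice to only ever lower the estimate is an assumption of this sketch.

```python
from typing import Callable


class PmtuEstimate:
    """Per-peer MTU estimate that never trusts a PTB directly (sketch)."""

    def __init__(self, initial_mtu: int, probe: Callable[[], int]) -> None:
        self.mtu = initial_mtu
        self._probe = probe  # hypothetical: runs explicit PMTUD, returns bytes

    def on_packet_too_big(self, reported_mtu: int) -> int:
        if reported_mtu >= self.mtu:
            # The PTB does not claim anything smaller; nothing to verify.
            return self.mtu
        # The reported value is only a trigger; use the measured one.
        measured = self._probe()
        self.mtu = min(self.mtu, measured)
        return self.mtu
```

With this shape, a forged PTB costs the ASBR one probe but cannot lower the estimate unless the measurement agrees.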
When I send a traceroute packet, I may get a private IP address. I tried to traceroute google.com with an online mtr, and the result shows that the first 4 responding IPs are all private IPs. Of course, private IPs must be used inside the local network. I don't know if this is particular to CERNET or the same in most ISPs' networks. Maybe I should also compile mtr on my local machine :joy:.
In this case, the sending ASBR is using a public source IP address, so the entire traceroute will return public IP addresses.
I think we can use "authenticated rejection" (#6) to solve the MTU problem:
[...] minMTU), which must be at least 1280.
[...] ACS_A to ACS_B. We call this value MTU[A->B].
[...] MTU[A->B] - 24 >= minMTU[A], A can offer RISAV to B in transport mode. If MTU[A->B] - 73 >= minMTU[A], it can also offer tunnel mode.
[...] MTU[B->A]. It accepts only SAs that are compatible with minMTU[B], and rejects the others. If none are allowable, they are all rejected.
[...] minMTU, it should inform the ACS. The ACS can then reduce its estimate of MTU[X->Y] and use IKEv2 to terminate this SA if that value is too small.
In transport mode, ICMP Packet Too Big responses are forwarded through the ASBR, stripping the echoed RISAV-AH header. This allows PMTUD to work as usual for the endpoints.
In tunnel mode, each ASBR maintains a PMTU estimate for each SA, which is initialized to the MTU[X->Y] value used during the handshake. Packets exceeding this size are dropped and produce a Packet Too Big response from the ASBR. If the ASBR receives a Packet Too Big response for its own IPsec packets, it reduces its local MTU estimate for this SA. (This can happen if the initial MTU estimate is wrong for this path, or the path MTU changes.) ASBRs MAY also run their own PMTUD for their SAs.
This arrangement ensures that AS pairs with a consistent inter-AS MTU never reduce the end-to-end MTU below the value that is intended by either AS. If the MTU is variable or heterogeneous, this arrangement ensures that PMTUD continues to work correctly for endpoints. If the MTU on any path falls below the required minimum, RISAV will be disabled within ~1 second. The addition of MSS clamping ensures that non-PMTUD-capable TCP clients don't attempt to use large packets that will not work.
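The offer rule from the handshake steps above can be written down as a small check. This is only a sketch: the 24-byte and 73-byte overheads and the 1280 floor come from the proposal text, while the function and constant names are made up for illustration.

```python
from typing import List

TRANSPORT_OVERHEAD = 24  # bytes, from "MTU[A->B] - 24 >= minMTU[A]"
TUNNEL_OVERHEAD = 73     # bytes, from "MTU[A->B] - 73 >= minMTU[A]"
MIN_ALLOWED_MTU = 1280   # minMTU must be at least this


def offerable_modes(path_mtu: int, min_mtu: int) -> List[str]:
    """Return the RISAV modes A may offer to B (hypothetical helper).

    path_mtu is A's estimate MTU[A->B]; min_mtu is minMTU[A].
    """
    if min_mtu < MIN_ALLOWED_MTU:
        raise ValueError("minMTU must be at least 1280")
    modes = []
    if path_mtu - TRANSPORT_OVERHEAD >= min_mtu:
        modes.append("transport")
    if path_mtu - TUNNEL_OVERHEAD >= min_mtu:
        modes.append("tunnel")
    return modes
```

With minMTU = 1280, a 1500-byte inter-AS MTU allows both modes, a 1310-byte MTU allows only transport mode, and a 1290-byte MTU allows neither, so no SA would be offered.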
My biggest concern about this approach is that it enables some downgrade attacks:
A. A transit provider could simply reduce the actual MTU to 1280 in order to cause RISAV to be disabled automatically.
B. An off-path attacker could send a Packet Too Big response that contains a packet from the other AS.
Attack A is probably acceptable for now (i.e. out of scope). Attack B is more concerning, and I'm not sure how to mitigate it.