* Section 7: should OCS be mandatory under circumstances other than UDP CS <> 0?

tsvwg / draft-ietf-tsvwg-udp-options

* Section 7: should OCS be mandatory under circumstances other than UDP CS <> 0? #12

Closed Mike-Heard closed 7 months ago

Mike-Heard commented 12 months ago

The issue was raised by Tom Herbert and also by Carlos Pignataro during his INTAREA early review. See:

https://mailarchive.ietf.org/arch/msg/tsvwg/nwo2O06anyy27kpF1B4AYJtY-kI/

Mike Heard defends a contrary view in:

https://mailarchive.ietf.org/arch/msg/tsvwg/FNXIJdJpc7Su5BhyPHCTIxgec5w/

Tom Herbert (In https://mailarchive.ietf.org/arch/msg/tsvwg/V_oIxX8itTU0LPAI7P6jehCi_kw/) suggested this:

If a zero checksum in the surplus area is allowed then IMO there should be requirements in the draft similar to those of RFC6936 but specific to the risks and mitigation for a zero checksum in the surplus area.

jtouch commented 9 months ago

Added a paragraph to the OCS section to address this in -23.

gorryfair commented 9 months ago

Sorry, I'm not so sure we have resolved this yet. I see -23 added some analysis which helps unpick the issue: “The benefits are similar to allowing UDP checksums to be zero, but the risks differ. OCS is additionally important to ensure packets with UDP options can traverse misbehaving middleboxes [Zu20]. When the cost of computing OCS is negligible, it is better to use OCS to ensure such traversal. In cases where such traversal risks can safely be ignored, such as controlled environments, over paths where traversal is validated, or where upper layer protocols (applications, libraries, etc.) can adapt (by enabling OCS when packet exchange fails), and when bit errors at the UDP layer would be detected by other layers (as with the UDP checksum) OCS can be disabled, e.g., to conserve energy or processing resources or when it can improve performance. This is why zeroing OCS is only safe when UDP checksum is also zero, but why OCS might still be used in that case.”

My recollection was that it was agreed that OCS would be required, and -23 has now listed ways in which this might not be required, so I am going to push back by saying I think this new flexibility is undesirable in the spec.

(a) The new text says: “When the cost of computing OCS is negligible, it is better to use OCS to ensure such traversal.” I don’t agree that we said only when the cost was negligible.

(b) I can see implied similarity to RFC6936 - but in that case the principal exception was for in-network tunnels where the encaps/decaps had no access to the payload of the packet it was processing. That is not the case for UDP Options, which is end-to-end.

(c) Specific examples: “such as controlled environments”. I think use in controlled environments is fine, provided there is a way to restrict packets using this to only those cases … in reality this applies to any spec, so we could equally not say that.

(d) “over paths where traversal is validated, or where upper layer protocols (applications, libraries, etc.) can adapt (by enabling OCS when packet exchange fails), and when bit errors at the UDP layer would be detected by other layers (as with the UDP checksum) OCS can be disabled, e.g., to conserve energy or processing resources or when it can improve performance.” I do not agree with this design for upper layer protocols (applications, libraries, etc.) to adapt (by enabling OCS when packet exchange fails) … if we say this, we ought to also define how such probing is used to detect path changes and help the endpoint rebuild the path - this sounds a lot like keep-alive functions etc. Why is this good design?

(e) I question whether this is likely to substantially improve performance when you already have to marshal the data into the options space. How important, actually, is the checksum calculation for an OCS that only covers the surplus area?

(f) I also have comments on the text on bit errors and other ways to detect them. UDP checksums provide very weak detection of bit errors; they mainly protect from data marshalling mistakes - and we ought not to be even hinting at this. It does not explain how other layers will provide this.
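
For readers weighing the cost question in (e), here is a minimal sketch of a 16-bit ones'-complement Internet checksum in the style of RFC 1071, applied to an arbitrary byte string. It assumes OCS is computed with this kind of checksum; the exact bytes OCS covers (and any length pseudo-header) are defined by the draft and are not reproduced here. The function name `internet_checksum` is hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071.

    Illustration only: what OCS actually covers is defined by the draft.
    """
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


# Example bytes taken from the worked example in RFC 1071.
surplus = bytes.fromhex("0001f203f4f5f6f7")
ocs = internet_checksum(surplus)
```

The work is one pass over the covered bytes, so the cost scales with the size of the surplus area rather than with the whole packet.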

jtouch commented 9 months ago

First, my understanding of consensus was - as the title of this thread says - that OCS was mandatory when UDP CS<>0. However, the pseudocode had mentioned that there was no benefit to OCS as middlebox protection OR bit error protection OR even legacy endpoint protection when doing fragmentation/reassembly, because the packet-level OCS is redundant with the frag-level OCS: bit errors would be caught by the frag-level OCS; the packet-level OCS is handled only at the endpoints, so it doesn't help with middlebox traversal; and legacy protections don't need to apply because legacy receivers wouldn't be doing reassembly.

My understanding of the ASK was to explain better how OCS != 0 was not quite the same as CS !=0, which I did. I also explained the issue above (in this comment) better, which was already in the pseudocode - and arguably is equivalent to the UDP CS=0 case for tunnels, i.e., where other protections are enough and OCS is redundant.

The only thing that changed was the OCS field is required.

tompandadev commented 9 months ago

Per the subject, the ASK is whether the surplus area checksum should be mandatory in ALL use cases of UDP Options.

Requiring the checksum minimizes the probability of misinterpreting the surplus area as UDP Options when it contains something else. It may contain other, unrelated data because there was never a prohibition against putting other data in the surplus space, and what is not explicitly forbidden is allowed.

jtouch commented 9 months ago

It's definitely not needed in the original packet if fragmentation is used. Its purpose was never really to protect against other uses of the area; there have been no reported uses and every other aspect of the options would have to be valid for misinterpretation to occur anyway. I.e., the entire option serves as somewhat of a check. As to future uses, there should not be any as this document defines the entire area for use by UDP options only (that's why this updates RFC 768).

We've been over this before in the WG.

gorryfair commented 9 months ago

I think you have this the wrong way around - if we wish the fragments to traverse an Internet Path, then it would be wise to include the OCS. That helps traversal and helps assure that this is a UDP-Options area.

jtouch commented 9 months ago

In a fragmented UDP packet, there are two types of OCS uses: the OCS in the overall packet and the OCS in each fragment. The fragment OCS is needed for traversal AND protects the user data (because it's in the surplus area of each FRAG) but the overall OCS has no impact on traversal or data protection at all. That's why it's both not useful and similar to UDP CS in a tunnel with other data protection, and why it can be omitted (set to zero) for the overall datagram (but NOT the FRAGs).

tompandadev commented 9 months ago

On Mon, Sep 18, 2023 at 7:45 AM Joe Touch @.***> wrote:

In a fragmented UDP packet, there are two types of OCS uses: the OCS in the overall packet and the OCS in each fragment. The fragment OCS is needed for traversal AND protects the user data (because it's in the surplus area of each FRAG) but the overall OCS has no impact on traversal or data protection at all. That's why it's both not useful and similar to UDP CS in a tunnel with other data protection, and why it can be omitted (set to zero) for the overall datagram (but NOT the FRAGs).

Joe,

Okay, then the OCS should be required for each fragment. It's already required for IPv6 packets per the draft since if UDP CS is non-zero then the OCS is required and UDP CS is required for UDP in IPv6. There is no performance cost since devices will be able to offload the checksum, and it provides a strong check against misinterpretation of the surplus area even in the case UDP Length equals eight.

Tom


jtouch commented 9 months ago

UDP CS<>0 is NOT required for IPv6 because there's an exception in RFC6936. We have the same exception for different reasons at the UDP packet layer, but not at the FRAG layer. For fragments (where UDP length ==8 AND IP length indicates a payload longer than the UDP length indicates), then (MUST) OCS!=0.

NB: there are other cases where UDP length=8, i.e., where there are no UDP options at all. UDP allows such "no data" packets so we have to be clear that it's not just length=8 that triggers OCS rules.
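
The distinction above can be sketched as follows: the surplus area is whatever lies between the end of the UDP datagram (per UDP Length) and the end of the IP payload, so UDP Length == 8 alone says nothing about whether options are present. This is a minimal sketch; the function names are hypothetical and the IP payload length is assumed to have been derived from the IP header already.

```python
UDP_HEADER_LEN = 8  # fixed UDP header size in bytes


def surplus_length(ip_payload_len: int, udp_length: int) -> int:
    """Bytes that follow the UDP datagram but are still inside the IP payload."""
    if udp_length < UDP_HEADER_LEN or udp_length > ip_payload_len:
        raise ValueError("inconsistent UDP Length / IP payload length")
    return ip_payload_len - udp_length


def empty_payload_with_surplus(ip_payload_len: int, udp_length: int) -> bool:
    """True only when UDP Length == 8 AND the IP payload extends past it.

    A plain "no data" UDP packet (UDP Length == 8, no surplus) returns False.
    """
    return (udp_length == UDP_HEADER_LEN
            and surplus_length(ip_payload_len, udp_length) > 0)
```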

Performance cost cannot be asserted; not all devices have offload support.

I can make that more clear in -24.

tompandadev commented 9 months ago

On Mon, Sep 18, 2023 at 12:49 PM Joe Touch @.***> wrote:

UDP CS<>0 is NOT required for IPv6 because there's an exception in RFC6936.

UDP CS <> 0 IS REQUIRED for IPv6. From STD 86, RFC8200 section 8.1:

"Unlike IPv4, the default behavior when UDP packets are originated by an IPv6 node is that the UDP checksum is not optional. That is, whenever originating a UDP packet, an IPv6 node must compute a UDP checksum over the packet and the pseudo-header"

RFC6936 is an exception for the use case of UDP tunnels. However the use case is intended to be narrow, and just using a UDP tunneling protocol is not sufficient. The ten requirements in section 5 of RFC6936 need to be satisfied in the context of implementation and deployment. And even with all of that, specific tunneling protocols still need their own requirements to allow a zero UDP checksum like in section 6.2 of RFC8086.

We have the same exception for different reasons at the UDP packet layer, but not at the FRAG layer. For fragments (where UDP length ==8 AND IP length indicates a payload longer than the UDP length indicates), then (MUST) OCS!=0.

NB: there are other cases where UDP length=8, i.e., where there are no UDP options at all. UDP allows such "no data" packets so we have to be clear that it's not just length=8 that triggers OCS rules.

Performance cost cannot be asserted; not all devices have offload support.

I claim it can be. The other major transport protocol that has options is TCP, and TCP has required a checksum that covers the pseudo header, TCP header including options, and TCP payload from day one. Offload support is widespread and I don't see anyone complaining or asking to allow the TCP checksum to be optional.

Regardless of any other considerations, the real question is whether it is robust to make the OCS optional. The draft states that "The primary purpose of the OCS is to detect non-standard (i.e., non-option) uses of that area". If the OCS is optional then when it's not set it can't detect non-standard uses which potentially allows misinterpretation by a receiver and possible detrimental effects.

The justification that it's okay not to use the surplus checksum is because we don't know of any pre-existing use cases. That argument is based on anecdotal empirical data which only reports on a small fraction of the Internet, we cannot conclusively know that there are no use cases. The problem is that if a user ever hits some other use case and UDP Options are enabled which leads to detrimental behaviors like data corruption, then that's a bug. And note, this wouldn't be a bug in the implementation or configuration-- it would be a bug in the protocol itself.

IMO, allowing the OCS to be optional on the basis that we think it probably won't ever be a problem, doesn't meet the bar for robustness for an IETF Standards Track protocol. I believe the required checksum would meet the bar, or a magic number with enough bits could meet the bar also if the computational cost of the checksum really is a concern.

Tom


jtouch commented 9 months ago

We've had these discussions before. This is a User option space - if users want to disable it, they should be allowed. Specific cases where it makes sense to disable OCS at the packet level:

when FRAG is used, each FRAG has its own OCS, which covers the entire packet AND the options (in the last frag(s)).
when AUTH or UENC is/will be used, which are typically much stronger.

Every protocol has abuses - squatting on option codepoints (TCP) or whole services that don't belong there (ports). We don't do anything systematic to prevent or detect that - and those are KNOWN uses. There are no known uses of the option area other than as defined here, so what phantom are you trying to detect? It can't be future uses; those could happen for any new protocol or option - and they would ALL need magic numbers or codepoints.

IMO, the introductory text is inaccurate; the PRIMARY reason for including OCS and enabling it by default are non-compliant middleboxes. The original idea that this was really there to detect abuses or alternate uses is IMO silly; they would have to parse perfectly, which is probably at least as high a bar as the checksum, if not higher.

tompandadev commented 9 months ago

On Mon, Sep 18, 2023 at 4:58 PM Joe Touch @.***> wrote:

We've had these discussions before. This is a User option space - if users want to disable it, they should be allowed. Specific cases where it makes sense to disable OCS at the packet level:

when FRAG is used, each FRAG has its own OCS which covers the entire packet AND the options (in the last frag(s). when AUTH or UENC is/will be used, which are typically much stronger

Every protocol has abuses - squatting on option codepoints (TCP) or whole services that don't belong there (ports). We don't do anything systematic to prevent or detect that - and those are KNOWN uses. There are no known uses of the option area other than as defined here, so what phantom are you trying to detect? It can't be future uses; those could happen for any new protocol or option - and they would ALL need magic numbers or codepoints.

The UDP surplus area has existed since RFC768 for forty-three years, and I don't believe there has ever been any published normative requirements that would restrict the contents of the surplus area. If a user wants to place arbitrary data in the surplus area, there is no RFC that says they can't do that. So if someone is doing that in their proprietary network then that's their prerogative, and AFAICT, placing arbitrary data in the surplus area is protocol conformant with all existing RFCs and abuses nothing. If UDP Options are deployed on such a proprietary network and that somehow breaks communications, then IMO it is squarely the fault of the UDP Options Protocol (not the user, the configuration, or implementation).

The receive clause of the robustness principle states "Be liberal in what you receive". In the case of UDP Options, I would interpret this to mean that a receiver MUST be able to correctly process, without ill effects, a packet with a surplus area that contains something other than UDP Options. Given that there is no unambiguous code point, the best we can do is use something like a checksum or magic number that will at least establish robustness with some sufficiently high probability.

IMO, the introductory text is inaccurate; the PRIMARY reason for including OCS and enabling it by default are non-compliant middleboxes. The original idea that this was really there to detect abuses or alternate uses is IMO silly; they would have to parse perfectly, which is probably at least as high a bar as the checksum, if not higher.

The probability of a checksum or magic number matching for a false positive is quantifiable. For instance, the probability of misinterpretation with a checksum is 1/65,535. If you can quantify the probability of misinterpretation given that "they would have to parse perfectly" then that might be a valid argument to support that it's as robust or more robust than the checksum.

Tom


jtouch commented 9 months ago

I welcome NEW points to this discussion.

gorryfair commented 9 months ago

I had a hard time untangling the pseudocode around the checksum, and I still think we are making this harder than we need to by adding too much flexibility for only small benefits.

I'm going to re-suggest that we don't allow the OCS to be disabled, but we ignore the OCS when the CS is 0.

The code is then much simpler:

       if (UDP CS fails) then
           silently drop the entire UDP packet (per RFC1122)
       else
           if (OCS passes or UDP CS == 0) then
               deliver the UDP user data after parsing
               and processing the rest of the options,
               regardless of whether each is supported or succeeds
               (again, this is required to emulate legacy behavior)
           else
               deliver the UDP user data, but ignore other options
               (this is required to emulate legacy behavior)

Nobody has to check whether the OCS is needed for the options contained, although there is still the case for options that follow a reassembled packet - where the options area has already been checksummed.
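
The same decision can be sketched as a small runnable function. This is only an illustration of the pseudocode above, not text from the draft: the names `Action` and `receive_action` are hypothetical, and it assumes that a UDP checksum of zero means "not computed" and therefore cannot fail.

```python
from enum import Enum


class Action(Enum):
    DROP = "silently drop the entire UDP packet"
    DELIVER_WITH_OPTIONS = "deliver user data and process the options"
    DELIVER_IGNORE_OPTIONS = "deliver user data, ignore the options"


def receive_action(udp_cs: int, udp_cs_ok: bool, ocs_ok: bool) -> Action:
    """Mirror of the simplified receive logic proposed in this comment."""
    # Assumption: UDP CS == 0 means no checksum was sent, so it cannot fail.
    if udp_cs != 0 and not udp_cs_ok:
        return Action.DROP
    # OCS is ignored entirely when the UDP checksum is zero.
    if ocs_ok or udp_cs == 0:
        return Action.DELIVER_WITH_OPTIONS
    return Action.DELIVER_IGNORE_OPTIONS
```

Note that the receiver never needs a separate "is OCS required here?" test; the only inputs are the two checksum results and whether UDP CS is zero.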

Mike-Heard commented 9 months ago

Gorry wrote:

My recollection was that it was agreed that OCS would be required, and -23 has now listed ways in which this might not be required, so I am going to push-back by saying I think this new flexibility is undesirable in the spec.

My understanding is that we agreed that OCS would be required when UDP CS was present (i.e., non-zero), including in cases where UDP Length == 8. There was residual disagreement about the case UDP CS == 0.

I have a hard time seeing why OCS == 0 should be banned IF one is communicating (a) over a "clean" path (i.e., one known to be largely free of errors) and (b) with an endpoint known to support UDP options. In that situation, there is minimal hazard being corrupted in transit or of it being misinterpreted as containing UDP options when in fact it does not.

IT IS TRUE that this special situation does not hold in general. For that reason, the DEFAULT should be to require OCS != 0 except in reassembled datagrams. The API, however, should allow this default to be overridden on a per-socketpair basis. It is then the user's responsibility to enable this override only when the pre-conditions are satisfied.

A specific use case for UDP CS == 0 and OCS == 0 would be UDP tunnels in a traffic-managed controlled environment (TMCE) that employ UDP fragmentation.

jtouch commented 9 months ago

Agreed with Mike; this was never about OCS==0 when UDP CS<>0, except that I realized it might be useful for datagram reassembly even when UDP CS<>0....(see below further).

So focusing on the text, I thought we had agreed on the bold in Mike's post above: when UDP CS==0, OCS defaults to nonzero and MAY be overridden per-socketpair.

I do NOT support forcing OCS !=0 when UDP CS==0 under any circumstances.

I also now believe we should leave OCS for the entire packet even for FRAG, to allow FRAG to be handled independently. Otherwise, an incoming packet that has been reassembled could have OCS==0 but CS!=0 and wouldn't know that this is not allowed if the packet wasn't already assembled.

Mike-Heard commented 9 months ago

So focusing on the text, I thought we had agreed on the bold in Mike's post above: when UDP CS==0, OCS defaults to nonzero and MAY be overridden per-socketpair.

FWIW I had intended this to mean: the default is for the transmitter to prepare a correct/non-zero OCS and for the receiver to require the same. And since UDP is unidirectional, the override should be per-direction (i.e., separately for transmit and receive). Sorry for the lack of clarity.
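
A per-direction override of this kind could look roughly like the sketch below. It is an assumption-laden illustration for unfragmented packets only: `OcsPolicy`, `sender_must_compute_ocs` and `receiver_accepts_options` are hypothetical names, not an API defined by the draft or by any stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcsPolicy:
    """Hypothetical per-socketpair settings, one flag per direction."""
    tx_allow_zero_ocs: bool = False  # default: transmitter computes OCS
    rx_allow_zero_ocs: bool = False  # default: receiver requires OCS != 0


def sender_must_compute_ocs(udp_cs: int, policy: OcsPolicy) -> bool:
    # OCS is always required when the UDP checksum is non-zero.
    if udp_cs != 0:
        return True
    return not policy.tx_allow_zero_ocs


def receiver_accepts_options(udp_cs: int, ocs: int, ocs_ok: bool,
                             policy: OcsPolicy) -> bool:
    if ocs != 0:
        return ocs_ok  # a non-zero OCS is always verified
    if udp_cs != 0:
        return False   # OCS == 0 is not allowed when UDP CS != 0
    return policy.rx_allow_zero_ocs
```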

Mike-Heard commented 9 months ago

Joe wrote:

I also now believe we should leave OCS for the entire packet even for FRAG, to allow FRAG to be handled independently. Otherwise, an incoming packet that has been reassembled could have OCS==0 but CS!=0 and wouldn't know that this is not allowed if the packet wasn't already assembled.

Maybe there is something that I don't understand, but if this is saying that OCS for the entire packet always needs to be computed if the packet is fragmented, then I disagree.

First, note that the OCS for the entire packet can't be processed until the entire (original/reassembled) packet has been reassembled because: (a) the option area may not be confined to a single fragment and (b) it is not known where the user data/option boundary is until the terminal fragment is received.

Second, note that if UDP CS != 0 for every fragment, then it must also be the case that OCS != 0 on every fragment, or else one or more fragments will be discarded and reassembly will fail.

Third, it is possible for the reassembly stage to keep track of whether OCS != 0 on every fragment and pass this information on to the next stage where the options for the reassembled packet are processed.

And finally, when OCS !=0 for every fragment, the options for the entire packet are protected, and furthermore, there is no need for disambiguation since a successful reassembly proves that the remote end understands UDP options. So calculating a separate OCS for the option area of the entire packet serves no useful purpose in this case.

On that basis, I think that the following text proposed for Section 9.4 (FRAG) by Issue #20 is correct:

  Similarly, the OCS value of the original packet SHOULD be zero
  if each fragment will have a non-zero OCS value, as will be the
  case if each fragment’s UDP checksum is non-zero.

That being said, it does seem that the -23 draft (even with the text proposed by Issue #20) does not completely specify whether OCS != 0 for the entire packet is required when only some of the fragments have UDP CS == 0, nor does it fully specify default behavior. I suggest:

The DEFAULT is to require OCS != 0 on the options of the entire packet except when all fragments have OCS != 0. Note that the reassembly logic will need to pass that information to the logic that processes the options of the entire packet. This default may be overridden on a per-socketpair basis (same as for the non-fragmented case).

When the override is invoked, the receiver will allow OCS == 0 on the entire packet without further conditions. However, if OCS != 0, the receiver is obliged to check it and to discard the options if the check fails.

The default suggested above does allow for some cases where the options for the entire packet could be double-protected, e.g., OCS != 0 on the final fragment only and all the options of the entire packet are in that fragment. It does not seem (to me) to be worth extra effort to account for this.

The override behavior basically treats a reassembled packet with some fragments having UDP CS == 0 the same as an unfragmented packet with UDP CS == 0. The conditions under which it is reasonable to invoke the override -- communicating over a "clean" path and with an endpoint known to support UDP options -- clearly apply in this case, too.
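
The suggested default and override for reassembled packets can be sketched as below. This is an illustration of the rule as worded in this comment, not draft text; the function names are hypothetical, and the per-fragment OCS values are assumed to have been recorded by the reassembly stage.

```python
from typing import Iterable


def all_fragments_protected(fragment_ocs_values: Iterable[int]) -> bool:
    """Reassembly stage: remember whether every fragment carried OCS != 0."""
    return all(ocs != 0 for ocs in fragment_ocs_values)


def accept_reassembled_options(frags_protected: bool, packet_ocs: int,
                               packet_ocs_ok: bool, override: bool) -> bool:
    """Decide whether to process the options of the reassembled packet."""
    if packet_ocs != 0:
        return packet_ocs_ok  # a non-zero OCS must always be checked
    if frags_protected:
        return True           # options were already covered by each fragment's OCS
    return override           # otherwise OCS == 0 is allowed only by override
```

As noted in the comment, this allows some double protection (for example, when the packet-level OCS is set and the final fragment's OCS already covers the same options) without trying to detect it.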

tompandadev commented 9 months ago

On Tue, Sep 19, 2023 at 1:26 PM Mike-Heard @.***> wrote:

The override behavior basically treats a reassembled packet with some fragments having UDP CS == 0 the same as an unfragmented packet with UDP CS == 0. The conditions under which it is reasonable to invoke the override -- communicating over a "clean" path and with an endpoint know to support UDP options -- clearly apply in this case, too.

What does "clean" mean in this context?

Tom


Mike-Heard commented 9 months ago

What does a "clean" mean in this context? Tom

One known to be largely free of errors.

More fully, in an earlier comment I said:

I have a hard time seeing why OCS == 0 should be banned IF one is communicating (a) over a "clean" path (i.e., one known to be largely free of errors) and (b) with an endpoint known to support UDP options. In that situation, there is minimal hazard [of the surplus area] being corrupted in transit or of it being misinterpreted as containing UDP options when in fact it does not.

(italicized text inadvertently omitted in the original)

Mike

tompandadev commented 9 months ago

On Tue, Sep 19, 2023 at 2:08 PM Mike-Heard @.***> wrote:

What does a "clean" mean in this context? Tom

One known to be largely free of errors.

More fully, in an earlier comment I said:

I have a hard time seeing why OCS == 0 should be banned IF one is communicating (a) over a "clean" path (i.e., one known to be largely free of errors) and (b) with an endpoint know to support UDP options. In that situation, there is minimal hazard [of the surplus area] being corrupted in transit or if it being being misinterpreted as containing UDP options when in fact it does not.

Mike,

The problem is that misinterpretation can happen on a completely lossless, error free-network, and in fact can happen when all senders are protocol conformant as I described previously.

Also, even if OCS == 0 must be explicitly configured at senders and receivers, there is still significant risk. Unconnected UDP is common, so a single receiver socket may receive packets from thousands of sources -- for instance, OCS == 0 could be enabled for DNS port 53. This is different from the case of tunnels and the rationale of RFC 6936, since tunnels typically have well-defined endpoints that are known up front (not unconnected UDP).

A related issue is the discrepancy between IPv4 and IPv6 with regard to setting the UDP CS to zero. The UDP checksum is optional in IPv4 but required in IPv6 (with the exception of RFC6936, which is only applicable to tunnels and represents a tiny fraction of UDP traffic on the Internet). So if OCS is required for IPv6 but not for IPv4, what does that mean in terms of protocol evolution and transition? I believe it would be a factual statement to say that use of UDP Options in IPv6 is more robust than in IPv4 (good for promoting IPv6 adoption at least!). But what if someone is transitioning from IPv4 to IPv6? Are they going to be unhappy that they can no longer use the zero OCS because their performance dropped due to now having to calculate the OCS?

One solution is to allow OCS to be zero only with UDP tunnels, including over IPv4, as described in RFC6936. This limits the risk of misinterpretation to tunnel use cases, creates symmetry between IPv4 and IPv6, and allows OCS to be optional in the one use case where computing the checksum might be prohibitive -- namely routers that are endpoints for UDP tunnels.

Tom


Mike-Heard commented 9 months ago

Tom Herbert wrote:

Mike-Heard wrote:

More fully, in an earlier comment I said:

I have a hard time seeing why OCS == 0 should be banned IF one is communicating (a) over a "clean" path (i.e., one known to be largely free of errors) and (b) with an endpoint known to support UDP options. In that situation, there is minimal hazard [of the surplus area] being corrupted in transit or of it being misinterpreted as containing UDP options when in fact it does not.

Mike,

The problem is that misinterpretation can happen on a completely lossless, error-free network, and in fact can happen when all senders are protocol conformant as I described previously.

Tom, what part of "with an endpoint known to support UDP options" was not clear to you?

Also, while requiring the OCS == 0 to be explicitly configured at senders and receivers, there is still significant risk. Unconnected UDP is common, so a single receiver socket may receive packets for thousands of sources-- for instance, OCS==0 could be enabled for DNS port 53. This is different from the case of tunnels and the rationale of RFC 6936 since tunnels are typically well defined end points that are known up front (not unconnected UDP).

My earlier comment also said:

IT IS TRUE that this special situation does not hold in general, For that reason, the DEFAULT should be to require OCS != 0 except in reassembled datagrams. The API, however, should allow this default to be overridden on a per-socketpair basis. It is then the user's responsibility to enable this override only when the pre-conditions are satisfied.

I will thank you to please respond to what I actually wrote.

Here's a link to the whole comment: https://github.com/tsvwg/draft-ietf-tsvwg-udp-options/issues/12#issuecomment-1725471717

tompandadev commented 9 months ago

On Tue, Sep 19, 2023 at 3:13 PM Mike-Heard @.***> wrote:

Tom Herbert wrote:

Mike-Heard wrote:

More fully, in an earlier comment I said: I have a hard time seeing why OCS == 0 should be banned IF one is communicating (a) over a "clean" path (i.e., one known to be largely free of errors) and (b) with an endpoint known to support UDP options. In that situation, there is minimal hazard [of the surplus area] being corrupted in transit or of its being misinterpreted as containing UDP options when in fact it does not.

Mike, the problem is that misinterpretation can happen on a completely lossless, error-free network, and in fact can happen when all senders are protocol conformant, as I described previously.

Tom, what part of "with an endpoint known to support UDP options" was not clear to you?

Mike,

Relative to how I would map these requirements into a real implementation like Linux, this isn't clear. Note also that the term "socket pair" has a different meaning in networking stack implementations, so that also requires interpretation with respect to implementation.

I think what you might be suggesting is that OCS could be optional for UDP tunnels, and applications that used connected UDP sockets. For all other use cases, OCS must be set.

Is this what you're thinking?

Tom

Also, even if OCS == 0 has to be explicitly configured at senders and receivers, there is still significant risk. Unconnected UDP is common, so a single receiver socket may receive packets from thousands of sources -- for instance, OCS==0 could be enabled for DNS port 53. This is different from the case of tunnels and the rationale of RFC 6936, since tunnels typically have well-defined endpoints that are known up front (not unconnected UDP).

My earlier comment also said:

IT IS TRUE that this special situation does not hold in general. For that reason, the DEFAULT should be to require OCS != 0 except in reassembled datagrams. The API, however, should allow this default to be overridden on a per-socketpair basis. It is then the user's responsibility to enable this override only when the pre-conditions are satisfied.

I will thank you to please respond to what I actually wrote.

Mike,

I am trying to map your requirements into how we'd do it in implementation. In Linux, there is no such thing as socket-pair. There are connected and unconnected sockets. I think what you're suggesting is that we would only

Here's a link to the whole comment: #12 (comment) https://github.com/tsvwg/draft-ietf-tsvwg-udp-options/issues/12#issuecomment-1725471717

Mike-Heard commented 9 months ago

Mike, I am trying to map your requirements into how we'd do it in implementation. In Linux, there is no such thing as socket-pair. There are connected and unconnected sockets. [...] I think what you might be suggesting is that OCS could be optional for UDP tunnels, and applications that used connected UDP sockets.

If the prerequisites of a clean path and a remote endpoint known to support UDP options are satisfied, then those would be situations where it would be safe to disable generation of OCS on transmit and/or checking of OCS on receive, subject to the proviso that the UDP checksum generation and/or checking was also disabled in tandem.

For all other use cases, OCS must be set.

Not wishing to rule out other use cases (e.g., multicast), I would say "should" rather than "must."

jtouch commented 7 months ago

I've lost the thread of what actually needs to happen here. I continue to see disagreement, but was there consensus that can be summarized??

Mike-Heard commented 7 months ago

My understanding is that the current status of this issue is as follows:

There seems to be general agreement that OCS must be present (i.e., non-zero) when UDP CS!=0. The principal reason for imposing this requirement was to overcome issues with middleboxes that do not perform checksum calculations correctly in the presence of a non-empty surplus area. When OCS as specified in the draft is present, this problem is overcome. An additional benefit is that common protocol-agnostic checksum offload remains effective if OCS is present when UDP CS!=0.
There is not universal agreement that it is OK for OCS to be absent (indicated by the value zero) when the UDP checksum is absent (also indicated by the value zero). Some reviewers (notably Tom Herbert, Gorry Fairhurst, Carlos Pignataro, but possibly others as well) have insisted that OCS must always be present, even if UDP CS==0, primarily in order to ensure that the surplus area doesn't contain something other than UDP options. Others (Joe Touch, Mike Heard) advocate making it the user's choice whether or not OCS is present when UDP CS==0 (that means: sender's choice whether to include OCS!=0 when UDP CS==0, receiver's choice whether to accept OCS==0 when UDP CS==0). Mike Heard has suggested that the case of an RFC 6936-style UDP tunnel that uses UDP fragmentation is a valid use case for OCS==0 when UDP CS==0. That could apply more generally for communication with a known endpoint.
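
To make the receiver-side choice in the second point concrete, here is a minimal sketch. It is not text from the draft: the function name `receiver_accepts` and the flag `accept_zero_ocs` are hypothetical, and treating "OCS==0 while UDP CS!=0" as a reason not to parse the surplus area is an assumption made only for illustration.

```python
def receiver_accepts(udp_cs: int, ocs: int, accept_zero_ocs: bool = False) -> bool:
    """Decide whether the surplus area may be parsed as UDP options.

    A zero value in either field means "checksum absent". Verifying a
    non-zero OCS is a separate step that this sketch does not show.
    """
    if ocs != 0:
        return True           # OCS present: go on to verify it
    if udp_cs != 0:
        # Assumption for this sketch: OCS must be present when UDP CS != 0,
        # so a zero OCS here means the surplus area is not trusted.
        return False
    # Both checksums absent: this is the receiver's explicit choice.
    return accept_zero_ocs
```

With the default setting, a datagram with both checksums absent is not accepted; a receiver that has opted in (for example, for a known tunnel endpoint) would pass `accept_zero_ocs=True`.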

One possible way forward may be to include an appendix detailing the risks and benefits of OCS==0 when UDP CS==0 and providing requirements on what user choices an implementation must allow and spelling out what choices are and are not recommended. I will volunteer to try my hand at that to see if it garners consensus, but I can't get to it until December.

jtouch commented 7 months ago

OK, so in that case, I propose:

OCS MUST be non-zero when UDP CS !=0 -- but I think the text already has that
if UDP CS==0, OCS MAY be zero -- but we put in the caveat that the default is to make it non-zero and users wishing to override that do so at the risk of not having a check on non-option uses of the surplus area. One specific such use case is 6936-style tunnels.

I.e. the default does what some want and we caveat the distinction, but let it be the user's choice to take the risk.
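
The sender-side rule proposed above could be sketched as follows. This is only an illustration under stated assumptions: `OcsPolicy`, `allow_zero_ocs`, and `sender_must_set_ocs` are hypothetical names, not part of the draft or of any existing socket API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcsPolicy:
    # Hypothetical per-socket override. False mirrors the proposed default,
    # under which OCS is always made non-zero.
    allow_zero_ocs: bool = False


def sender_must_set_ocs(udp_cs: int, policy: OcsPolicy = OcsPolicy()) -> bool:
    """Return True when the sender has to emit a non-zero OCS."""
    if udp_cs != 0:
        return True  # OCS MUST be non-zero when UDP CS != 0
    # UDP CS == 0: OCS MAY be zero, but only if the user overrode the default.
    return not policy.allow_zero_ocs
```

Keeping the override in an explicit policy object leaves the default safe and makes the user's decision to take the risk visible at the call site, e.g. `sender_must_set_ocs(0, OcsPolicy(allow_zero_ocs=True))` for a 6936-style tunnel.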

If that's acceptable to move forward, one final question is whether there's an issue with the overall packet UDP CS / OCS vs. the fragment UDP CS / OCS. The UDP CS in frags doesn't do much (there's no data; it's just for the pseudo header). If the frags use OCS, then the reassembled packet should not benefit that much from a packet-level UDP CS, nor would it benefit much from the OCS of the option area after reassembly. So for fragments: Packet-level UDP CS==0 / OCS == 0, and frag-level OCS !=0. Frag level UDP CS seems like it could go either way. Thoughts??

gorryfair commented 7 months ago

"OK, so in that case, I propose:

OCS MUST be non-zero when UDP CS !=0 -- but I think the text already has that if UDP CS==0, OCS MAY be zero -- but we put in the caveat that the default is to make it non-zero and users wishing to override that do so at the risk of not having a check on non-option uses of the surplus area. One specific such use case is 6936-style tunnels. I.e. the default does what some want and we caveat the distinction, but let it be the user's choice to take the risk."

This much makes sense :-)

gorryfair commented 7 months ago

Part 2: "If that's acceptable to move forward, one final question is whether there's an issue with the overall packet UDP CS / OCS vs. the fragment UDP CS / OCS. The UDP CS in frags doesn't do much (there's no data; it's just for the pseudo header). If the frags use OCS, then the reassembled packet should not benefit that much from a packet-level UDP CS, nor would it benefit much from the OCS of the option area after reassembly. So for fragments: Packet-level UDP CS==0 / OCS == 0, and frag-level OCS !=0. Frag level UDP CS seems like it could go either way. Thoughts??"

I am not sure I understood the question.

Mike-Heard commented 7 months ago

"OK, so in that case, I propose:

OCS MUST be non-zero when UDP CS !=0 -- but I think the text already has that if UDP CS==0, OCS MAY be zero -- but we put in the caveat that the default is to make it non-zero and users wishing to override that do so at the risk of not having a check on non-option uses of the surplus area. One specific such use case is 6936-style tunnels. I.e. the default does what some want and we caveat the distinction, but let it be the user's choice to take the risk."

This much makes sense :-)

Modulo wordsmithing, I concur that this is a good way forward. However, I would like to see it made clear that the implementation should allow the user the choice whether to accept a packet with OCS==0.

Mike-Heard commented 7 months ago

Part 2: "If that's acceptable to move forward, one final question is whether there's an issue with the overall packet UDP CS / OCS vs. the fragment UDP CS / OCS. The UDP CS in frags doesn't do much (there's no data; it's just for the pseudo header). If the frags use OCS, then the reassembled packet should not benefit that much from a packet-level UDP CS, nor would it benefit much from the OCS of the option area after reassembly. So for fragments: Packet-level UDP CS==0 / OCS == 0, and frag-level OCS !=0. Frag level UDP CS seems like it could go either way. Thoughts??"

I am not sure I understood the question.

Indeed, the question is confused: there is no means to transmit the pre-fragmentation UDP checksum of a fragmented packet, and therefore no question of when to include it or not. It is never included. If you have non-zero OCS on each fragment -- as must be the case when UDP CS of the fragments is non-zero -- then you get equivalent protection. See the comment on issue #20 regarding this point.

tompandadev commented 7 months ago

On Wed, Nov 8, 2023 at 7:39 AM Mike-Heard @.***> wrote:

"OK, so in that case, I propose:

OCS MUST be non-zero when UDP CS !=0 -- but I think the text already has that if UDP CS==0, OCS MAY be zero -- but we put in the caveat that the default is to make it non-zero and users wishing to override that do so at the risk of not having a check on non-option uses of the surplus area. One specific such use case is 6936-style tunnels. I.e. the default does what some want and we caveat the distinction, but let it be the user's choice to take the risk."

This much makes sense :-)

Modulo wordsmithing, I concur that this is a good way forward. However, I would like to see it made clear that the implementation should allow the user the choice whether to accept a packet with OCS==0.

Mike,

The draft states:

"Like the UDP checksum, the OCS is optional under certain circumstances and contains zero when not used. UDP checksums can be zero for IPv4 [RFC791] and for IPv6 [RFC8200] when UDP payload already covered by another checksum, as might occur for tunnels [RFC6935]. The same exceptions apply to the OCS when used to detect bit errors;"

As I've mentioned before, RFC6935 and RFC6936 provide the guidance and requirements for setting the UDP checksum; there is nothing in those RFCs that discusses the UDP surplus area, which is not covered by the UDP checksum. A significant point of those RFCs is that if the UDP payload is already covered by another checksum then the risks of not using the UDP checksum are substantially reduced. But if the UDP payload is covered by its own checksum, that has no implications for the OCS: it in no way affects the risk of bit errors or the possibility of misinterpretation in the surplus area. So I believe that RFC6935 and RFC6936 are not applicable, and saying the same exceptions for setting the UDP checksum to zero apply to OCS is either misleading or incorrect. RFC6935 and RFC6936 address a different problem, and even if they were applicable it isn't enough to just refer to them; there would need to be a set of requirements for how they are applied to the specific protocol (like in section 6.2 of RFC8086 (GRE/UDP) and section 3.1 of RFC7510 (MPLS/UDP)).

If "the OCS is optional under certain circumstances" then I suggest those certain circumstances should be clearly articulated, the risks if the user doesn't use the OCS checksum should be highlighted, and any potential mitigations to the risks should be mentioned. The structure of that might be similar to RFC6935 and RFC6936, but it needs to be tailored to the specifics of a checksum over the surplus area (there is really no need to even reference RFC6935 or RFC6936 in the document).

I'm not opposed to letting the user make decisions; however, if we allow that then it's incumbent on us to enable them to make an informed decision. This is especially true in the case of a new protocol for which no one has any deployment experience.

Tom

Mike-Heard commented 7 months ago

Agreed, the risks need to be carefully articulated.

jtouch commented 7 months ago

I’m in a local conference today and tomorrow.

jtouch commented 7 months ago

I disagree that we need to explicitly note that it is the receiver's decision to drop packets with OCS==0. It's always the receiver's decision whether to accept a packet or not. HOWEVER, OCS==0 is not a clear indication of anything - it doesn't itself mean the packet is "potentially ill formatted" or even that it is likely a legacy undocumented use of the surplus area (if that were the case, then even the fact that the value is exactly zero in the appropriate place provides some minor level of check). I did put a little more text in this section to caveat things in both directions (pro and con). Please take a look at -25 and suggest specific places where needed discussion is missing (contributed text welcome, of course).

Mike-Heard commented 7 months ago

I read the new text in -28, and I see that OCS != 0 is RECOMMENDED, as agreed, and IMO the risks of OCS == 0 are adequately discussed. I believe that it is time to close this issue.

gorryfair commented 7 months ago

Closed, will be confirmed with all decisions in WGLC.