quicwg / multipath

In-progress version of draft-ietf-quic-multipath
Other
49 stars 18 forks source link

Review error codes #253

Closed mirjak closed 6 days ago

mirjak commented 1 year ago

Currently, MP_PROTOCOL_VIOLATION is used is for a too high CID seq num in the frame as well as the handshake error case if multipath is negotiated but no CID is present.

However, we could also decided to define more specific error codes for one or both cases.

kazuho commented 1 year ago

handshake error case if multipath is negotiated but no CID is present

I think we use TRANSPORT_PARAMATER error? See https://quicwg.org/multipath/draft-ietf-quic-multipath.html#section-3-6. FWIW, my understanding is that use of such TRANSPORT_PARAMETER error is in line with us using TRANSPORT_PARAMETER error to indicate that zero CID was used in conjunction with the preferred_address Transport Parameter.

Regarding if we need more errors, I do not think there is a need at the moment. All the use of MP_PROTOCOL_VIOLATION is related to semantic errors in the newly added frames, we have Frame Type and Reason Phrase in the CONNECTION_CLOSE frame to communicate the problem in detail.

Honestly, I would not mind getting rid of MP_PROTOCOL_VIOLATION altogether considering that i) we have the Frame Type field to communicate which frame had a problem, and ii) for errors in the Connection ID sequence numbers being communicated inside RETIRE_CONNECTION_ID frame we use PROTOCOL_VIOLATION.

mirjak commented 1 year ago

We changed the handshake error to MP_PROTOCOL_VIOLATION in the editors copy yesterday based on the discussion at 116 as recorded in issue #157.

However, I think we need a more high-level discussion about all error codes, there I opened this issue for more discussion at 117.

kazuho commented 1 year ago

@mirjak Oh I missed that, thank you.

Regarding if we should have more error codes, my two cents goes to not, as we can use the Frame Type field of the CONNECTION_CLOSE frame to figure out the cause.

LPardue commented 1 year ago

Discussed in IETF 117, single MP-specific error code seems sufficient

huitema commented 1 year ago

@LPardue Actually, there was s slightly different conclusion. The current error happens during the handshake, possibly before the MP option was negotiated. Implementation that are not multipath capable will just send "PROTOCOL_VIOLATION" and document the frame type. That might be good enough for us too, and we may be able to get rid of the multipath specific error code.

mirjak commented 1 year ago

There are currently 4 case where we send an error code:

  1. multipath is negotiated but there is no CID(s) -> MP_PROTOCOL_VIOLATION
  2. Cipher suites with too short nonce -> TRANSPORT_PARAMETER
  3. multipath frame within wrong packet type -> FRAME_ENCODING_ERROR
  4. seq number of CID (used in a frame) was is not valid -> MP_PROTOCOL_VIOLATION

@Lucas which error are you talking about?

mirjak commented 10 months ago

@LPardue @huitema not sure what we need to do here. Any more input?

huitema commented 10 months ago

I don't think we need to change the spec now.

mirjak commented 10 months ago

Can we close this issue? Are we happy with the current use of error codes?

marten-seemann commented 10 months ago

What's the justification for having a dedicated MP_PROTOCOL_VIOLATION error? All other QUIC extensions so far have been using the standard PROTOCOL_VIOLATION error, instead of defining their own bespoke version.

LPardue commented 10 months ago

My take on the design philosophy of error codes in QUIC was a) an enpoint can always pick the most generic error code b) picking a more-specific error code is a kindness/affordance to the peer. An error code might help the receiver determine more quickly where a problem lies - but quite often these bugs are rare and easy to determine a root cause as long as awareness of any issue is raised (e.g. I got a CONNECTION_CLOSE when hitting a server, and here is the repro URL)

This stuff is a bit subjective.

I see 2 cases in the draft where MP_PROTOCOL_VIOLATION is used

1) If an enable_multipath transport parameter is received and the carrying packet contains a zero length connection ID, the receiver MUST treat this as a connection error of type MP_PROTOCOL_VIOLATION and close the connection.

This is super easy to spot - its a critical error exceedingly early in a connection and its clear why the sender triggered it

2) All multipath-specific frames relate to a Destination Connection ID sequence number. If an endpoint receives a Destination Connection ID sequence number greater than any previously sent to the peer, it MUST treat this as a connection error of type MP_PROTOCOL_VIOLATION.

This one doesn't seem as easy as the first. However, the endpoint that receives the bogus frame can just indicate the frame type in the CONNECTION_CLOSE. It might also say something in the reason phrase.

On balance, it feels to me that dropping MP_PROTOCOL_VIOLATION would not make anyone's lives too hard, while saving the overhead of defining and implementing the error code.

mirjak commented 3 months ago

Having reviewed the document now again, I also think we should drop MP_PROTOCOL_VIOLATION and use PROTOCOL_VIOLATION instead.

If we really want a more specific error code for the zero-cid case, I think we should define an error code for that specifically. Additionally, we could also define a path_id_limit error (now with the explicit path id) if people think that would be useful...?

mirjak commented 1 week ago

As discussed at IETF-120, PR #428 removes the MP_PROTOCOL_VIOLATION error code.