quicwg / qlog

The IETF I-D documents for the qlog format
Other
77 stars 12 forks source link

`ConnectionClose` storing a uint64 error_code and a uint64 error_code_value seems redundant #391

Closed evanrittenhouse closed 4 months ago

evanrittenhouse commented 4 months ago

As the title states, I don't quite see a reason for ConnectionClose to have both error_code and error_code_value store uint64s. It seems like it may be clearer if either:

  1. error_code is some non-uint64 value
  2. error_code is a uint64, but error_code_value drops out of the event.
LPardue commented 4 months ago

So the current definition is

ErrorSpace = "transport" /
             "application"

ConnectionCloseFrame = {
    frame_type: "connection_close"
    ? error_space: ErrorSpace
    ? error_code: TransportError /
                  $ApplicationError /
                  uint64
    ? error_code_value: uint64
    ? reason: text

    ; when error_space === "transport"
    ? trigger_frame_type: uint64 /
                          text
}

TransportError and ApplicationError are "strongly typed" strings. An endpoint that receives an unknown code should be able to log the raw number as a uint64. The big question is why do error_code and error_code_value both allow that? I think we should just pick one way.

rmarx commented 4 months ago

So, IIRC, the original intent here was to log all necessary details for the CRYPTO_ERROR error space (as also indicated in the prose preceeding the CDDL in the doc: "The error_code_value field is the numerical value without variable-length integer encoding. This is useful because some error types are spread out over a range of codes (e.g., QUIC's crypto_error).")

Why? Because in RFC9000, CRYPTO_ERROR is a range of errors:

CRYPTO_ERROR (0x0100-0x01ff):
The cryptographic handshake failed. A range of 256 values is reserved for carrying error codes specific to the cryptographic handshake that is used. Codes for errors occurring when TLS is used for the cryptographic handshake are described in [Section 4.8](https://www.rfc-editor.org/rfc/rfc9001#section-4.8) of [[QUIC-TLS](https://www.rfc-editor.org/rfc/rfc9000.html#QUIC-TLS)].

As such, originally in qlog, you'd for example have error_code: "crypto_error" and error_code_value: 0x0111 so you can have both the high-level error_code string AND the specific value of the 256 range.

HOWEVER I changed this down the line to have the separate CryptoError struct as part of TransportError that says this:

; all strings from "crypto_error_0x100" to "crypto_error_0x1ff"
CryptoError = text .regexp "crypto_error_0x1[0-9a-f][0-9a-f]"

And I probably just forgot to remove error_code_value. So, if we agree that the latter approach is a good one, we can indeed leave out error_code_value :)

However, I do remember this having some weirdness with proper CDDL (i.e., can't merge both an enum (TransportError) and a separate type like CryptoError, which is why it's indicated purely as prose in the current docs).

So I think the final solution depends on how strictly we want to adhere to CDDL :)

LPardue commented 4 months ago

can we do this (and update the prose accordingly)?

ConnectionCloseFrame = {
    frame_type: "connection_close"
    ? error_space: ErrorSpace
    ? error_code: TransportError /
                  CryptoError /
                  $ApplicationError /
                  uint64
    ? reason: text

    ; when error_space === "transport"
    ? trigger_frame_type: uint64 /
                          text
}
rmarx commented 4 months ago

Without running the CDDL validator, I'm guessing yes :) That would be the most practical solution I think!