ietf-rats-wg / eat

Entity Attestation Token IETF Draft Standard

Improve JSON nonce min and max size #421

Closed laurencelundblade closed 10 months ago

laurencelundblade commented 11 months ago

A private comment from Thomas (thx) caused me to think through the size of nonces for JSON-encoded tokens again. The new text is clearer and gives more flexibility.

The main point of restricting a size range of the nonce is to help developers. The max size makes it clear what size buffer they need to allocate. The min size makes sure they are at least trying to get enough entropy.

carl-wallace commented 11 months ago

Who determines if base64 encoding is used for a JSON nonce, the nonce generator/verifier or the attester? Would it be more clear if this were always base64 encoded for JSON?

laurencelundblade commented 11 months ago

This is just a random value, not an encoding of any data. There is no need to reverse the text encoding here, unlike other uses of base64. The receiver just uses the text data as the nonce directly.

This allows the sender to use the full range of bytes and bits in UTF-8 to make the random value as small as possible if they want.

Maybe we need more text here to say this? Particularly, that the JSON receiver MUST use the UTF-8 text as random bytes directly.
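Something like this sketch is what I have in mind for a JSON receiver (illustrative only, not proposed normative text; I'm assuming the `eat_nonce` claim name, and the constant-time compare is just one reasonable choice):

```go
// Sketch: a JSON receiver that uses the nonce text directly, with no
// base64 decode. The UTF-8 text itself is the nonce.
package main

import (
	"crypto/subtle"
	"encoding/json"
	"fmt"
)

type claims struct {
	Nonce string `json:"eat_nonce"`
}

func main() {
	issued := "q7TTtO2rJWG4" // whatever text the verifier sent out

	var c claims
	if err := json.Unmarshal([]byte(`{"eat_nonce":"q7TTtO2rJWG4"}`), &c); err != nil {
		panic(err)
	}

	// Compare the claim text byte-for-byte against the issued value.
	ok := subtle.ConstantTimeCompare([]byte(issued), []byte(c.Nonce)) == 1
	fmt.Println("nonce match:", ok)
}
```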

carl-wallace commented 11 months ago

I think that'd be a good addition.

setrofim commented 11 months ago

There is no need to reverse the text encoding here

The encoding ought to be reversed so that it matches the CBOR representation. Insisting that UTF-8 is used directly as a random value makes certain random values impossible to represent in JSON (e.g. those that start with 0x80 -- a continuation byte).
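To illustrate the point, a quick Go sketch:

```go
// A byte sequence beginning with 0x80 (a UTF-8 continuation byte) is
// not valid UTF-8, so it cannot appear as-is inside a JSON text string.
package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	nonce := []byte{0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07}
	fmt.Println(utf8.Valid(nonce)) // false: 0x80 may only follow a lead byte
}
```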

laurencelundblade commented 11 months ago

There is no need to reverse the text encoding here

The encoding ought to be reversed so that it matches the CBOR representation. Insisting that UTF-8 is used directly as a random value makes certain random values impossible to represent in JSON (e.g. those that start with 0x80 -- a continuation byte).

I've added an update that explains why the B64 encoding must not be removed/reversed.

Note also that there is no need to translate this claim to/from CBOR.

setrofim commented 11 months ago

There is no need to reverse the text encoding here

The encoding ought to be reversed so that it matches the CBOR representation. Insisting that UTF-8 is used directly as a random value makes certain random values impossible to represent in JSON (e.g. those that start with 0x80 -- a continuation byte).

I've added an update that explains why the B64 encoding must not be removed/reversed.

Note also that there is no need to translate this claim to/from CBOR.

JSON and CBOR are both just encodings of some conceptual structure. If the nonce is a sequence of arbitrary bytes and is represented as a UTF-8 string in JSON, then it cannot be written directly and must be encoded (with base64 probably the most common encoding, followed by hex); when parsing JSON, the nonce must then be decoded into its original form. If it is used as-is, the nonce value will differ from the original. Conversely, if the nonce is defined as a UTF-8 string rather than an arbitrary sequence of bytes, then the same restriction must be applied to the CBOR encoding (i.e. it should be encoded as a text string rather than a byte string).

A particular implementation may need to work with both CBOR and JSON encodings (e.g. receiving external input as CBOR, but using JSON internally for RPC, logging, persisting state, etc.). The values of EAT fields should be stable across CBOR <-> internal <-> JSON conversions.

For example, a CBOR-encoded EAT containing nonce value {0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07} is read by an application. The application then persists it to disk as JSON. That nonce value is not a valid UTF-8 byte sequence, so it cannot be written to JSON as-is and must be encoded. If base64 encoding is used, then it is written as "gAECAwQFBgc=". At a later point, the application restores the previously persisted state (or some other application reads the JSON). If the nonce is not decoded when it is read, then it is now {0x67, 0x41, 0x45, 0x43, 0x41, 0x77, 0x51, 0x46, 0x42, 0x67, 0x63, 0x3d}, which is not the original value.
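That scenario as a runnable Go sketch:

```go
// Round-trip from the example above: the original nonce is base64-encoded
// for JSON, and a reader that skips the decode ends up with the bytes of
// the base64 text rather than the original value.
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

func main() {
	original := []byte{0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07}

	// Written to JSON: must be encoded because it is not valid UTF-8.
	encoded := base64.StdEncoding.EncodeToString(original)
	fmt.Println(encoded) // gAECAwQFBgc=

	// Read back without decoding: the nonce is now the ASCII bytes of
	// the base64 text, which differ from the original.
	fmt.Println(bytes.Equal([]byte(encoded), original)) // false

	// Decoding restores the original value.
	decoded, _ := base64.StdEncoding.DecodeString(encoded)
	fmt.Println(bytes.Equal(decoded, original)) // true
}
```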

laurencelundblade commented 11 months ago

Hi Setrofim,

The EAT nonce for JSON is like the OpenID nonce (https://openid.net/specs/openid-connect-core-1_0.html#IDToken). The text string is used directly as the nonce.

A nonce is just some bytes that are compared for equality. It doesn’t matter if they are text or binary when comparing for equality.

It is true that we could change the JSON EAT nonce to be a byte string that is sent as a bstr in CBOR and a b64-encoded text string in JSON. This is exactly how ueid, oemid and other claims work in EAT. This also gives translatability between CBOR and JSON.
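As a sketch of that pattern in Go (the field name is illustrative; Go's encoding/json happens to base64-encode []byte fields, which matches this scheme):

```go
// Sketch of the translatable pattern: a claim defined as bytes,
// serialized as a bstr in CBOR and as a base64 text string in JSON.
// encoding/json applies base64 to []byte fields automatically.
package main

import (
	"encoding/json"
	"fmt"
)

type claims struct {
	Nonce []byte `json:"nonce"` // hypothetical field name for illustration
}

func main() {
	out, _ := json.Marshal(claims{Nonce: []byte{0x80, 0x01, 0x02, 0x03}})
	fmt.Println(string(out)) // {"nonce":"gAECAw=="}

	var in claims
	_ = json.Unmarshal(out, &in)
	fmt.Printf("% x\n", in.Nonce) // 80 01 02 03 -- original bytes restored
}
```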

The very big reason for not doing this now is that it would break backwards compatibility with all of EAT over the last four years of development. B64 encoding would be a MUST. Implementations of any EAT draft up to now would not interoperate.

The other reason for not doing this is because the OpenID nonce doesn’t do it.

While translatability between CBOR and JSON claims is useful and does work for most EAT claims, the nonce is one claim where it isn’t critical.

In the latest update to the PR, we’ve removed mention of b64 encoding because we think it is confusing and because OpenID doesn’t use it. The length still allows for b64 encoding as a convenient technique to turn random binary bytes into text bytes.
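For example, a sketch of that technique (the size and encoding alphabet are just illustrative choices):

```go
// Draw random bytes, then base64-encode them once at generation time.
// The resulting text itself is the nonce; the receiver never decodes it.
package main

import (
	"crypto/rand"
	"encoding/base64"
	"fmt"
)

func main() {
	raw := make([]byte, 12) // 96 bits of entropy
	if _, err := rand.Read(raw); err != nil {
		panic(err)
	}
	nonce := base64.RawURLEncoding.EncodeToString(raw) // 16 text characters
	fmt.Println(nonce)
}
```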

setrofim commented 11 months ago

A nonce is just some bytes that are compared for equality. It doesn’t matter if they are text or binary when comparing for equality.

Sure. I'm not arguing for nonce being a binary byte sequence. I'm arguing for "EAT nonce" definition being independent of a specific serialization.

While translatability between CBOR and JSON claims is useful and does work for most EAT claims, the nonce is one claim where it isn’t critical.

Translatability of individual claims is one thing. However, there is either a single "EAT" format representable as both JSON and CBOR, or there are two incompatible "JSON EAT" and "CBOR EAT" formats. I would argue the former would be more useful, but from your reply I understand the intent is for the latter?

In the latest update to the PR, we’ve removed mention of b64 encoding because we think it is confusing and because OpenID doesn’t use it. The length still allows for b64 encoding as a convenient technique to turn random binary bytes into text bytes.

Cool. As long as the text does not restrict how a nonce claim is to be interpreted, implementations that need to work with the same tokens as both CBOR and JSON can at least do so (even if the two serializations of the same token would not necessarily be identifiable as such outside of those implementations, since the encoding is not standard).

laurencelundblade commented 11 months ago

A nonce is just some bytes that are compared for equality. It doesn’t matter if they are text or binary when comparing for equality.

Sure. I'm not arguing for nonce being a binary byte sequence. I'm arguing for "EAT nonce" definition being independent of a specific serialization.

While translatability between CBOR and JSON claims is useful and does work for most EAT claims, the nonce is one claim where it isn’t critical.

Translatability of individual claims is one thing. However, there is either a single "EAT" format representable as both JSON and CBOR, or there are two incompatible "JSON EAT" and "CBOR EAT" formats. I would argue the former would be more useful, but from your reply I understand the intent is for the latter?

There's a single EAT data model that can be serialized as JSON or CBOR. That is true even if the nonce is slightly different and can't be translated directly.

Pretty much everything else can be translated between JSON and CBOR deterministically, so it is pretty much what you call a "single EAT format".

If this were two years ago, before all the last calls and IESG review, and this were not an incompatible protocol change, I'd be much more open to a change.

In the latest update to the PR, we’ve removed mention of b64 encoding because we think it is confusing and because OpenID doesn’t use it. The length still allows for b64 encoding as a convenient technique to turn random binary bytes into text bytes.

Cool. As long as the text does not restrict how a nonce claim is to be interpreted, implementations that need to work with the same tokens as both CBOR and JSON can at least do so (even if the two serializations of the same token would not necessarily be identifiable as such outside of those implementations, since the encoding is not standard).

It kind of does restrict it. If someone sends you a nonce that isn't base64 encoded, the decode will fail. Your implementation will be incorrect.

You could write an EAT profile that requires the nonce to be b64 encoded if you must. So you do have an out if you really need translatability, but the profile would not be generally interoperable with those not following the profile.
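A sketch of what such a profile's receiver would have to do (names are illustrative):

```go
// A verifier whose profile requires a base64 nonce must be prepared to
// reject nonces that don't decode.
package main

import (
	"encoding/base64"
	"fmt"
)

func nonceBytes(claim string) ([]byte, error) {
	b, err := base64.StdEncoding.DecodeString(claim)
	if err != nil {
		return nil, fmt.Errorf("nonce is not base64: %w", err)
	}
	return b, nil
}

func main() {
	if _, err := nonceBytes("not~base64!"); err != nil {
		fmt.Println("reject token:", err)
	}
}
```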

Can you describe a detailed, concrete and compelling situation where the nonce needs to be translated?

Note that you can't translate a JWT to a CWT without re-signing. It's not something you just happen to do. Similarly, the nonce is not like other claims. It's in a section by itself for this reason. It generally doesn't flow through.

We've been working on EAT for 5 years. Translatability between JSON and CBOR of any sort has never come up. This tells me it is not so important to make an incompatible protocol change this late in the work.

setrofim commented 11 months ago

There's a single EAT data model that can be serialized as JSON or CBOR.

If that is the intent....

[...] the nonce is slightly different and can't be translated directly.

...then this is a bug.

If there is a single data model, what is the definition of the nonce within that data model? If it is arbitrary bytes, then the JSON serialization is bugged; if it is UTF-8, then CBOR serialization is bugged. If it's bytes for CBOR and UTF-8 for JSON, then there are two models.

Pretty much everything else can be translated between JSON and CBOR deterministically, so it is pretty much what you call a "single EAT format".

If I can almost write the same token in both formats, save for one claim, then I can't do it.

Can you describe a detailed, concrete and compelling situation where the nonce needs to be translated?

EAT Attestation Result and PSA attestation token profile are both EAT-based profiles that utilize the nonce claim. The former specifies both JSON and CBOR representations, the latter is CBOR only.

The Veraison verifier establishes a nonce with the client, which is then used to generate the PSA token, and eventually gets written inside the EAR as well. The verifier needs to be able to compare the established nonce with the one it extracts from the attestation token. Further, the relying party later should be able to check that the nonce in the EAR (JSON) matches the nonce in the PSA token it got from the attester (CBOR).
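A sketch of that comparison (hypothetical values, not the actual Veraison code, and assuming the JSON side carries a base64 nonce per a profile rule):

```go
// The nonce extracted from the CBOR PSA token must match the nonce
// carried in the JSON EAR, which only works if the JSON side is
// decoded back to the same bytes.
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

func main() {
	// Nonce as pulled from the PSA token's CBOR bstr (hypothetical value).
	fromPSA := []byte{0x80, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07}

	// Nonce as carried in the JSON EAR, base64-encoded by profile rule.
	fromEAR := "gAECAwQFBgc="

	decoded, err := base64.StdEncoding.DecodeString(fromEAR)
	if err != nil {
		panic(err)
	}
	fmt.Println("nonces match:", bytes.Equal(fromPSA, decoded))
}
```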

It kind of does restrict it. If someone sends you a nonce that isn't base64 encoded, the decode will fail. Your implementation will be incorrect.

Well, the implementation would need to anticipate that case; but nothing stops it from rejecting tokens that don't contain b64 nonces. As long as the spec does not mandate specific handling thereof (which, as of the latest update, it no longer does), the implementation is not incorrect with respect to the spec. This is no different than rejecting tokens whose nonce didn't match the expected value.

If this were two years ago, before all the last calls and IESG review, and this were not an incompatible protocol change, I'd be much more open to a change.

We've been working on EAT for 5 years. Translatability between JSON and CBOR of any sort has never come up. This tells me it is not so important to make an incompatible protocol change this late in the work.

Sure, I get that I'm pretty late to the party here. And now that the language saying the "UTF-8 string MUST be used directly as nonce" has been removed, I think things are fine as they are. Though, it might be worth at least adding explicit language somewhere that the JSON EAT and CBOR EAT representations are not intended to be compatible.

You could write an EAT profile that requires the nonce to be b64 encoded if you must. So you do have an out if you really need translatability, but the profile would not be generally interoperable with those not following the profile.

Yup, I guess this will have to be the workaround. Specify b64 as part of EAR instead.

laurencelundblade commented 10 months ago

There's a single EAT data model that can be serialized as JSON or CBOR.

If that is the intent....

[...] the nonce is slightly different and can't be translated directly.

...then this is a bug.

If there is a single data model, what is the definition of the nonce within that data model? If it is arbitrary bytes, then the JSON serialization is bugged; if it is UTF-8, then CBOR serialization is bugged. If it's bytes for CBOR and UTF-8 for JSON, then there are two models.

Pretty much everything else can be translated between JSON and CBOR deterministically, so it is pretty much what you call a "single EAT format".

If I can almost write the same token in both formats, save for one claim, then I can't do it.

Carsten uses the term data model. I'm not sure if the official definition of it includes translatability between encodings or not. I kind of assumed not, but maybe it should.

EAT is one of the first and most thorough attempts at specifying CBOR and JSON simultaneously. We kind of got Carsten to help out too. A lot of people said we shouldn't even try. Some are still suggesting taking JSON out now.

Probably this is a lesson for the CBOR-JSON world: we should make translatability a goal of all CBOR-JSON work. It would be interesting to hear what the CBOR mailing list would say to this.

There are other issues with CBOR-JSON that we didn't have to deal with here, like the simple types. We had to work around the CBOR tag issues.

Can you describe a detailed, concrete and compelling situation where the nonce needs to be translated?

EAT Attestation Result and PSA attestation token profile are both EAT-based profiles that utilize the nonce claim. The former specifies both JSON and CBOR representations, the latter is CBOR only.

The Veraison verifier establishes a nonce with the client, which is then used to generate the PSA token, and eventually gets written inside the EAR as well. The verifier needs to be able to compare the established nonce with the one it extracts from the attestation token. Further, the relying party later should be able to check that the nonce in the EAR (JSON) matches the nonce in the PSA token it got from the attester (CBOR).

I didn't follow the flow fully, but it doesn't sound like you have a use case where the same nonce must be represented in both CBOR and JSON.

It kind of does restrict it. If someone sends you a nonce that isn't base64 encoded, the decode will fail. Your implementation will be incorrect.

Well, the implementation would need to anticipate that case; but nothing stops it from rejecting tokens that don't contain b64 nonces. As long as the spec does not mandate specific handling thereof (which, as of the latest update, it no longer does), the implementation is not incorrect with respect to the spec. This is no different than rejecting tokens whose nonce didn't match the expected value.

If this were two years ago, before all the last calls and IESG review, and this were not an incompatible protocol change, I'd be much more open to a change.

We've been working on EAT for 5 years. Translatability between JSON and CBOR of any sort has never come up. This tells me it is not so important to make an incompatible protocol change this late in the work.

Sure, I get that I'm pretty late to the party here. And now that the language saying the "UTF-8 string MUST be used directly as nonce" has been removed, I think things are fine as they are. Though, it might be worth at least adding explicit language somewhere that the JSON EAT and CBOR EAT representations are not intended to be compatible.

You could write an EAT profile that requires the nonce to be b64 encoded if you must. So you do have an out if you really need translatability, but the profile would not be generally interoperable with those not following the profile.

Yup, I guess this will have to be the workaround. Specify b64 as part of EAR instead.

OK. Plan to merge this as is soon.

Will also comment on EAR.