cbor-wg / CBORbis

The draft leading up to RFC8949
https://www.rfc-editor.org/rfc/rfc8949.html

Section 5.7 - mention it's OK to encode untagged CBOR Null or untagged CBOR Undef when protocol requires tag number but content cannot be tagged #170

Closed x448 closed 4 years ago

x448 commented 4 years ago

This would help in scenarios where app/protocol specifies tag numbers 0 and 1 MUST be encoded but then provides an uninitialized/unspecified time value.

Unspecified/uninitialized time value creates an encoding problem because there's no standard value to represent uninitialized time that is valid for tag numbers 0 and 1.

7049bis section 5.7 Undefined Values suggests using CBOR Undefined when encoding problems occur. However, it still fails to satisfy the protocol's requirement for always encoding tag numbers 0 and 1, which could lead to compatibility problems.

Thanks for considering this.

mcr commented 4 years ago

What is your use case? We wonder if it is about a certificate-like "expiresOn", which you might want to be infinite, which in ASN.1 space we do with "99991231", which is stupid. Couldn't you just leave out the value completely? Or use an untagged CBOR Null?

cabo commented 4 years ago

Today's solutions include:

Changing tags 0 and 1 to allow data other than numbers is perceived as a relatively large change.

cabo commented 4 years ago

(If you wonder about the flurry of responses, we just discussed this issue in the CBOR interim. Next meeting in two weeks; please join the CBOR mailing list or look at the archive listed there to get the details.)

laurencelundblade commented 4 years ago

The flurry continues...

A simple example has three optional items that are tagged types: a date, a URL and a CWT. Each item can be either present, absent or unspecified, where unspecified is indicated by NULL. This protocol can’t be implemented today because the content of a date, URL or CWT tag can’t be NULL.

One solution is to use a map instead of an array which makes the protocol 3 bytes bigger with use of integer tags. Another is to use undef to indicate absence, also three bytes bigger, but only for the case where items are absent. Also, strings can be zero length, which may be a useful way to indicate absence.

If this were to be adopted, it really should be for all tags, not just 0 and 1. It is a somewhat incompatible change from 7049, which has been avoided so far in CBORbis.

Perhaps not perfect, but it seems there are inexpensive ways to solve the problem.
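
To make the size trade-off concrete, here is a minimal Python sketch, not taken from any CBOR library. The helper names `encode_as_array` and `encode_as_map` are hypothetical, the map keys are assumed to be the small integers 0..n-1, and each item is assumed to be already encoded as CBOR bytes. An absent item is passed as `None`; an unspecified item is passed as an untagged null (`0xf6`).

```python
from typing import Optional, Sequence

UNDEFINED = b"\xf7"  # CBOR undefined, used here to fill an absent array slot
NULL = b"\xf6"       # CBOR null, used here for "unspecified"


def encode_as_array(items: Sequence[Optional[bytes]]) -> bytes:
    """Fixed-position array; an absent item (None) is encoded as undefined."""
    if len(items) >= 24:
        raise ValueError("sketch only handles fewer than 24 items")
    body = b"".join(UNDEFINED if item is None else item for item in items)
    return bytes([0x80 | len(items)]) + body  # major type 4, short count


def encode_as_map(items: Sequence[Optional[bytes]]) -> bytes:
    """Map keyed by position (assumed keys 0..n-1); absent items are omitted."""
    if len(items) >= 24:
        raise ValueError("sketch only handles fewer than 24 items")
    present = [(key, item) for key, item in enumerate(items) if item is not None]
    body = b"".join(bytes([key]) + item for key, item in present)
    return bytes([0xA0 | len(present)]) + body  # major type 5, short count


# Tag 1 epoch date and tag 32 URI, byte-for-byte from RFC 8949 Appendix A.
date = bytes.fromhex("c11a514b67b0")
url = bytes.fromhex("d82076687474703a2f2f7777772e6578616d706c652e636f6d")

all_three = [date, url, NULL]  # third item unspecified, so untagged null
extra = len(encode_as_map(all_three)) - len(encode_as_array(all_three))
print(extra)  # one key byte per item that is present
```

With every slot filled, the map costs one extra key byte per item. With absent items, the array pays one byte of undefined per empty slot, while the map pays nothing for them.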

x448 commented 4 years ago

UPDATED: to say "tag number" instead of tags, etc. and tried to be more specific.

@mcr the use-case is for a generic encoder that was told it MUST encode time tags but is then given undefined values which cannot be tagged. This is considered an encoding error, is it not?

7049bis section 5.7 Undefined Values suggests using CBOR Undefined when encoding problems occur. But CBOR Undefined cannot be tagged as time content, so encoding it without the tag number is an encoding error too (if the protocol/app explicitly required time tags.)

Around the same time I opened this issue, I took @laurencelundblade's advice/warning about interop with JSON and recommended to @fxamacker that her generic CBOR encoder should just output untagged CBOR Null in this scenario -- she released v2.x at least twice since that week and it would be a breaking change (for default handling of this scenario) to do it differently.

I left the issue open so a discussion like this could lead to guidance in 7049bis for such encoding errors in general (it doesn't have to be about time tags specifically but it's an easy example).

One possibility is for 7049bis to say (possibly in section 5.7) something like "if a protocol specifies tag numbers MUST be encoded but the provided content cannot be tagged due to type mismatch or other reason, then a generic encoder might output untagged CBOR Null or untagged CBOR Undefined or ___". I trust @cabo will have much better wording for this if this isn't a stupid idea I'm proposing.

In other words, just a sentence or two in 7049bis saying it's not forbidden (and perhaps even reasonable) for generic encoders to output untagged CBOR Null in this scenario would be very much appreciated.
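
The behavior being asked about can be sketched in a few lines of Python. This is an illustration only, not the code of any existing encoder: `encode_head` and `encode_epoch_time` are hypothetical names, and the choice of null rather than undefined for an unset time is an assumption a real encoder might make configurable.

```python
import struct
from typing import Optional

CBOR_NULL = b"\xf6"  # simple value 22, emitted untagged for an unset time


def encode_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus argument, using the shortest form."""
    if value < 0:
        raise ValueError("argument must be non-negative")
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 1 << 8:
        return bytes([(major << 5) | 24, value])
    if value < 1 << 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 1 << 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    if value < 1 << 64:
        return bytes([(major << 5) | 27]) + struct.pack(">Q", value)
    raise ValueError("argument does not fit in 64 bits")


def encode_epoch_time(seconds: Optional[int]) -> bytes:
    """Tag 1 around an integer, or untagged null when no time was given."""
    if seconds is None:
        return CBOR_NULL  # null is not valid tag 1 content, so no tag here
    if seconds >= 0:
        content = encode_head(0, seconds)       # major type 0: unsigned
    else:
        content = encode_head(1, -1 - seconds)  # major type 1: negative
    return encode_head(6, 1) + content          # major type 6: tag number 1


print(encode_epoch_time(1363896240).hex())  # c11a514b67b0
print(encode_epoch_time(None).hex())        # f6
```

The first output matches the `1(1363896240)` example in RFC 8949 Appendix A; the second is the untagged null discussed here.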

jimsch commented 4 years ago

On the first note - if you tell an encoder to encode data which is not valid, then I would say that this is an error in the data to be encoded and not an encoding error. The difference being, if I encode a field as FIELD = date / null, then if a "missing" value is present it would be encoded as a null. However, if you passed in a MIME message as the date, then the encoder should fail, as the data type in the model is incorrect.
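
The distinction can be sketched as follows, assuming the data model FIELD = date / null with the date carried as a tag 0 text string. The function name `encode_date_or_null` is hypothetical, and requiring a timezone-aware `datetime` is a choice made for this sketch.

```python
from datetime import datetime, timezone
from typing import Optional


def encode_date_or_null(value: Optional[datetime]) -> bytes:
    """Encode FIELD = date / null: tag 0 text string, or untagged null."""
    if value is None:
        return b"\xf6"  # "missing" is part of the data model: plain null
    if not isinstance(value, datetime):
        # Anything else violates the data model, so fail instead of guessing.
        raise TypeError(f"FIELD must be a datetime or None, got {type(value).__name__}")
    if value.tzinfo is None:
        raise ValueError("sketch requires a timezone-aware datetime")
    text = value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ").encode("ascii")
    if len(text) >= 24:
        raise ValueError("sketch only handles short text strings")
    # 0xc0 is tag number 0; 0x60 | n is a text string of n < 24 bytes.
    return b"\xc0" + bytes([0x60 | len(text)]) + text


when = datetime(2013, 3, 21, 20, 4, 0, tzinfo=timezone.utc)
print(encode_date_or_null(when).hex())  # c074323031332d30332d32315432303a30343a30305a
print(encode_date_or_null(None).hex())  # f6
```

Passing a string such as a MIME message raises `TypeError`, which reports a data error rather than producing some fallback encoding.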

x448 commented 4 years ago

@jimsch I see. Thanks for that. Also, thank you for your work on COSE.

x448 commented 4 years ago

Thanks for adding the text to resolve this in PR #174.