cbor / cbor.github.io

cbor.io web site

spec: ambiguity with tags before `ff` breaks #65

Closed nigeltao closed 3 years ago

nigeltao commented 3 years ago

Putting 9f c0 ff into http://cbor.me/ diagnostic tool says "break stop code outside indefinite length item", but that error message doesn't sound right: the 9f is an indefinite length item. The problem is that the c0 tag should precede a data item and the ff break stop code isn't a data item.

That's assuming that it's actually an error to precede ff with tags. I don't think the spec explicitly says this, other than section 2.4's "a data item can optionally be preceded by a tag". It doesn't explicitly reject the converse: tags that don't precede a data item.

Tangentially, that quote ends with "a tag" not "tags" plural, and tags are also not a data item, so according to a literal interpretation of rejecting the converse, the c1 in c1 c2 03 is invalid. On the other hand, section 2.4 goes on to say "if tag A is followed by tag B, which is followed by data item C, tag A applies to the result of applying tag B on data item C", so it sounds like tags (plural) is valid. The test-vectors don't cover this but http://cbor.me/ accepts c1 c2 03. Is it that, when spec language lawyering, c2 03 combined should be considered a data item, which is preceded by c1?

cabo commented 3 years ago

Hi Nigel,

On 2020-07-24, at 04:38, Nigel Tao notifications@github.com wrote:

> Putting 9f c0 ff into http://cbor.me/ diagnostic tool says "break stop code outside indefinite length item", but that error message doesn't sound right: the 9f is an indefinite length item.

Yes, but you are in the tag after c0, and there are no indefinite length tags — a tag must have exactly one data item.

> The problem is that the c0 tag should precede a data item and the ff break stop code isn't a data item.

Yes. This is clearly disambiguated by Appendix C.
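To make the point concrete, here is a minimal Python sketch loosely modeled on the well-formedness pseudocode in Appendix C. The names (`well_formed`, `NotWellFormed`, the `BREAK` and `INDEFINITE` sentinels) are illustrative choices, not taken from the spec or from cbor.me, and the sketch only checks well-formedness, not validity. The key detail is that the content of a tag is read with `breakable` left at its default of `False`, so an `ff` directly after `c0` is rejected even though an indefinite-length array is open further out.

```python
class NotWellFormed(Exception):
    """Raised internally when the input is not well-formed CBOR."""


BREAK = -1       # sentinel: a break stop code was consumed
INDEFINITE = 99  # sentinel: an indefinite-length item was consumed


def well_formed(data: bytes) -> bool:
    """Return True if data is exactly one well-formed CBOR data item."""
    pos = 0

    def take(n):
        nonlocal pos
        if pos + n > len(data):
            raise NotWellFormed("unexpected end of input")
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    def item(breakable=False):
        ib = take(1)[0]
        mt, ai = ib >> 5, ib & 0x1F  # major type, additional information
        val = ai
        if ai in (24, 25, 26, 27):
            val = int.from_bytes(take(1 << (ai - 24)), "big")
        elif ai in (28, 29, 30):
            raise NotWellFormed("reserved additional information")
        elif ai == 31:
            return indefinite(mt, breakable)
        if mt in (2, 3):
            take(val)                 # byte or text string payload
        elif mt == 4:
            for _ in range(val):      # array elements
                item()
        elif mt == 5:
            for _ in range(2 * val):  # map keys and values
                item()
        elif mt == 6:
            item()  # tag content: exactly one data item, break not allowed
        elif mt == 7 and ai == 24 and val < 32:
            raise NotWellFormed("bad simple value")
        return mt

    def indefinite(mt, breakable):
        if mt in (2, 3):
            # Chunks must be definite-length strings of the same major type.
            while (it := item(True)) != BREAK:
                if it != mt:
                    raise NotWellFormed("chunk of wrong type")
        elif mt == 4:
            while item(True) != BREAK:
                pass
        elif mt == 5:
            while item(True) != BREAK:
                item()  # the value following each key
        elif mt == 7:
            if breakable:
                return BREAK
            raise NotWellFormed("break outside indefinite-length item")
        else:
            raise NotWellFormed("indefinite length not allowed for this type")
        return INDEFINITE

    try:
        item()
    except NotWellFormed:
        return False
    return pos == len(data)
```

With this sketch, `well_formed(bytes.fromhex("9fc0ff"))` is `False` while `well_formed(bytes.fromhex("9fc000ff"))` is `True`: the break is only acceptable where the innermost enclosing item is the indefinite-length array itself, not a tag waiting for its content.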

I would encourage you to have a look at the revised edition of RFC 7049 as well, which can be found at

https://www.ietf.org/id/draft-ietf-cbor-7049bis-14.html

This is currently in IETF last-call, so most of the work we anticipated has been done, but it isn’t an RFC yet.

> That's assuming that it's actually an error to precede ff with tags. I don't think the spec explicitly says this, other than section 2.4's "a data item can optionally be preceded by a tag". It doesn't explicitly reject the converse: tags that don't precede a data item.

(Citing 7049bis now): Right in Section 2, it says that a tag has a tag number and tag content (“a data item”).

> Tangentially, that quote ends with "a tag" not "tags" plural, and tags are also not a data item,

They certainly are. We did clarify the terminology around tags in 7049bis, so maybe you should have a look there.

> so according to a literal interpretation of rejecting the converse, the c1 in c1 c2 03 is invalid. On the other hand, section 2.4 goes on to say "if tag A is followed by tag B, which is followed by data item C, tag A applies to the result of applying tag B on data item C", so it sounds like tags (plural) is valid.

Yes.

> The test-vectors don't cover this but http://cbor.me/ accepts c1 c2 03.

[This is well-formed, but not valid: Tag 2 requires a byte string.]
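The distinction between well-formed and valid can be illustrated with a small Python sketch. It assumes the input is already well-formed and starts with tag 2 encoded as the single initial byte `c2`; the function name `tag2_content_is_byte_string` is hypothetical. It checks only the one rule mentioned here, namely that the content of tag 2 is a byte string (major type 2).

```python
def tag2_content_is_byte_string(encoded: bytes) -> bool:
    """Check the validity rule that tag 2 encloses a byte string.

    Assumes `encoded` is well-formed CBOR beginning with the initial
    byte 0xc2 (tag number 2); only the major type of the tag content
    is inspected.
    """
    if len(encoded) < 2 or encoded[0] != 0xC2:
        raise ValueError("expected an item starting with tag 2 (0xc2)")
    content_major_type = encoded[1] >> 5
    return content_major_type == 2  # major type 2 is a byte string


# c2 03: content is the unsigned integer 3, so well-formed but not valid.
print(tag2_content_is_byte_string(bytes.fromhex("c203")))
# c2 41 01: content is a one-byte byte string, which satisfies the rule.
print(tag2_content_is_byte_string(bytes.fromhex("c24101")))
```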

> Is it that, when spec language lawyering, c2 03 combined should be considered a data item, which is preceded by c1?

Yes. We now clarify this as saying the tag C2 03 is “enclosed” by the tag numbered 1 (indicated by the C1). cbor.me provides this diagnostic notation:

1(2(3))

(Where the 1 and the 2 are tag numbers, and 3 is an integer.)
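The nesting can be reproduced with a short Python sketch that renders tags and unsigned integers in diagnostic notation. This is an illustration only, not how cbor.me is implemented; the function name `diag` is hypothetical, and every major type other than 0 (unsigned integer) and 6 (tag) is deliberately left unsupported.

```python
def diag(data: bytes) -> str:
    """Render a CBOR item made only of tags and unsigned integers."""

    def head(pos):
        # Decode the initial byte and its argument; return (mt, value, pos).
        if pos >= len(data):
            raise ValueError("unexpected end of input")
        ib = data[pos]
        mt, ai = ib >> 5, ib & 0x1F
        pos += 1
        if ai < 24:
            return mt, ai, pos
        if ai <= 27:
            n = 1 << (ai - 24)
            if pos + n > len(data):
                raise ValueError("unexpected end of input")
            return mt, int.from_bytes(data[pos:pos + n], "big"), pos + n
        raise ValueError("unsupported additional information")

    def item(pos):
        mt, val, pos = head(pos)
        if mt == 0:                 # unsigned integer
            return str(val), pos
        if mt == 6:                 # tag: the tag number encloses one item
            inner, pos = item(pos)
            return f"{val}({inner})", pos
        raise ValueError("major type not supported in this sketch")

    text, pos = item(0)
    if pos != len(data):
        raise ValueError("trailing bytes after data item")
    return text


print(diag(bytes.fromhex("c1c203")))  # prints 1(2(3))
```

Because the tag branch recurses to read its content, `c2 03` is consumed as a single data item before the outer tag number 1 is wrapped around it, which matches the "enclosed" reading.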

Grüße, Carsten

cabo commented 3 years ago

RFC 8949 has been published. Closing this now.