cbor-wg / edn-literal

Application-oriented literals for CBOR extended diagnostic notation

Add text about later stages of ingestion, failing on 999 #55

Closed cabo closed 2 months ago

cabo commented 2 months ago

(Linda Dunbar, Opsdir review)

rohanmahy commented 2 months ago

The paragraph before the added text says:

"The content of this tag is an array of two text strings: The application-extension identifier, and the (escape-processed) content of the single-quoted string."

1) It would be worth mentioning here that the identifier is case sensitive.

2) As both the single-quoted string and double-quoted strings need to be escaped for mostly the same characters, I would instead say:

"The content of this tag is an array of two text strings. The first is the case-sensitive application-extension identifier. The second is the contents of the single-quoted string, with any escaped single quote (ASCII 0x5c 0x27) replaced with a single quote (ASCII 0x27), and any unescaped double-quote (ASCII 0x22) replaced with backslash double-quote (ASCII 0x5c 0x22)."

cabo commented 2 months ago

There already is text about the case of application-extension identifiers; we don't need to repeat this here. I also don't think we benefit from restating the handling of strings in an abbreviated, ultimately incorrect way; simply saying that the escape processing applies here as well should suffice.

rohanmahy commented 2 months ago

> There already is text about the case of application-extension identifiers; we don't need to repeat this here. I also don't think we benefit from restating the handling of strings in an abbreviated, ultimately incorrect way; simply saying that the escape processing applies here as well should suffice.

Saying that the contents is "the (escape-processed) content of the single-quoted string." is also wrong and likely to be misinterpreted. It is the (double-quote) escape-processed content of the (single-quote) unescaped single-quoted string.

cabo commented 2 months ago

> Saying that the contents is "the (escape-processed) content of the single-quoted string." is also wrong and likely to be misinterpreted. It is the (double-quote) escape-processed content of the (single-quote) unescaped single-quoted string.

Can you elaborate? I don't follow. How do double quotes enter the picture?

rohanmahy commented 2 months ago

> > Saying that the contents is "the (escape-processed) content of the single-quoted string." is also wrong and likely to be misinterpreted. It is the (double-quote) escape-processed content of the (single-quote) unescaped single-quoted string.
>
> Can you elaborate? I don't follow. How do double quotes enter the picture?

If I understand the intent, foo'don\'t "worry"\n \\ be happy' needs to become 999("foo", "don't \"worry\"\n \\ be happy"). The string inside the single quotes needs to be unescaped, then re-escaped as the contents of a double-quoted string. The original text could be read as double escaping to 999("foo", "don\'t \"worry\"\n \\ be happy"), which is not wanted.
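
The two steps described above (unescape the single-quoted content, then re-escape it for a double-quoted string) can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the escape table is a simplified subset (no `\u` handling), and the output writes the array brackets explicitly, following the quoted "array of two text strings", where the shorthand above omits them.

```python
# Illustrative sketch only; names and the escape subset are assumptions,
# not taken from the draft.
_UNESCAPE = {"'": "'", '"': '"', "\\": "\\", "/": "/",
             "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t"}
_ESCAPE = {'"': '\\"', "\\": "\\\\", "\b": "\\b", "\f": "\\f",
           "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def unescape_single_quoted(body: str) -> str:
    """Resolve backslash escapes in the body of a single-quoted literal."""
    out = []
    chars = iter(body)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, None)
            if nxt not in _UNESCAPE:
                raise ValueError(f"unsupported escape: {nxt!r}")
            out.append(_UNESCAPE[nxt])
        else:
            out.append(ch)
    return "".join(out)


def escape_double_quoted(text: str) -> str:
    """Escape a text string for use inside a double-quoted EDN string."""
    return "".join(_ESCAPE.get(ch, ch) for ch in text)


def to_tag_999_edn(prefix: str, body: str) -> str:
    """Render prefix'body' as tag 999 EDN, unescaping once and re-escaping once."""
    content = unescape_single_quoted(body)
    return f'999(["{prefix}", "{escape_double_quoted(content)}"])'


# The example from the comment above: foo'don\'t "worry"\n \\ be happy'
example_body = r"""don\'t "worry"\n \\ be happy"""
print(to_tag_999_edn("foo", example_body))
```

Note that the `\'` in the input does not survive into the output, while the bare `"` gains a backslash; that asymmetry is the point being made in the comment.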

cabo commented 2 months ago

The text talks about the data model level. Of course, when you represent these as EDN, you'll need to use the right representation for the text strings found. If you represent them as CBOR, you don't need to do any escaping.
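
To illustrate the data-model point, here is a minimal hand-rolled sketch of encoding the same tag 999 item as CBOR bytes. It is an assumption-laden illustration rather than a reference encoder: the function names are hypothetical and only lengths below 65536 are handled. The text strings go into the encoding as raw UTF-8, with no escaping step at all.

```python
# Illustrative sketch only; a real application would use a CBOR library.
def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus argument for small values."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 0x100:
        return bytes([(major << 5) | 24, value])
    if value < 0x10000:
        return bytes([(major << 5) | 25]) + value.to_bytes(2, "big")
    raise ValueError("value too large for this sketch")


def cbor_text(s: str) -> bytes:
    """Encode a text string (major type 3) as length head plus UTF-8 bytes."""
    data = s.encode("utf-8")
    return cbor_head(3, len(data)) + data


def cbor_tag_999(prefix: str, content: str) -> bytes:
    """Encode tag 999 (major type 6) around an array (major type 4) of two text strings."""
    return cbor_head(6, 999) + cbor_head(4, 2) + cbor_text(prefix) + cbor_text(content)


# Data-model value of the second string: already unescaped, contains a real newline.
content = 'don\'t "worry"\n \\ be happy'
encoded = cbor_tag_999("foo", content)
print(encoded.hex())
```

The quote, newline, and backslash characters appear in the byte string exactly as they are in the data model; escaping only comes into play when the item is written back out as EDN.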