Open rohanmahy opened 2 months ago
I tend to prefer a single ABNF, and clear guidance and examples for extensions.
I don't fully follow the comments regarding extensions to support comments, is the suggestion that if a single ABNF is provided then it must not define support for comments and leave that to extension? I don't like that.
If comments are just used as an example here, can we use a new app specific literal as an example instead?
I tend to prefer a single ABNF, and clear guidance and examples for extensions.
Me too. I'm sorry that I didn't provide more text about what that guidance would be. Basically, someone writing a new app-string, writes the grammar using as many of the productions in the base/existing ABNF as possible. If they allow whitespace and want both slash and hash comment, insert the S
production (see the example below for e). If they want hash comments only (because / is meaningful inside), insert the B
production (see the example below for ip).
Regardless if comments are used inside the app-string, the new app-string definition is added to the end of the known-app-str
production.
That's it.
I don't fully follow the comments regarding extensions to support comments, is the suggestion that if a single ABNF is provided then it must not define support for comments and leave that to extension? I don't like that.
Comments are supposed to be treated like whitespace in EDN. Without this PR, the four app strings defined in this draft have the following characteristics:
While I see no reason to add comments inside an ip address, If you were willing to make the application ignorine all whitespace and wanted to extend ip to support # comments, you could add the B
production into the lowest levels of the BNF (h16, colon, double colon, decoctet, uint/prefix). for example:
h16 = 1*4(HEXDIG / B)
Including https://www.ietf.org/archive/id/draft-ietf-cbor-edn-e-ref-00.html , there is no formal grammar defined, but the intent of e'' seems to be to allow anything that is valid as the id
production in CDDL; and the intent of ref'' seems to be to allow anything compatible with the generic URI syntax and literal filenames. Literal filenames don't have fungible whitespace, often contain slashes and could contain arbitrary characters such as #. URIs don't allow any whitespace and the / and # characters are used in their BNFs. The id
production allows [A-Za-z0-9@_$.-]. If the author wanted to allow comments and whitespace anywhere in e'', he could add the S
production inside the BNF of enum:
; app-string e
e = (%s"e" / %s"E") SQUOTE enum SQUOTE
enum = S EALPHA *(*("-" / ".") (EALPHA / DIGIT/ S))
EALPHA = ALPHA / "@" / "_" / "$"
Personally, I think the fact that the current grammars for ip and dt don't allow comments, and none of the examples for e or ref have comments is a strong indicator that nobody is going to bother incorporating comments in future app strings, largely because every new app string so far doesn't allow whitespace.
If comments are just used as an example here, can we use a new app specific literal as an example instead?
I'm not sure what you mean @OR13 . Hopefully I answered your questions somewhere else in this reply.
I meant that as the working group compares the approaches, an example that is complex enough to highlight differences might be valuable.
Also, all these comments are as an individual. I'd like to hear from other implementers,.
This currently has the ellipsis
rule defined twice, and the inside-h rule has "ellipsis"
instead of ellipsis
.
This currently has the
ellipsis
rule defined twice, and the inside-h rule has"ellipsis"
instead ofellipsis
.
Thanks Joe!
OK, here are some more issues then:
uint
defined more than onceS
defined more than onceALPHA *(ALPHA / DIGIT / "-")
, which means b64_4 (etc) are invalid.SQOUTE
is a typob64dig
is not definedALPHA
is not definedWith these changes, I was able to at least parse this.
OK, here are some more issues then:
* `uint` defined more than once * `S` defined more than once * Rule names according to RFC 5234 are `ALPHA *(ALPHA / DIGIT / "-")`, which means b64_4 (etc) are invalid. * There are a bunch of \xa0 non-breaking spaces scattered around, I bet unintentionally * `SQOUTE` is a typo * `b64dig` is not defined * `ALPHA` is not defined
With these changes, I was able to at least parse this.
OK, all fixed now. Not sure how the \xA0 got in there. :-\
There's also going to need a bunch of rework in the comment rules used in inside-*. None of them should accept an SQUOTE, I think. h'12 /'/ 34'
shouldn't work, for example.
Note that 'one-item' is an alternate entry point, so it's not unused.
There's also going to need a bunch of rework in the comment rules used in inside-*. None of them should accept an SQUOTE, I think.
h'12 /'/ 34'
shouldn't work, for example.
By my read, it does not allow /'/, since safe-unescaped specifically excludes <'> unless it is backslash escaped (see slash-escaped
and hash-escaped
). So h'12 / wasn\'t a problem / 34'
is OK.
As an existence proof that extending a single-layer ABNF is straightforward, edn-e-ref issue #1 shows an ABNF for e'' and ref'' that works with the ABNF in this PR. The changes specifically needed to make this work in a single layer were:
known-app-str =/ e / ref
sub-delims
(a five character change)The bulk of the work was fixing removing duplicate or unused productions.
Hi, I believe this PR contains the changes that would be needed to make a single ABNF for EDN. A future app-string that had no requirement for interior comments could just define its ABNF (with any unquoted characters from unescaped and DQUOTE), and append the app-string to the
known-app-str
production. A future app-string that wanted comments could add safely either slash-comment or hash-comment to its whitespace as long as "/" or "#" respectively did not have meaning in the interior grammar.Hope this helps. Thanks, -rohan