netmod-wg / yang-next

Feature requests for future versions of YANG

Provide a correct ABNF for Yang strings #6

Open daleworley opened 8 years ago

daleworley commented 8 years ago

(This is derived from a mailing list message, http://www.ietf.org/mail-archive/web/netmod/current/msg16136.html.)

The issue that concerns me is that the ABNF doesn't specify what is allowed as a string. I'm used to programming language definitions, where the grammar is specified quite rigidly, to the point that the ABNF can be input to a parser generator. In this document, the ABNF is quite complete except for a specification of strings. On the other hand, the text description of strings seems to be sufficient for an implementer, so we don't actually need to provide ABNF. My strong preference is to provide a complete ABNF, as is the norm for programming languages.

The following is a complete ABNF for Yang strings. Of course, it's a bit complicated, because the definition of strings in Yang actually is a bit complicated.

[The following text assumes a fixed-width font.]

string = unquoted-string / quoted-string

unquoted-string = *unquoted-item ( unquoted-item / "/" / "*" *"*" )
    ;; a sequence of one or more characters from
    ;; (ordinary-char / "/" / "*"), not containing
    ;; "//", "/*", or "*/"

unquoted-item = ordinary-char / "/" ordinary-char / "*" *"*" ordinary-char

ordinary-char = < any character matching yang-char, except >
                < space, tab, newline, carriage return, >
                < semicolon, left brace, right brace, >
                < slash, and asterisk >
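As a rough check of the unquoted-string rules, they can be transcribed into a regular expression. This is only a sketch: it reads the asterisk terminals as `"*"` and the leading repetition as `*unquoted-item`, which is what the rule's own comment implies. It does not enforce the yang-char restrictions, and the names `ORDINARY`, `UNQUOTED_ITEM`, and `is_unquoted_string` are made up for illustration.

```python
import re

# ordinary-char: anything except space, tab, newline, carriage return,
# semicolon, braces, slash, and asterisk. The yang-char exclusions
# (control characters, surrogates, noncharacters) are NOT checked here.
ORDINARY = r"[^ \t\n\r;{}/*]"

# unquoted-item = ordinary-char / "/" ordinary-char / "*" *"*" ordinary-char
UNQUOTED_ITEM = rf"(?:{ORDINARY}|/{ORDINARY}|\*+{ORDINARY})"

# unquoted-string = *unquoted-item ( unquoted-item / "/" / "*" *"*" )
UNQUOTED_STRING = re.compile(rf"{UNQUOTED_ITEM}*(?:{UNQUOTED_ITEM}|/|\*+)")


def is_unquoted_string(text: str) -> bool:
    """Return True if the whole text matches the unquoted-string rule."""
    return UNQUOTED_STRING.fullmatch(text) is not None
```

With this transcription, inputs such as `a/b`, `a*b`, and `a/` are accepted, while anything containing `//`, `/*`, or `*/` is rejected, because a slash or a run of asterisks must either be followed by an ordinary-char or end the string.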

quoted-string = ( single-quoted-string / double-quoted-string )
                *( optsep "+" optsep
                   ( single-quoted-string / double-quoted-string ) )

(IIRC, there can be whitespace (including newlines) around + but not comments.)

single-quoted-string = SQUOTE *sq-char SQUOTE

sq-char = < any character matching yang-char, except > < SQUOTE >

double-quoted-string = DQUOTE *dq-item DQUOTE

dq-item = dq-char / "\n" / "\t" / "\" DQUOTE / "\\"

dq-char = < any character matching yang-char, except > < DQUOTE and backslash >

(The existing production for yang-string is removed.)

;; any Unicode or ISO/IEC 10646 character including tab, carriage
;; return, and line feed, but excluding the other C0 control
;; characters, the surrogate blocks, and the noncharacters.
yang-char = %x09 / %x0A / %x0D / %x20-D7FF / [continuing as before]
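To show that the quoted-string rules are mechanical enough to implement directly, here is a small Python sketch that recognizes a quoted-string and returns its concatenated value. It makes several assumptions: optsep is treated as plain whitespace, the four escapes in double-quoted strings are backslash followed by `n`, `t`, a double quote, or a backslash, yang-char is not enforced, and YANG's whitespace trimming inside multi-line double-quoted strings is not modeled. The function name `decode_quoted_string` is hypothetical.

```python
import re

# single-quoted-string = SQUOTE *sq-char SQUOTE
SQ = r"'[^']*'"
# double-quoted-string = DQUOTE *dq-item DQUOTE, where dq-item is a
# dq-char or one of the four escapes \n, \t, \" and \\
DQ = r'"(?:[^"\\]|\\[nt"\\])*"'
PIECE = re.compile(rf"{SQ}|{DQ}")
# optsep "+" optsep, with optsep approximated as whitespace only
PLUS = re.compile(r"\s*\+\s*")

ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def decode_quoted_string(text: str) -> str:
    """Parse a quoted-string and return the concatenated value."""
    pos = 0
    parts = []
    while True:
        piece = PIECE.match(text, pos)
        if piece is None:
            raise ValueError(f"expected a quoted string at offset {pos}")
        body = piece.group()[1:-1]
        if piece.group()[0] == '"':
            # Only double-quoted strings have escape sequences.
            body = re.sub(r'\\([nt"\\])', lambda e: ESCAPES[e.group(1)], body)
        parts.append(body)
        pos = piece.end()
        if pos == len(text):
            return "".join(parts)
        plus = PLUS.match(text, pos)
        if plus is None:
            raise ValueError(f"expected '+' at offset {pos}")
        pos = plus.end()
```

For example, `decode_quoted_string("'a' + \"b\"")` yields the two pieces joined together, and a double-quoted string containing an unlisted escape such as `\q` raises `ValueError` because no dq-item alternative matches it.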

rgwilton commented 7 years ago

I find the ABNF in the YANG RFC to not be as simple or useful as it could be.

When writing a YANG parser, I found that there were really two phases: (1) Check that the source file is syntactically correct. The structure of YANG means that tokenizing the file and checking that the syntax is valid is mostly straightforward, and can even be written in a hard-coded way. (2) The second phase is a grammar check, which I perform after a syntax tree has been generated.

I think that it would also be more useful to have an ABNF for (1), and then to embed the grammar rules into the main body of the document where the statements are being described, or otherwise, just have a table that lists them.

I have found that having a single ABNF that covers both (1) and (2) is not really helpful.
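A minimal Python sketch of the two-phase split might look like the following. It assumes the generic statement shape `keyword [argument] (";" / "{" *statement "}")` for phase (1) and a lookup table of allowed substatements for phase (2). The tokenizer is simplified (no comments, no `+` concatenation), and `ALLOWED_SUBSTATEMENTS` is a tiny illustrative subset, not the real rules from the RFC. All names here are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Simplified tokens: quoted strings, ";", "{", "}", or a run of other characters.
TOKEN = re.compile(r"""'[^']*'|"(?:[^"\\]|\\.)*"|[;{}]|[^\s;{}]+""", re.DOTALL)


@dataclass
class Statement:
    keyword: str
    argument: Optional[str] = None
    children: List["Statement"] = field(default_factory=list)


def parse_statements(tokens, pos=0, nested=False):
    """Phase 1: check syntax only and build a tree of generic statements."""
    statements = []
    while pos < len(tokens):
        if tokens[pos] == "}":
            if not nested:
                raise SyntaxError("unexpected '}'")
            return statements, pos + 1
        keyword = tokens[pos]
        pos += 1
        if keyword in (";", "{"):
            raise SyntaxError("expected a keyword, got %r" % keyword)
        argument = None
        if pos < len(tokens) and tokens[pos] not in (";", "{", "}"):
            argument = tokens[pos]
            pos += 1
        if pos >= len(tokens) or tokens[pos] not in (";", "{"):
            raise SyntaxError("statement %r must end with ';' or a block" % keyword)
        stmt = Statement(keyword, argument)
        if tokens[pos] == "{":
            stmt.children, pos = parse_statements(tokens, pos + 1, nested=True)
        else:
            pos += 1
        statements.append(stmt)
    if nested:
        raise SyntaxError("missing '}'")
    return statements, pos


# Illustrative subset only; a real table would cover every YANG statement.
ALLOWED_SUBSTATEMENTS = {
    "container": {"leaf", "container", "description"},
    "leaf": {"type", "description"},
    "type": set(),
    "description": set(),
}


def check_grammar(stmt):
    """Phase 2: check substatements against the table; return error strings."""
    errors = []
    allowed = ALLOWED_SUBSTATEMENTS.get(stmt.keyword)
    for child in stmt.children:
        # Keywords missing from the table are skipped rather than rejected.
        if allowed is not None and child.keyword not in allowed:
            errors.append("%r is not allowed under %r" % (child.keyword, stmt.keyword))
        errors.extend(check_grammar(child))
    return errors
```

Calling `parse_statements(TOKEN.findall(source))` returns the tree and the final token position; `check_grammar` can then be run on each top-level statement. Keeping the table separate from the parser is the point of the split: phase (1) never needs to know which keywords exist.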

Rob

schoenw commented 6 years ago

I am not sure what exactly is broken, but if something is broken, it should be fixed. Rewriting the grammar to make it more beautiful is a non-goal, since beauty is in the eye of the beholder.

abierman commented 1 month ago

Agree that bugfixes and docfixes are automatically included in yang-next. They are in the same category as Verified Errata.