yaml / yaml-spec

YAML Specification
http://yaml.org/spec/

Eliminate lookbehind in grammar #300

Open Thom1729 opened 1 year ago

Thom1729 commented 1 year ago

The 1.2.2 spec defines the parsing algorithm to be very similar to a PEG. One significant difference is that the spec grammar uses a lookbehind, which is not part of the PEG model. This lookbehind is used only to ensure that the string " #" (white space followed by #) cannot appear in a plain scalar.

In 1.2.1:

[130] ns-plain-char(c)  ::= ( ns-plain-safe(c) - “:” - “#” )
                          | ( /* An ns-char preceding */ “#” )
                          | ( “:” /* Followed by an ns-plain-safe(c) */ )

In 1.2.2, we formalized this as:

[130] ns-plain-char(c) ::=
    (
        ns-plain-safe(c)
      - c-mapping-value    # ':'
      - c-comment          # '#'
    )
  | (
      [ lookbehind = ns-char ]
      c-comment          # '#'
    )
  | (
      c-mapping-value    # ':'
      [ lookahead = ns-plain-safe(c) ]
    )

I think we can avoid the lookbehind by adding [ lookahead ≠ '#' ] in a couple of places instead.
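
To make the idea concrete, here is a rough Python sketch rather than spec text. It models a heavily simplified single-line plain scalar twice: once with the lookbehind from production [130], and once with '#' allowed unconditionally but with a lookahead ≠ '#' guard after any run of white space. Everything in it is an assumption for illustration: the names FIRST, PLAIN_LOOKBEHIND, PLAIN_LOOKAHEAD and plain_prefix are hypothetical, ns-plain-safe(c) is approximated as "any non-white-space character", the first-character rule ignores most indicators, and the guard is placed in only one spot, which may not match the places the actual change would touch (multi-line scalars are not modeled at all).

```python
import re

# Simplified first character: not white space, not ':' or '#',
# or ':' followed by a non-space character. (Assumption: the real
# ns-plain-first excludes more indicators than this.)
FIRST = r"(?:[^\s:#]|:(?=\S))"

# Variant 1: mirrors production [130]; '#' is accepted only when the
# preceding character is not white space (the lookbehind).
CHAR_LOOKBEHIND = r"(?:[^\s:#]|(?<=\S)#|:(?=\S))"
PLAIN_LOOKBEHIND = re.compile(FIRST + r"(?:[ \t]*" + CHAR_LOOKBEHIND + r")*")

# Variant 2: '#' is an ordinary character, but a run of white space may
# only be consumed when the next character is not '#' (the lookahead).
CHAR_LOOKAHEAD = r"(?:[^\s:]|:(?=\S))"
PLAIN_LOOKAHEAD = re.compile(
    FIRST + r"(?:(?:[ \t]+(?!#))?" + CHAR_LOOKAHEAD + r")*"
)


def plain_prefix(pattern, text):
    """Return the longest plain-scalar prefix of text, or None if none."""
    m = pattern.match(text)
    return m.group(0) if m else None


if __name__ == "__main__":
    samples = ["a#b", "a #b", "a b", "a: b", "a:b #c", "#a", "a\t# c", "a##"]
    for s in samples:
        lb = plain_prefix(PLAIN_LOOKBEHIND, s)
        la = plain_prefix(PLAIN_LOOKAHEAD, s)
        # Both formulations should agree on every sample.
        assert lb == la, (s, lb, la)
        print(repr(s), "->", repr(lb))
```

On these samples the two patterns accept the same prefixes, which is the property the proposal relies on: inside a plain scalar, the only way a '#' can fail the lookbehind is by following white space, so rejecting the white space ahead of it has the same effect.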