arr-ai / wbnf

ωBNF implementation
Apache License 2.0

Add a way to disable any .wrapRE's in a given rule #7

Open ghost opened 4 years ago

ghost commented 4 years ago

To capture indentation we need to capture leading whitespace. Currently the .wrapRE rule blocks any attempt to capture whitespace, so we need a way to tell the parser not to wrap part or all of a rule.
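To make the problem concrete, here is a minimal Python sketch (not wbnf code). It assumes a wrapper of the form `\s*()\s*`, where the `()` placeholder is replaced by the terminal's regex; the `wrap` helper and the `INDENT` pattern are hypothetical names used only for illustration.

```python
import re

def wrap(pattern: str, wrapper: str = r"\s*()\s*") -> str:
    """Substitute `pattern` into the '()' placeholder of `wrapper`.

    Hypothetical stand-in for what a .wrapRE directive does to a terminal.
    """
    return wrapper.replace("()", "(" + pattern + ")", 1)

INDENT = r"[ \t]+"   # hypothetical terminal meant to capture indentation
line = "    foo"     # four leading spaces

bare = re.match(INDENT, line)           # terminal used as written
wrapped = re.match(wrap(INDENT), line)  # terminal after wrapping

print(repr(bare.group(0)))     # '    ' : the full indent is captured
print(repr(wrapped.group(1)))  # ' '    : the wrapper's leading \s* took the rest
```

The wrapped terminal still matches, but the greedy `\s*` in the wrapper consumes all but one space, so the indent width is lost.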

Unsure how to implement this.

marcelocantos commented 4 years ago

One possible syntax is:

.wrapRE -> A | /{foo} | "bar" | /{...()...};

This excludes production A, which should be a single-terminal production, and all occurrences of /{foo} and "bar" from wrapping.
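As a rough sketch of the intended semantics (in Python, not wbnf), wrapping could be applied to every terminal except those named in an exclusion set. The `apply_wrap` function, the terminal table, and the choice to match exclusions by either production name or literal pattern are all assumptions made for illustration.

```python
import re

def apply_wrap(terminals: dict, wrapper: str, exclude: set) -> dict:
    """Return a copy of `terminals` with `wrapper` applied to each pattern,
    skipping any entry whose name or pattern appears in `exclude`.

    `wrapper` must contain a '()' placeholder for the wrapped pattern.
    """
    result = {}
    for name, pattern in terminals.items():
        if name in exclude or pattern in exclude:
            result[name] = pattern  # left unwrapped
        else:
            result[name] = wrapper.replace("()", "(?:" + pattern + ")", 1)
    return result

# Hypothetical terminal table: production name -> regex source.
terminals = {"A": r"[ \t]+", "IDENT": r"[a-z]+", "FOO": "foo"}

wrapped = apply_wrap(terminals, r"\s*()\s*", exclude={"A", "foo"})

# A is excluded by name, FOO by its pattern; only IDENT is wrapped.
indent = re.match(wrapped["A"], "    x").group(0)
ident = re.match(wrapped["IDENT"], "  abc  ").group(0)
```

Here `A` keeps its ability to capture the full leading whitespace, while `IDENT` still tolerates whitespace on either side.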

ghost commented 4 years ago

@marcelocantos New proposal.

Change the wbnf grammar to:

stmt       -> COMMENT | prod | MAGICRULES;

MAGICRULES -> wrapre=(".wrapRE" "->" RE)
            | onlyWrap=(".wrap" "->" IDENT:"|")
            | wrapterm=(".wrapTerm" "->" (prod=IDENT "=" "(" (@ | term)+ ")")+);

This documents the three magic rules in the grammar itself.

If I have time on Sunday I might try implementing this. (When subgrammars are supported, these rules will be simpler.)

marcelocantos commented 4 years ago

The idea behind the syntax I suggested was to use the existing grammar to shoehorn in the additional concepts. If we want to explicitly define them as part of the grammar, I have no huge objection, but if we're going to go to the effort, then you don't really need to pretend that they are "rules". You could provide a more specific syntax, e.g. (not proposing, just thinking out loud):

wrap (\s* /{} \s*) exclude A /{foo} "bar";
wrap (\s* /{} \s*) include IDENT x y;
wrap (block -> \s* () \s* | COMMENT);

As an alternative to the /{} and () placeholders, you could just have explicit names like re, str and term. You could also support multiple in a single wrap:

wrap (\s* (re|str) \s*) exclude A /{foo} "bar";

Obviously, the name MAGICRULES would no longer be applicable. PRAGMA seems apt.
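To get a feel for how such pragma lines might be pulled apart, here is a small Python sketch. It treats the syntax above purely as text and is not a wbnf implementation; `WrapPragma` and `parse_wrap` are hypothetical names, and the splitting rules (template is everything up to the last closing parenthesis, targets are whitespace-separated) are assumptions.

```python
import re
from typing import NamedTuple, Optional, Tuple

class WrapPragma(NamedTuple):
    template: str            # text between the outer parentheses
    mode: Optional[str]      # "include", "exclude", or None
    targets: Tuple[str, ...] # productions or terminals the mode applies to

_PRAGMA = re.compile(
    r"^wrap\s*\((?P<template>.*)\)\s*"
    r"(?:(?P<mode>include|exclude)\s+(?P<targets>[^;]*))?;$"
)

def parse_wrap(line: str) -> WrapPragma:
    """Split one 'wrap (...) [include|exclude targets];' line into parts.

    Limitation: a target containing ')' or whitespace (for example the
    string ")") is not handled by this sketch.
    """
    m = _PRAGMA.match(line.strip())
    if m is None:
        raise ValueError("not a wrap pragma: " + line)
    targets = tuple((m.group("targets") or "").split())
    return WrapPragma(m.group("template").strip(), m.group("mode"), targets)

excl = parse_wrap(r'wrap (\s* /{} \s*) exclude A /{foo} "bar";')
block = parse_wrap(r"wrap (block -> \s* () \s* | COMMENT);")
```

A real implementation would presumably express the pragma in the wbnf grammar itself rather than with a regex; this only illustrates the pieces a parser would need to recover.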

marcelocantos commented 4 years ago

It's probably worth thinking about #19 in all of this. Regexps will eventually go away as a concept, which may or may not impact the way we think about the above.

ghost commented 4 years ago

Wow. Yeah, that looks pretty powerful. I just wonder how complicated it would be to actually implement.