DmitrySoshnikov / syntax

Syntactic analysis toolkit, language-agnostic parser generator.
MIT License

feature request: embedded in-rule actions #65

Open namiwang opened 5 years ago

namiwang commented 5 years ago

Rules in yacc/bison may contain embedded (mid-rule) actions, which are useful when constructing complex grammars.

Do you think we could have this feature in Syntax?

reference: http://dinosaur.compilertools.net/bison/bison_6.html#SEC48

DmitrySoshnikov commented 5 years ago

@namiwang, yeah, it would be a good addition to support L-attributed grammars (where each non-terminal on the RHS can have its own handler, and can depend on the symbols to its left).

For now this can be partially emulated with "empty" (epsilon) productions. See LhsHandler and RhsHandler below: they don't consume any tokens, but each has a semantic action block that is executed when the production is reduced.

%lex

%%

\s+     /* skip whitespace */
\d+     return 'NUMBER'

/lex

%%

Expression
  : Lhs LhsHandler '+' Rhs RhsHandler {
    console.log('Result handler');
    $$ = Number($Lhs) + Number($Rhs);
  };

Lhs
  : NUMBER;

Rhs
  : NUMBER;

LhsHandler
  : { console.log('LhsHandler'); };

RhsHandler
  : { console.log('RhsHandler'); };

Result:

.dmitrys:~$ syntax-cli -g ~/mid-rule.g -m lalr1 -p '5 + 4'

Parsing mode: LALR1_BY_SLR(1).

Parsing:

5 + 4

LhsHandler
RhsHandler
Result handler

✓ Accepted

Parsed value:

9
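
The order of the three log lines above follows from where the empty productions sit in the rule: each one is reduced right after the symbol to its left has been recognized. The following is a minimal hand-written Rust sketch of that ordering, not the parser that Syntax generates. The names `Token`, `tokenize`, and `parse_expression` are hypothetical, and the tokenizer here assumes whitespace-separated input, unlike the lexer in the grammar.

```rust
/// Hypothetical token type mirroring the two token kinds in the grammar.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Number(i64),
    Plus,
}

/// Simplified tokenizer (assumption: lexemes are separated by whitespace).
/// Returns None if a lexeme is neither "+" nor an integer.
fn tokenize(input: &str) -> Option<Vec<Token>> {
    input
        .split_whitespace()
        .map(|lexeme| match lexeme {
            "+" => Some(Token::Plus),
            digits => digits.parse::<i64>().ok().map(Token::Number),
        })
        .collect()
}

/// Hand-written stand-in for:
///   Expression : Lhs LhsHandler '+' Rhs RhsHandler { ... }
/// Each "handler" is recorded in `log` at the point where the
/// corresponding empty production would be reduced.
fn parse_expression(tokens: &[Token], log: &mut Vec<String>) -> Option<i64> {
    let mut iter = tokens.iter();

    // Lhs : NUMBER
    let lhs = match iter.next()? {
        Token::Number(n) => *n,
        _ => return None,
    };
    // LhsHandler : /* empty */ { ... } runs after Lhs, before '+'.
    log.push("LhsHandler".to_string());

    // '+'
    match iter.next()? {
        Token::Plus => {}
        _ => return None,
    }

    // Rhs : NUMBER
    let rhs = match iter.next()? {
        Token::Number(n) => *n,
        _ => return None,
    };
    // RhsHandler : /* empty */ { ... } runs after Rhs.
    log.push("RhsHandler".to_string());

    // Reject trailing input, then run the action of the whole rule.
    if iter.next().is_some() {
        return None;
    }
    log.push("Result handler".to_string());
    Some(lhs + rhs)
}
```

Calling `tokenize("5 + 4")` and passing the tokens to `parse_expression` with an empty log yields `Some(9)` and the same handler order as the CLI run above.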

namiwang commented 5 years ago

@DmitrySoshnikov

Thanks! It totally works.

One small issue: in most cases the embedded action does not need a return value, but currently I have to use a hack like ||->DUMMY; $$=Dummy;.

Does syntax support a more elegant way to implement a no-return-value action?

DmitrySoshnikov commented 5 years ago

@namiwang, are you using the Rust plugin?

Usually, if a production doesn't have a handler, it can be omitted altogether. In the Rust plugin specifically, you also don't need to specify types if you just propagate the value:

If an argument is just propagated without any operation, the type declarations can be omitted, as in the last production ( Expr ) where we just return the $$ = $2.

See details in this small tutorial.

Can you show a small grammar example when you have to use the dummy return value?

namiwang commented 5 years ago

@DmitrySoshnikov

Well, the simplest case would be: I only want to mutate some state, like updating the state of the internal tokenizer, so the action does not need a return value.

fake_embedded_action: {
    self.tokenizer.whatever();
};

will generate

fn _handler123(&mut self) -> SV {
  // Semantic values prologue.
  self.tokenizer.whatever();
  __
}

which is not a valid function, because the return value __ does not exist.
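
One possible shape for a fix, sketched here under assumptions rather than taken from the Syntax code base, is for a handler that never assigns a result to return a placeholder variant of the stack-value type. All names below (`SV::Undefined`, `SV::Number`, `Tokenizer`, `Parser`, and both handler functions) are hypothetical stand-ins for illustration.

```rust
/// Hypothetical stand-in for the generated stack-value enum.
#[derive(Debug, PartialEq)]
enum SV {
    /// Placeholder for actions that produce no value (assumption).
    Undefined,
    /// Example of a variant that carries a real semantic value.
    Number(i64),
}

/// Hypothetical tokenizer with some mutable state.
struct Tokenizer {
    calls: u32,
}

impl Tokenizer {
    /// Stands in for `self.tokenizer.whatever()` from the example above.
    fn whatever(&mut self) {
        self.calls += 1;
    }
}

struct Parser {
    tokenizer: Tokenizer,
}

impl Parser {
    /// Side-effect-only action: mutates tokenizer state and returns
    /// the placeholder instead of an undefined `__` binding.
    fn handler_without_value(&mut self) -> SV {
        self.tokenizer.whatever();
        SV::Undefined
    }

    /// Ordinary action that does produce a value, for contrast.
    fn handler_with_value(&mut self, n: i64) -> SV {
        SV::Number(n)
    }
}
```

With this shape, a grammar author would not need to declare a dummy type or assign a dummy result just to satisfy the function signature.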

namiwang commented 5 years ago

Another case would be an action that is used for error recovery, or one that should panic.

DmitrySoshnikov commented 5 years ago

Gotcha, thanks. Yeah, I'll take a look into it, and would appreciate a PR if you get to it earlier.