ptal / oak

A typed parser generator embedded in Rust code for Parsing Expression Grammars
Apache License 2.0
142 stars 14 forks source link

Use a semantic actions commit-list during parsing / Partial parsing #44

Open ptal opened 9 years ago

ptal commented 9 years ago

Currently, the semantic action are called whenever the sub-expression is parsed. It means that if the parsing fails upwards, the result of the semantic action will be dropped. This is terrible if the semantic action has side-effect. It also will avoid to copy arguments passed to rules (when it'll be implemented) and it'll probably optimize the whole parsing step since no semantic rule will be called for nothing.

ptal commented 9 years ago

Probably the easiest technique to implement this is to parse a first time the input without calling semantic actions or constructing the AST and to register the "good path". And to reparse it in a second pass by effectively calling the actions.

ptal commented 9 years ago

This "path" could also be used for recording a partial state where more of the input is needed to continue. It might be less efficient?

ptal commented 9 years ago

A version that does not record path is already implemented by &rule rule, &rule try to parse the input and only construct the AST with rule in case of success.

ptal commented 8 years ago

We need benchmarks for measuring how fast is the recognizing step vs. parsing step. If it is fast enough, I propose to record the path when recognizing. It is doable by compiling the choice as follow:

e1 / e2 --->

let idx = state.replay.pop().unwrap();
match idx {
  0 => [e1],
  1 => [e2]
}

During recognizing we record the corresponding index into a replay index vector. I'm not sure if it's wise to make this the default and to store replay inside the state.

ptal commented 3 years ago

I suggest a simpler version, useful to avoid calling semantic action with side effects on a global environment, by englobing semantic actions in lambda functions. For instance, e1 e2 > f will produce a function || -> f(arg1(), arg2()) where arg1, arg2 are also functions generated by e1 and e2. Later on, it is possible to pass an environment as parameter, connecting to #10. It doesn't seem too hard to implement.