gkappler / CombinedParsers.jl

Compiled parser combinators and regular expressions in pure julia
MIT License
77 stars 10 forks source link

Bootstrapping parsers #16

Open kskyten opened 3 years ago

kskyten commented 3 years ago

Really nice work on the package! It looks like the best parser generator for Julia at the moment. Here's a few ideas I had about bootstrapping new parsers:

How would you write an importer for those grammar files (i.e. generate CombinedParser.jl code based on the above grammar specifications to develop custom interpreters)?

gkappler commented 3 years ago

Thanks a lot!

Writing such importers should be similar to the CombinedParsers.Regexp module that parses PCRE grammar. I did not get to it yet, also because I was looking for how to best add Transformations to those grammars. The seamless integration of grammar with transformations and julia type inference IMO is the most powerful feature that sets CombinedParsers apart and ahead of alternatives.

I came up with the following idea:

  1. copy & paste grammar format (bnf, Lark, antlr) and process with a macro, e.g. p=bnf"...."
  2. use deepmap_parser to insert Transformations, like transform(some_function, p, :name) or transform(p, :name => some_function, :name2 => some_other_function).
kskyten commented 3 years ago

I don't understand how the second step works. How does deepmap_parser work? Maybe it would be better to just transform the other grammars to the CombinedParser format and generate code for that. Then you could add transformations to the generated code as usual.

gkappler commented 3 years ago

Finally an update: I had to refactor deepmap_parser to follow a clear process that can bee hooked into and documented it below. (It had a lot of bottom-up dev dispatch cruft -- So even me did not really understand how it worked any more. Online version pending a new package dependeny period, coming soon.) Sorry but my timeline got crushed and I had limited time resources to put in CP.

I want to use it next for inserting map functions like described and for tracing/debugging, and logging is now more flexible and the code might be clean enough to understand from it what types of parser rewrites can be done. Looking into EBNF, I intend to use deepmap_parser to identify parser parts where recursion without moving position can occur and wrapping these parts in a RecursionStop parser that keeps track of kind of a match stack. Work in progress... (slowly as currently I must put funding first :-| )

If you have some EBNF unit test cases or can provide resources that would help a lot to speed things up!

deepmap_parser(f::Function[, mem::AbstractDict=IdDict()], x::CombinedParser,a...;kw...)

Perform a deep transformation of a x.

Default method

  1. Returns cached result if haskey(x, mem) to avoid infinite recursion.
  2. construct deep transformation dt = _deepmap_parser(f, mem, x, a...; kw...)
  3. cache and return f(dt, a...; kw...)

For a new CombinedParser, define either deepmap_parser or _deepmap_parser.

For a parser transformation f, define either