igordejanovic / parglare

A pure Python LR/GLR parser - http://www.igordejanovic.net/parglare/
MIT License

High-level API to embed parglare into other tools #116

Open KOLANICH opened 4 years ago

KOLANICH commented 4 years ago

Description

I have created a tool called UniGrammar. It transpiles grammars written in its own DSL, based on JSON objects, into grammars in other DSLs, compiles them (if needed) into artifacts that can actually be used, and generates wrappers so that the parsed trees can be used uniformly. It unifies other aspects too: storage of and access to compiled grammars, testing, and visualization (though visualization is backend-specific). For storage, there is a gen-bundle command that compiles a grammar with all the tools and stores the artifacts. The bundle can then be used without much attention to the tools: UniGrammarRuntime itself detects the fastest backend available on the user's system (it also stores benchmark results within the bundle).

parglare has 2 formats: the source one and a precompiled table, represented as Python lists and dicts and serialized into JSON. But the precompiled table is generated by a CLI app, not via an API, and is fetched automatically based on the path of the source file. The architecture of my runtime is such that everything is always loaded from memory: when testing I prefer not to create unneeded files, since some users have SSDs, and floating-gate transistors survive only a limited number of erase cycles before they degrade to the point of being useless. Also, JSON may not be the best format to store the table in, since it has the overhead that text-based formats have. I may want to replace it with, for example, CBOR.
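To make the request concrete, here is a minimal sketch of what loading a table from memory with a swappable serialization format could look like. Everything in it is an assumption for illustration: dump_table, load_table, the CODECS registry and the table layout are hypothetical and are not parglare's actual API or table format. CBOR support relies on the third-party cbor2 package and is simply skipped when that package is not installed.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical registry: format name -> (encode, decode) pair.
Codec = Tuple[Callable[[Any], bytes], Callable[[bytes], Any]]

CODECS: Dict[str, Codec] = {
    "json": (
        lambda obj: json.dumps(obj).encode("utf-8"),
        lambda data: json.loads(data.decode("utf-8")),
    ),
}

try:
    import cbor2  # third-party and optional; registered only if available

    CODECS["cbor"] = (cbor2.dumps, cbor2.loads)
except ImportError:
    pass


def dump_table(table: Any, fmt: str = "json") -> bytes:
    """Serialize a table made of plain lists and dicts into bytes."""
    encode, _ = CODECS[fmt]
    return encode(table)


def load_table(data: bytes, fmt: str = "json") -> Any:
    """Restore a table from an in-memory buffer; no file is touched."""
    _, decode = CODECS[fmt]
    return decode(data)


# Made-up table layout, only to show the round trip.
example_table = {"start": 0, "states": [[0, "shift", 1], [1, "accept", None]]}
buffer = dump_table(example_table)
restored = load_table(buffer)
```

The point of the sketch is only that the caller owns the bytes: where they come from (a bundle, a test fixture, a network socket) and which format they use is decided outside the parser library.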

So I wonder if it makes sense to

An API for tracing is a tricky one. Most tools have different kinds of tracing, and none of them visualizes the trace automatically. For example, ANTLR prints tokens, actions and errors to stdout. I guess for tracing we can use the following very generic interface: just a collection of in-memory buffers, each with some metadata describing its purpose and format. I.e. just an object with the fields tokens: typing.Optional[str] for tokens, whose purpose is to be printed to stdout; actions: typing.Optional[str], a text repr of actions; and actions_graph: typing.Optional[str], a GraphViz graph source whose purpose is to be rendered on screen or into a file. Or we may want to get the actions and tokens in an object-oriented format. I have not yet decided. Anyway, there shouldn't be any side effects, such as direct output to stdout or closing the app.
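The three fields described above can be sketched as a small dataclass. This is only an illustration under assumptions: the class name TraceResult and the helper available() are hypothetical, and nothing like this exists in parglare or UniGrammar as far as this thread states.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class TraceResult:
    """Hypothetical side-effect-free container for trace output."""

    # Plain text listing of tokens; the caller decides whether to print it.
    tokens: Optional[str] = None
    # Text representation of the parser actions.
    actions: Optional[str] = None
    # GraphViz (DOT) source; the caller decides whether and where to render it.
    actions_graph: Optional[str] = None

    def available(self) -> List[str]:
        """Names of the buffers the backend actually filled in."""
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


# A backend that produces tokens and a graph, but no textual action log.
trace = TraceResult(
    tokens="NUMBER(1) PLUS NUMBER(2)",
    actions_graph="digraph { s0 -> s1; }",
)
filled = trace.available()
```

Because every field is optional and the object only holds strings, a backend fills in what it supports and the embedding tool chooses what to do with each buffer, which keeps stdout and the process lifetime under the caller's control.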

Do you consider refactoring parglare this way acceptable?

Also, it would be nice to have the reciprocal mapping, i.e. transpilation from parglare grammars into UniGrammar ones. Is it better to have it within parglare or within UniGrammar?

igordejanovic commented 4 years ago

Maybe you would be interested in the discussion on #78, which covers some of the ideas you presented here (if I understood you correctly). There is a branch with the implementation, although it will need an update/rework to incorporate the newest parglare changes. Basically, with that approach the textual parglare grammar language is just one of many possible syntaxes.