jsinger67 / parol

LL(k) and LALR(1) parser generator for Rust
https://jsinger67.github.io/
Apache License 2.0

About parol



ATTENTION - The main branch is subject to constant changes, so the experience can be bumpy

Therefore, please use an officially released version from crates.io or refer to one of the latest tags applied to the main branch.


This workspace contains four essential crates that are all separately released on crates.io.

New changes can be viewed in the change logs of the respective projects.

It also contains the VS Code extension parol-vscode, which is released on the VS Code marketplace.


parol is an LL(k) and LALR(1) parser generator for Rust.

It is an installable command-line tool that can generate complete parsers from a single grammar description file, including all AST data types you would otherwise have to design yourself. parol does this solely by analyzing your language's grammar. parol is also a library that you can use in your own crates.

You can control the process of AST type generation. First, you can mark elements for omission in your AST. You can also specify your own types for language elements.

Language description and language implementation are strictly separated in parol. You can therefore design your language's grammar without having to implement any processing, because generated parsers function as acceptors by default. This lets you do real rapid prototyping of your grammar.

parol generates a trait as the interface between your language processing and the generated parser. The trait contains a function for each non-terminal of your grammar, and you implement only those for the non-terminals you need to process. In the simplest case you implement only the trait function for the start symbol of your grammar, which is called after the whole input string has been parsed. This function is then called with a parameter that comprises the complete structure of the parsed document.

The parser calls the interface trait's functions automatically during parsing, via a separately generated adapter.

With such a generated interface trait you theoretically never have to let parol generate new code for you again, and you can concentrate on developing your language processing. In practice, though, a more iterative approach is often taken.
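The following is a minimal sketch of the interface-trait pattern described above, not parol's actual generated code. All names (`ListGrammarTrait`, `Number`, `List`, `drive_adapter`) are hypothetical stand-ins chosen for illustration; the real generated trait, AST types, and adapter will look different and are derived from your grammar.

```rust
// Hypothetical stand-ins for generated AST types (assumption: real names differ).
#[derive(Debug, Clone, PartialEq)]
pub struct Number(pub i64);

#[derive(Debug, Clone, PartialEq)]
pub struct List {
    pub items: Vec<Number>,
}

/// Sketch of an interface trait: one function per non-terminal.
/// Every function has a default body that does nothing, so an
/// implementation that overrides nothing behaves as a pure acceptor.
pub trait ListGrammarTrait {
    fn number(&mut self, _arg: &Number) -> Result<(), String> {
        Ok(())
    }

    /// Start symbol: called once, after the whole input has been parsed.
    fn list(&mut self, _arg: &List) -> Result<(), String> {
        Ok(())
    }
}

/// Pure acceptor: relies entirely on the default implementations.
pub struct Acceptor;

impl ListGrammarTrait for Acceptor {}

/// User-side language processing that only handles the start symbol.
#[derive(Default)]
pub struct Summer {
    pub total: i64,
}

impl ListGrammarTrait for Summer {
    fn list(&mut self, arg: &List) -> Result<(), String> {
        // The argument carries the complete structure of the parsed document.
        self.total = arg.items.iter().map(|n| n.0).sum();
        Ok(())
    }
}

/// Hypothetical stand-in for the generated adapter. In this sketch it is
/// handed an already built `List` and forwards it to the trait functions.
pub fn drive_adapter<T: ListGrammarTrait>(actions: &mut T, parsed: &List) -> Result<(), String> {
    for n in &parsed.items {
        actions.number(n)?;
    }
    actions.list(parsed)
}
```

Passing a `Summer` to `drive_adapter` processes the document, while passing an `Acceptor` only confirms that the calls succeed. This mirrors the idea that you add processing one non-terminal at a time without touching the grammar.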

Generated parsers

Other properties of parol

Why should you use LL(k) parsers in your language implementation?

The LL parsing technique is a top-down parsing strategy that always starts from the start symbol of your grammar. This symbol becomes the root node of the parse tree. The parser then tries to derive the left-most symbol first. All such symbols are processed in a pre-order traversal. During this process the parse tree is created from the root downwards.

Both processing the input and producing the parse tree in the 'natural' direction ensure that at every point during parsing you can see where you came from and what you want to derive next. parol's parse stack contains 'End of Production' markers which reflect the 'call hierarchy' of productions.

This helps tremendously when putting your language processing into operation. In contrast, anyone who has ever debugged an LR parser will remember the effect of 'coming out of nowhere'.

Although LL grammars are known to be less powerful than LR grammars, many use cases exist where LL grammars are sufficient. With support for more than one lookahead token, the difference in capability between traditional LR(1) grammars and LL(k) grammars becomes less and less distinct.
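To make the top-down process and the end-of-production markers concrete, here is a small table-driven LL(1) sketch for a toy grammar (`List: "[" Items "]"`, `Items: Num Tail | ε`, `Tail: "," Num Tail | ε`). It is an illustration under stated assumptions, not parol's implementation: the grammar, the `Sym::End` marker, and the `parse` function are all made up for this example. The function returns a trace of entered and left productions instead of a parse tree.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tok {
    LBracket,
    RBracket,
    Comma,
    Num,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NonTerm {
    List,
    Items,
    Tail,
}

/// Symbols that can live on the parse stack.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Sym {
    T(Tok),
    N(NonTerm),
    /// End-of-production marker for the production of this non-terminal.
    End(NonTerm),
}

/// Predict the right-hand side for `nt` from a single lookahead token.
fn predict(nt: NonTerm, la: Option<Tok>) -> Option<Vec<Sym>> {
    use NonTerm::*;
    use Sym::*;
    use Tok::*;
    match (nt, la) {
        (List, Some(LBracket)) => Some(vec![T(LBracket), N(Items), T(RBracket)]),
        (Items, Some(Num)) => Some(vec![T(Num), N(Tail)]),
        (Items, Some(RBracket)) => Some(vec![]),
        (Tail, Some(Comma)) => Some(vec![T(Comma), T(Num), N(Tail)]),
        (Tail, Some(RBracket)) => Some(vec![]),
        _ => None,
    }
}

/// Parse `tokens` top-down and return the enter/leave trace of productions.
pub fn parse(tokens: &[Tok]) -> Result<Vec<String>, String> {
    let mut trace = Vec::new();
    // Parsing always starts from the start symbol.
    let mut stack = vec![Sym::N(NonTerm::List)];
    let mut pos = 0;

    while let Some(top) = stack.pop() {
        let la = tokens.get(pos).copied();
        match top {
            Sym::T(expected) => {
                if la == Some(expected) {
                    pos += 1;
                } else {
                    return Err(format!("expected {:?}, found {:?} at {}", expected, la, pos));
                }
            }
            Sym::N(nt) => {
                let rhs = predict(nt, la)
                    .ok_or_else(|| format!("no production for {:?} on {:?} at {}", nt, la, pos))?;
                trace.push(format!("enter {:?}", nt));
                // The marker sits below the right-hand side, so it is popped
                // exactly when the whole production has been processed.
                stack.push(Sym::End(nt));
                // Push in reverse so the left-most symbol ends up on top.
                stack.extend(rhs.into_iter().rev());
            }
            Sym::End(nt) => trace.push(format!("leave {:?}", nt)),
        }
    }

    if pos == tokens.len() {
        Ok(trace)
    } else {
        Err(format!("trailing input at {}", pos))
    }
}
```

At any moment, the `End` markers still on the stack list the productions that are currently open, from innermost at the top to the start symbol at the bottom. This is the 'call hierarchy' the text refers to.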
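As a small illustration of what additional lookahead buys, consider a toy statement rule with two alternatives that both start with an identifier: an assignment (`Ident "=" Number`) and a call (`Ident "(" ")"`). The sketch below is a deliberate simplification with hypothetical names (`Token`, `Production`, `select`); a real LL(k) generator computes its lookahead decisions from the grammar rather than from hand-written patterns.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Token {
    Ident,
    Assign,
    LParen,
    RParen,
    Number,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Production {
    Assignment,
    Call,
}

/// Select a production for the statement rule by looking at no more than
/// `k` tokens of the remaining input.
pub fn select(remaining: &[Token], k: usize) -> Option<Production> {
    // Restrict the view to the first `k` tokens (or fewer at end of input).
    let window = &remaining[..remaining.len().min(k)];
    match window {
        [Token::Ident, Token::Assign, ..] => Some(Production::Assignment),
        [Token::Ident, Token::LParen, ..] => Some(Production::Call),
        // Either a syntax error or not enough lookahead to decide.
        _ => None,
    }
}
```

With `k = 1` the window only ever contains the identifier, so `select` cannot tell the alternatives apart; with `k = 2` the second token settles the choice without backtracking.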

Why should you use parol?

parol is simple. You can actually understand all parts of it without broader knowledge in parsing theory.

parol is fast. The use of deterministic automata ensures a minimal overhead during parsing, no backtracking needed.

parol is a true LL(k) parser. You won't find many working LL(k) parsers out there.

parol generates beautiful code that is easy to read, which makes debugging easier.

parol is young. Although this might be a problem sometimes, especially regarding the stability of the API, the best is yet to come.

parol is actively developed. Thus new features are likely to be added as the need arises.

Documentation

Examples

This project contains some introductory grammar examples, from entry level up to a more complex C-like expression language and an acceptor for the Oberon-0 grammar.

A complete Oberon-2 acceptor generated by parol can be found in the examples of this repository.

A rudimentary Basic interpreter strives to mimic a small part of C64 Basic.

A TOML parser can be found here.

I also provide a JSON Parser.

parol's input language processing is an additional and very practical example.

The book

A book explains some internals and the practical use of parol in detail. It is still a work in progress but should be considered the central documentation.

The video

This video explains the installation of parol and the language server to set up your working environment. It then shows the process of designing grammars with parol using an example project.

State of the project

parol has proven its performance in many examples and tests during its development. Projects ranging from small to large scale are also successfully using parol as their parser generator.

As of the release of version 1.0.0, parol can be used in production-like environments. Please check the licenses for the terms of use.

Dependencies

Please note that any necessary dependencies are automatically added to your new parol project if you use the parol new subcommand to create your new crate. The following sections are therefore for information only.

Runtime library

Crates that use parsers generated by parol must add a dependency on the parol_runtime crate. It provides the scanner and parser implementations needed. The parol_runtime crate is very lightweight.

Macros

As of version 0.13.0 you have to add the parol-macros crate to your dependencies if you use parol's auto-generation mode.

License

parol and its accompanying tools included in this workspace are free, open source and permissively licensed! Except where noted (below and/or in individual files), all code in this repository is dual-licensed under either:

at your option. This means you can select the license you prefer!

Your contributions

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Further readings

Contributors

Thanks to all the contributors for improving this project!