DiscreteTom / retsac

Text lexer and parser. Compiler frontend framework.
https://discretetom.github.io/retsac/
MIT License
7 stars 1 forks source link
bison compiler flex lexer llvm lr parser

Retsac

npm coverage build license

[!WARNING] This project is still in early development stage, the API may change frequently.

Text lexer and parser. Compiler frontend framework.

This can be used to fast prototype your own programming language compiler/translator frontend, or parse your domain specific language.

Try it online in the playground.

Installation

yarn add retsac

Features

Resources

Examples

JSON Parser

In this example, we use AdvancedBuilder to define grammar rules with +*?, define top-down traversers using traverser, and query nodes in grammar rules using $ and $$.

All conflicts are auto resolved.

Click to Expand ```ts const lexer = new Lexer.Builder() .ignore(Lexer.whitespaces()) // ignore blank characters .define({ // built-in support for JSON string: Lexer.json.stringLiteral(), number: Lexer.json.numericLiteral(), }) .define(Lexer.wordKind("true", "false", "null")) // token's kind name equals to the literal value .anonymous(Lexer.exact(..."[]{},:")) // single char borders without a kind name .build(); export const builder = new ELR.AdvancedBuilder({ lexer }) .data() .define( { value: `string | number | true | false | null` }, // eval the only child's text to get the value (d) => d.traverser(({ children }) => eval(children[0].text!)), ) .define( { value: `object | array` }, // call the only child's traverse method to get the object/array value (d) => d.traverser(({ children }) => children[0].traverse()), ) .define( // `?` for zero or one, `*` for zero or more, use `()` to group // quote literal values with `'` or `"` { array: `'[' (value (',' value)*)? ']'` }, // use `$$` to select all children with the given name // traverse all values in the array and return the result as an array (d) => d.traverser(({ $$ }) => $$(`value`).map((v) => v.traverse())), ) .define( // use `@` to rename a grammar { object_item: `string@key ':' value` }, (d) => // return an object // use `$` to select the first child with the given name d.traverser(({ $ }) => ({ // eval the key's value, then traverse child to get the entry's value [eval($(`key`)!.text!)]: $(`value`)!.traverse(), })), ) .define({ object: `'{' (object_item (',' object_item)*)? '}'` }, (d) => d.traverser(({ $$ }) => // every object_item's traverse result is an object, we need to merge them. Object.assign({}, ...$$(`object_item`).map((item) => item.traverse())), ), ); ```

Calculator

In this example, we use reducer to define bottom-up data reducers, so we can get the result when the AST is built.

There are conflicts introduced by those grammar rules, we use the high-level resolver API priority to resolve them.

Click to Expand ```ts const lexer = new Lexer.Builder() .ignore(Lexer.whitespaces()) // ignore blank characters .define({ number: /[0-9]+(?:\.[0-9]+)?/ }) .anonymous(Lexer.exact(..."+-*/()")) // operators .build(); export const builder = new ELR.ParserBuilder({ lexer }) .data() .define({ exp: "number" }, (d) => // the result of the reducer will be stored in the node's value d.reducer(({ matched }) => Number(matched[0].text)), ) .define({ exp: `'-' exp` }, (d) => d.reducer(({ values }) => -values[1]!)) .define({ exp: `'(' exp ')'` }, (d) => d.reducer(({ values }) => values[1])) .define({ exp: `exp '+' exp` }, (d) => d.reducer(({ values }) => values[0]! + values[2]!), ) .define({ exp: `exp '-' exp` }, (d) => d.reducer(({ values }) => values[0]! - values[2]!), ) .define({ exp: `exp '*' exp` }, (d) => d.reducer(({ values }) => values[0]! * values[2]!), ) .define({ exp: `exp '/' exp` }, (d) => d.reducer(({ values }) => values[0]! / values[2]!), ) .priority( { exp: `'-' exp` }, // highest priority [{ exp: `exp '*' exp` }, { exp: `exp '/' exp` }], [{ exp: `exp '+' exp` }, { exp: `exp '-' exp` }], // lowest priority ); ```

Contribute

All issues and pull requests are highly welcomed.

CHANGELOG