microsoft / ts-parsec

Writing a custom parser is a fairly common need. Although parser combinators already exist in other languages, TypeScript provides a powerful and well-structured foundation for building one. Error handling and ambiguity resolution are common weaknesses of parser combinators, but they are important features of ts-parsec. Additionally, ts-parsec provides a very easy-to-use programming interface that can help people build programming-language-scale parsers in just a few hours. This technology has already been used in Microsoft/react-native-tscodegen.
353 stars, 18 forks

Simplify defining mutually dependent parsers #19

Closed mister-what closed 3 years ago

mister-what commented 4 years ago

Motivation

Using the rule parser alone does not enforce enough structure or abstraction when maintaining parsers that contain more than a handful of mutually dependent definitions. This PR tries to give consumers a tool that makes combining and composing parsers at scale less of a headache. The concept introduced in this PR makes it possible to structure mutually dependent parsers in a modular and coherent way while keeping the interface surface as small as possible. It is functionally equivalent to the rule parser concept:

(From https://github.com/mister-what/ts-parsec/blob/bcacd974a89b5edf8b5705fb070b7be829a68331/packages/tspc-test/src/TestParserModule.ts#L65-L101)

const parserModule = makeParserModule(
    {
        /*
         * TERM
         *  = NUMBER
         *  = ('+' | '-') TERM
         *  = '(' EXP ')'
         */
        TERM(m: { TERM: Parser<TokenKind, number>; EXP: Parser<TokenKind, number> }): Parser<TokenKind, number> {
            return alt(
                apply(tok(TokenKind.Number), applyNumber),
                apply(seq(alt(str('+'), str('-')), m.TERM), applyUnary),
                kmid(str('('), m.EXP, str(')'))
            );
        },
        /*
         * FACTOR
         *  = TERM
         *  = FACTOR ('*' | '/') TERM
         */
        FACTOR(m: { TERM: Parser<TokenKind, number> }): Parser<TokenKind, number> {
            return lrec_sc(m.TERM, seq(alt(str('*'), str('/')), m.TERM), applyBinary);
        },
        /*
         * EXP
         *  = FACTOR
         *  = EXP ('+' | '-') FACTOR
         */
        EXP(m: { FACTOR: Parser<TokenKind, number> }): Parser<TokenKind, number> {
            return lrec_sc(m.FACTOR, seq(alt(str('+'), str('-')), m.FACTOR), applyBinary);
        }
    }
);

function evaluate(expr: string): number {
    return expectSingleResult(expectEOF(parserModule.EXP.parse(lexer.parse(expr))));
}
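
In the example above, each definition receives the finished module `m` and can therefore refer to parsers that are defined later in the same object. One way such wiring could work is to hand every definition a forwarding parser that builds the real one on first use. The following is only a minimal sketch of that idea and not the implementation from this PR: `MiniParser`, `MiniModule`, `MiniDefs` and `makeMiniModule` are hypothetical names, and the parser type is a simplified stand-in for the ts-parsec `Parser` interface that works directly on a string and always yields a number.

```typescript
// Simplified stand-in for a parser (assumption: NOT the ts-parsec Parser type).
// It reads `input` starting at `pos` and returns the value plus the next
// position, or undefined when it does not match.
type MiniParser = (input: string, pos: number) => { value: number; next: number } | undefined;
type MiniModule<K extends string> = Record<K, MiniParser>;
type MiniDefs<K extends string> = Record<K, (m: MiniModule<K>) => MiniParser>;

// Hypothetical helper: turns a set of definitions into a module of parsers.
function makeMiniModule<K extends string>(defs: MiniDefs<K>): MiniModule<K> {
    const built = new Map<string, MiniParser>();
    const mod = {} as MiniModule<K>;
    for (const key of Object.keys(defs) as K[]) {
        // Forwarding parser: the real parser is built on first use, so a
        // definition may reference parsers that have not been built yet.
        const forward: MiniParser = (input: string, pos: number) => {
            let parser = built.get(key);
            if (parser === undefined) {
                parser = defs[key](mod);
                built.set(key, parser);
            }
            return parser(input, pos);
        };
        mod[key] = forward;
    }
    return mod;
}

// Two mutually dependent definitions over single-digit numbers.
const calc = makeMiniModule<'TERM' | 'EXP'>({
    // TERM = DIGIT | '(' EXP ')'
    TERM: (m) => (input, pos) => {
        const ch = input.charAt(pos);
        if (ch >= '0' && ch <= '9') {
            return { value: Number(ch), next: pos + 1 };
        }
        if (ch !== '(') {
            return undefined;
        }
        const inner = m.EXP(input, pos + 1);
        return inner !== undefined && input.charAt(inner.next) === ')'
            ? { value: inner.value, next: inner.next + 1 }
            : undefined;
    },
    // EXP = TERM ('+' TERM)*
    EXP: (m) => (input, pos) => {
        let left = m.TERM(input, pos);
        while (left !== undefined && input.charAt(left.next) === '+') {
            const right = m.TERM(input, left.next + 1);
            if (right === undefined) {
                return undefined;
            }
            left = { value: left.value + right.value, next: right.next };
        }
        return left;
    },
});

console.log(calc.EXP('1+(2+3)', 0)); // { value: 6, next: 7 }
```

Because the forwarding parsers exist before any definition runs, the order of the keys does not matter: `TERM` can use `m.EXP` even though `EXP` appears later in the object.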

Changes

ghost commented 4 years ago

CLA assistant check
All CLA requirements met.