kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Match non-terminal other than the default (first) #556

Open averynortonsmith opened 3 years ago

averynortonsmith commented 3 years ago

From the docs:

By default, nearley attempts to parse the first nonterminal defined in the grammar.

Is there a way to parse starting at a non-terminal other than the default? The phrasing suggests that there is but I don't see an explanation on how to accomplish this in the docs or FAQ.

node 12.18.2 npx 6.14.5 nearley 2.19.7

KillyMXI commented 3 years ago

I think I also ran into the same issue. Depending on a certain condition I wanted to use a different entry point and return a result object of a different type.

The Grammar constructor seems to accept a start argument for a rule name, but it didn't work for me. It also doesn't seem to be documented in the types package.

I ended up with a workaround like this:

import { Parser, Grammar } from 'nearley';
import compiledRules from './grammar';

const compiledRulesAlt = { ...compiledRules, ParserStart: 'altMain' };

const compiledRules1 = (condition) ? compiledRules : compiledRulesAlt;
const parser = new Parser(Grammar.fromCompiled(compiledRules1));

// I expected the following to work, but it didn't:
// const parserStart = (condition) ? 'main' : 'altMain';
// const parser = new Parser(Grammar.fromCompiled(compiledRules, parserStart));
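
The workaround above can be wrapped in a small helper. This is only a sketch: `withStart` is a hypothetical name, not part of nearley, and the object below is a hand-written stand-in for a compiled grammar, assuming the `Lexer` / `ParserRules` / `ParserStart` shape discussed in this thread.

```javascript
// Hypothetical helper (not a nearley API): returns a shallow copy of a
// compiled grammar whose ParserStart points at a different rule.
function withStart(compiled, start) {
  const known = compiled.ParserRules.some((rule) => rule.name === start);
  if (!known) {
    throw new Error(`No rule named "${start}" in the compiled grammar`);
  }
  // The rules array is shared; only the entry point differs.
  return { ...compiled, ParserStart: start };
}

// Stand-in for the object exported by a compiled .ne file (assumed shape).
const compiledRules = {
  Lexer: undefined,
  ParserRules: [
    { name: 'main', symbols: ['altMain'] },
    { name: 'altMain', symbols: [{ literal: 'x' }] },
  ],
  ParserStart: 'main',
};

const alt = withStart(compiledRules, 'altMain');
// With nearley installed, the copy would then be used as in the workaround:
// const parser = new Parser(Grammar.fromCompiled(alt));
```

Checking the rule name up front is a design choice of this sketch; it turns a mistyped entry point into an immediate error instead of a confusing parse failure later.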

srt19170 commented 2 years ago

Both Parser and Grammar.fromCompiled allow you to specify an optional start element.

Specifying a start element with Parser works only if you let Parser create the Grammar object, i.e., you can do this:

        const parser = new nearley.Parser(Lodestone, 'rhs');

but not this:

        const parser = new nearley.Parser(nearley.Grammar.fromCompiled(Lodestone), 'rhs');

Looking at the top of Parser, the code ignores start if the first input is a Grammar:

    function Parser(rules, start, options) {
        if (rules instanceof Grammar) {
            var grammar = rules;
            var options = start;
        } else {

This seems to be intentional, but I fixed it to respect start this way:

    function Parser(rules, start, options) {
        if (rules instanceof Grammar) {
            var grammar = rules;
            // var options = start;
            rules.start = start || rules.start || rules.ParserStart;
        } else {

But this fix is not backwards-compatible for anyone who calls Parser(rules, options) with a Grammar object and relies on the old behavior, because the options object would now be treated as start.
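
One possible way around the compatibility problem is to look at the type of the second argument. The sketch below is not nearley code; `resolveParserArgs` is a hypothetical helper, and it assumes that a start symbol is always a string and that options is always a plain object.

```javascript
// Hypothetical helper (not part of nearley): decides how to read the second
// and third arguments when the first argument is already a Grammar.
function resolveParserArgs(grammarStart, second, third) {
  if (typeof second === 'string') {
    // New style: Parser(grammar, start, options)
    return { start: second, options: third || {} };
  }
  // Old style: Parser(grammar, options), so the grammar keeps its own start.
  return { start: grammarStart, options: second || {} };
}

// New-style call: an explicit start symbol plus options.
const newStyle = resolveParserArgs('main', 'rhs', { keepHistory: true });

// Old-style call: the second argument is the options object.
const oldStyle = resolveParserArgs('main', { keepHistory: true });
```

Under these assumptions, existing callers of Parser(grammar, options) would keep their behavior, while Parser(grammar, 'rhs') would select a different entry point.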

Specifying start with Grammar.fromCompiled doesn't work because the compiled grammar (i.e., the .js file created from a .ne) always sets ParserStart to the first symbol in the grammar. So the test in this code:

    Grammar.fromCompiled = function(rules, start) {
        var lexer = rules.Lexer;
        if (rules.ParserStart) {
            start = rules.ParserStart;
            rules = rules.ParserRules;
        }

is always true, and start always gets reset to rules.ParserStart.

My fix:

    Grammar.fromCompiled = function(rules, start) {
        var lexer = rules.Lexer;
        start = start || rules.ParserStart;
        rules = rules.ParserRules || rules;

which essentially says that start overrules rules.ParserStart. That seems reasonable to me.
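
To see what the two patched lines do in isolation, here is a sketch that applies the same precedence to plain data. `resolveStart` is a hypothetical name used only for illustration, and the compiled object is a hand-written stand-in rather than real nearley output.

```javascript
// Mirrors the two patched lines: an explicit start wins, otherwise fall back
// to ParserStart; a bare rules array is accepted as-is.
function resolveStart(rules, start) {
  return {
    start: start || rules.ParserStart,
    rules: rules.ParserRules || rules,
  };
}

// Stand-in for a compiled grammar (assumed shape).
const compiled = {
  ParserRules: [
    { name: 'main', symbols: ['rhs'] },
    { name: 'rhs', symbols: [{ literal: 'x' }] },
  ],
  ParserStart: 'main',
};

const overridden = resolveStart(compiled, 'rhs');        // explicit start
const fallback = resolveStart(compiled);                 // no start given
const bare = resolveStart(compiled.ParserRules, 'rhs');  // rules array only
```

Here `overridden.start` is 'rhs', `fallback.start` is 'main', and `bare.rules` is the array that was passed in, so callers that never pass start see no change.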