Hi, I am new to nearley. I have used yacc (byacc and bison) with C in the past, and have written my own LL(1) generator for table-driven parsers for use with Node.js. I have been looking through the nearley code for a place to add a common action for all rules, so that a grammar with no explicitly added actions generates a useful concrete syntax tree. As far as I can tell, the current default already produces such a tree, just without the non-terminal names in the nodes.
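A minimal sketch of such a default action, assuming a compiled grammar is a plain object of the form `{ ParserStart, ParserRules: [{ name, symbols, postprocess }] }` and that a rule's `postprocess` receives the array of matched children as its first argument. `withDefaultActions` and the toy grammar are hypothetical names used only for illustration; they are not part of nearley.

```javascript
// Sketch only: withDefaultActions is a hypothetical helper, not nearley API.
// Assumed grammar shape: { ParserStart, ParserRules: [{ name, symbols, postprocess }] }.
function withDefaultActions(grammar) {
  const rules = grammar.ParserRules.map((rule) => {
    // Leave rules that already have an explicit action untouched.
    if (rule.postprocess) return rule;
    return {
      ...rule,
      // Default action: wrap the children in a node tagged with the rule name.
      postprocess: (children) => ({ type: rule.name, children }),
    };
  });
  return { ...grammar, ParserRules: rules };
}

// Hand-written toy grammar in the assumed shape.
const toy = {
  ParserStart: "greeting",
  ParserRules: [
    { name: "greeting", symbols: [{ literal: "hi" }, "name"] },
    { name: "name", symbols: [{ literal: "bob" }], postprocess: (d) => d[0] },
  ],
};

const tagged = withDefaultActions(toy);
const node = tagged.ParserRules[0].postprocess(["hi", "bob"]);
console.log(JSON.stringify(node));
// prints {"type":"greeting","children":["hi","bob"]}
```

Doing this as a pass over the rule list keeps explicit actions intact and does not require touching the parser itself.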
While doing this, I stumbled on a TODO comment in lib/nearley.js, line 271 or so:
// TODO what if start rule is nullable?
I admit I have no idea of the context in which this comment occurs, and there may be some aspect of it that I have failed to understand. If so, I apologize. But reading it, and assuming it means what I think it means, I would naively think that the answer is trivial:
If the start rule is nullable, then the only input this special case applies to is the empty string. Any nullable production for the start symbol could then go in the parse tree, or the easy way out would be to return an empty array.
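For reference, whether the start symbol is nullable can be decided with the usual fixed-point computation. The sketch below assumes rules shaped like `{ name, symbols }`, where a string symbol is a non-terminal and anything else is a terminal; `nullableSet` is a hypothetical helper, not something taken from nearley.

```javascript
// Sketch only: nullableSet is a hypothetical helper, not nearley API.
// Assumption: string symbols are non-terminals, everything else is a terminal.
function nullableSet(rules) {
  const nullable = new Set();
  let changed = true;
  // Repeat until no new nullable non-terminal is discovered.
  while (changed) {
    changed = false;
    for (const rule of rules) {
      if (nullable.has(rule.name)) continue;
      // A rule derives the empty string when every symbol is a nullable
      // non-terminal (vacuously true for an empty right-hand side).
      const allNullable = rule.symbols.every(
        (s) => typeof s === "string" && nullable.has(s)
      );
      if (allNullable) {
        nullable.add(rule.name);
        changed = true;
      }
    }
  }
  return nullable;
}

const rules = [
  { name: "S", symbols: ["A", "B"] },
  { name: "A", symbols: [] },
  { name: "A", symbols: [{ literal: "a" }] },
  { name: "B", symbols: ["A"] },
];
const nullable = nullableSet(rules);
console.log(nullable.has("S")); // prints true
```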
If it is somehow still a problem that the start rule S is nullable, then a simple workaround would be to hack the grammar: add an EndOfInput symbol and a new start rule S' -> S EndOfInput. The EndOfInput symbol then of course also has to be appended to the input. I think that is a very common trick with LL parsers as well?
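A rough sketch of that grammar hack, again assuming the `{ ParserStart, ParserRules }` shape. `augmentWithEnd` and the `END` sentinel are hypothetical; in particular, using a NUL character as EndOfInput is just an assumption for illustration, and any token that cannot occur in real input would do.

```javascript
// Sketch only: augmentWithEnd and END are hypothetical, not nearley API.
// Assumption: a one-character sentinel that never occurs in real input.
const END = { literal: "\u0000" };

function augmentWithEnd(grammar, endToken) {
  // New start symbol S'. Assumes no existing rule already uses this name.
  const newStart = grammar.ParserStart + "'";
  return {
    ...grammar,
    ParserStart: newStart,
    ParserRules: [
      // S' -> S EndOfInput; the action drops the sentinel and keeps S's result.
      {
        name: newStart,
        symbols: [grammar.ParserStart, endToken],
        postprocess: (d) => d[0],
      },
      ...grammar.ParserRules,
    ],
  };
}

// Toy grammar whose start symbol S is nullable.
const original = {
  ParserStart: "S",
  ParserRules: [
    { name: "S", symbols: [] },
    { name: "S", symbols: [{ literal: "a" }, "S"] },
  ],
};

const augmented = augmentWithEnd(original, END);
// The same sentinel has to be appended to whatever is fed to the parser.
const input = "" + END.literal;
console.log(augmented.ParserStart, input.length);
```

Because S' always contains the EndOfInput terminal, the new start rule is never nullable, even when S is.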