cwbaker / lalr

LALR(1) parser for C++
MIT License

Tracing syntax errors #2

Closed (anselme16 closed this issue 5 years ago)

anselme16 commented 5 years ago

Hello,

I'm having trouble understanding how I am supposed to catch syntax errors while writing my grammar. Your documentation says:

Syntax errors during parsing are handled by backtracking until the error symbol can be accepted

but I didn't manage to use this feature. I don't understand how the error symbol must be declared, or how to connect it to the expressions to catch errors.

Could you please provide an example of syntax error handling?

P.S. Also, syntax errors when parsing grammars are often reported as "missing dot_start symbol", which isn't very helpful. It took me some time to understand that it was due to a missing semicolon.

cwbaker commented 5 years ago

Hi,

Error recovery is similar to Yacc/Bison error recovery.

List error on the right-hand side of productions that you want to handle errors for. When the parser encounters a syntax error it pops its most recently parsed symbols (backtracks) until it finds a symbol that has an error production (this symbol will be for whatever was being parsed when the syntax error occurred, e.g. expression, statement, block, etc.). Then the parser shifts the imaginary error symbol and continues parsing that error production. Attach a handler to error productions to handle them by outputting error messages and/or marking your input as invalid.
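As a rough illustration of the backtracking described above, here is a toy C++ sketch. It is not lalr's implementation or API: `StackEntry`, `has_error_production`, and `recover` are hypothetical names, and a real LALR parser keeps states on its stack rather than flagged symbols.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one entry on the parse stack (not a lalr type).
struct StackEntry {
    std::string symbol;         // symbol that was most recently parsed
    bool has_error_production;  // assumption: true if `error` may follow this entry
};

// Pops entries until one that can be followed by `error` is on top, then
// shifts the imaginary error symbol. Returns false if the stack is exhausted,
// i.e. no enclosing production handles errors.
bool recover(std::vector<StackEntry>& stack) {
    while (!stack.empty() && !stack.back().has_error_production) {
        stack.pop_back();  // backtrack over the most recently parsed symbol
    }
    if (stack.empty()) {
        return false;
    }
    stack.push_back(StackEntry{"error", false});  // shift the error symbol
    return true;
}
```

In this sketch, a stack of block, statement, expression, '+' where only statement is flagged would be cut back to block, statement and then have error pushed on top, after which parsing would continue with the error production.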

That's the theory. I've just tried to put something together quickly and found some bugs. I haven't used this feature much.

I'll put together an example and fix the problems but it will take me a few days to get it all done.

Thanks, Charles.

anselme16 commented 5 years ago

Thanks for your quick answer!

The bugs might be the reason why I didn't manage to make it work. I'm happy to know you're working on it.

Good luck on the bug fixes. Anselme.

cwbaker commented 5 years ago

Hi Anselme,

Please see the following now on master:

I've only done limited testing of the error handling so it wouldn't surprise me if there are more problems with it. Please let me know how it goes either way.

Thanks, Charles

anselme16 commented 5 years ago

Thanks for the fix! I really like your library and its minimalistic approach.

I've been playing with the error handling, and it works as expected now :)

One little issue with error handling though: when I catch an error, I can't find the line number of the error. I've tried nodes[0].line() but it always returns 0.

Am I supposed to track the line number myself? Or am I missing something in the library?

cwbaker commented 5 years ago

No worries! Thanks for trying out the library and the feedback.

The line number wasn't propagated for non-terminals. If you get the latest from master it is now. You should be able to read the line number of the start of the error from nodes[0].line() now.

The line number from nodes[0].line() is the line number at the start of the production that generated the syntax error. I think that's most likely to be the correct line number.
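To make the propagation concrete, here is a minimal sketch of how a non-terminal could inherit its line number from the first symbol of its production. `Node` and `make_nonterminal` are hypothetical names for illustration only, and the fallback for empty productions is an assumption; none of this is taken from lalr's source.

```cpp
#include <string>
#include <vector>

// Hypothetical parse node (not lalr's node type).
struct Node {
    std::string symbol;
    int line;  // line on which this symbol starts
};

// Builds the node for a reduced production. The non-terminal starts where its
// first child starts, so it takes that child's line number.
Node make_nonterminal(const std::string& symbol,
                      const std::vector<Node>& children,
                      int current_line) {
    // Assumption: an empty production has no children, so fall back to the
    // line the lexer is currently on.
    int line = children.empty() ? current_line : children.front().line;
    return Node{symbol, line};
}
```

With this scheme, a statement reduced from children starting on lines 3, 3, and 4 reports line 3, which matches the idea of reporting the line at the start of the production.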

anselme16 commented 5 years ago

Thanks very much! It works like a charm!

I can close the issue then :)