thahmann / macleod

Ontology development environment for Common Logic (CL)

Error recovery using PLY #27

Open evanmrsampson opened 6 years ago

evanmrsampson commented 6 years ago

Currently, we are using the method discussed in https://github.com/thahmann/macleod/issues/9 for p_error. This method utilizes lookahead tokens to illustrate the current state of the parse.

Unfortunately, this method doesn't allow us to do any form of error recovery, as the parser has already "looked ahead" to the EOF. It seems like we can either have this stack reporting algorithm or some error recovery algorithm, but not both.

I personally lean towards trying to do error recovery, but I don't want to make changes without some agreement. Thoughts?

thahmann commented 6 years ago

What would the error recovery actually do differently from the p_error() method? I thought p_error() tries to narrow down what is causing the error. One way of making it more intelligent would be to add rules there, based on the lookahead results, that suggest what may be wrong (missing symbol, extra symbol, wrong symbol, wrong arity, etc.).
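
To make the "rules based on the lookahead results" idea concrete, here is a minimal sketch in plain Python that does not depend on PLY. It assumes the lookahead has already been collected as a list of token type names; `diagnose_parens` and the `RPAREN` and `NAME` token names are hypothetical, chosen for illustration, and it only covers the missing/extra parenthesis cases.

```python
def diagnose_parens(token_types):
    """Return hints about unbalanced parentheses in a list of token type names.

    Assumption: token types are plain strings and the parenthesis tokens are
    named "LPAREN" and "RPAREN" (hypothetical names for this sketch).
    """
    hints = []
    open_positions = []  # indices of LPAREN tokens that are not yet closed
    for index, kind in enumerate(token_types):
        if kind == "LPAREN":
            open_positions.append(index)
        elif kind == "RPAREN":
            if open_positions:
                open_positions.pop()
            else:
                # A closing parenthesis with nothing left to close.
                hints.append(f"extra RPAREN at token {index}")
    # Anything still open at the end never got its closing parenthesis.
    for index in open_positions:
        hints.append(f"missing RPAREN for LPAREN at token {index}")
    return hints


print(diagnose_parens(["LPAREN", "LPAREN", "NAME", "RPAREN"]))
```

Wrong-symbol and wrong-arity checks would need knowledge of the grammar and are left out of this sketch.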

Fxhnd commented 6 years ago

The p_error() function I sketched out before only pops enough tokens off the token stack to balance the parentheses before returning the ERROR token to the parsing stack. Parsing stops because we don't do anything in p_error() to clear that error on the stack. If we wanted full resynchronization, we'd keep the p_error() (always keep the p_error()), use the current technique to clean up the parser stack, call parser.errok(), and feed the parser the next token.
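
As a rough illustration of the balancing step, the sketch below works on a plain list of token type names rather than on a live PLY parser. `skip_to_balance`, `depth_at_error`, and the `RPAREN`, `AND`, `NAME`, and `BAD` token names are hypothetical. In an actual p_error(), the same loop would presumably pull tokens with parser.token() until the parentheses balance, then call parser.errok() and return the next token so parsing resumes.

```python
def skip_to_balance(token_types, error_index, depth_at_error):
    """Return the index of the first token after the broken sentence.

    Assumptions for this sketch: token types are plain strings, and
    depth_at_error is the number of LPAREN tokens still open just before
    the token at error_index.
    """
    depth = depth_at_error
    index = error_index
    # Discard tokens until every open parenthesis has been closed,
    # or until the input runs out.
    while index < len(token_types) and depth > 0:
        kind = token_types[index]
        if kind == "LPAREN":
            depth += 1
        elif kind == "RPAREN":
            depth -= 1
        index += 1
    return index


tokens = ["LPAREN", "AND", "BAD", "LPAREN", "NAME", "RPAREN", "RPAREN",
          "LPAREN", "NAME", "RPAREN"]
resume_at = skip_to_balance(tokens, error_index=2, depth_at_error=1)
print(resume_at)
```

If the returned index equals the length of the token list, the input ended before the parentheses balanced and there is nothing left to resume on.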

The p_error() method and adding extensions to the productions to account for the special ERROR token are not mutually exclusive. Good explanation here. We could even leave the error token on the stack and just create extra error'd production rules (using the special ERROR token, not trying to directly match incorrect stuff). It would take more work, since the different Logical.* classes will raise exceptions if you try to create them with bad types, but that'd also be doable.

evanmrsampson commented 6 years ago

When I implemented the method you sketched out, I don't think I fully understood what the goal was. Right now it reads way too far ahead! I will look into getting it to work right tomorrow.

Right now it prints the whole stack, leaving the parser stuck on the EOF. Additionally, I've been writing some rules using the error token to help identify the missing/expected token. I'll update this issue in the morning when I've looked it over again.

Thanks for the speedy replies, I really appreciate it!!

Fxhnd commented 6 years ago

Absolutely! I think the code I posted in #9 at the end pretty much works except for two error conditions:

  1. If there is a failure in anything up to and including the first axiom, there will be a NoneType exception since there isn't a previously parsed axiom. In this case we can just backtrack to the last encountered LPAREN and use that for balancing.
  2. If we somehow wind up with an error token on the parse stack that gets reduced to a Logical before p_error() is called, it'll make funny things happen. I don't know if that's possible with our grammar, but it'd be worthwhile putting a check in for.
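
The fallback in the first condition could look roughly like the sketch below. It is plain Python over a list of token type names, not the code from #9; `find_pivot` and `last_axiom_end` are hypothetical names, and the `NAME`, `AND`, `BAD`, and `RPAREN` token names are assumptions.

```python
def find_pivot(token_types, error_index, last_axiom_end=None):
    """Pick the token index from which parenthesis balancing should start.

    Assumption: last_axiom_end is the index of the final token of the most
    recently parsed axiom, or None if no axiom has been parsed yet.
    """
    if last_axiom_end is not None:
        # Normal case: start right after the last complete axiom.
        return last_axiom_end + 1
    # No previous axiom: walk backwards from the error to the nearest LPAREN.
    start = min(error_index, len(token_types) - 1)
    for index in range(start, -1, -1):
        if token_types[index] == "LPAREN":
            return index
    # No LPAREN seen at all, so fall back to the start of the input.
    return 0


print(find_pivot(["LPAREN", "AND", "BAD"], error_index=2))
print(find_pivot(["LPAREN", "NAME", "RPAREN", "LPAREN", "BAD"], 4, last_axiom_end=2))
```

Checking for `None` explicitly is what avoids the NoneType exception when the error occurs before any axiom has been parsed.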

If you have any questions about the pivot/lookahead just let me know.