[RFC] Fault-tolerant parsing

asterite commented 1 year ago

Extracted from #53

Right now, autocompletion and other features (like the hierarchy tree shown at the top of the file) don't work when the source is almost, but not quite, valid Crystal code. This usually happens because the user is in the middle of typing a new, still incomplete, expression.

Some of these cases are already handled by Crystalline. For example:

hello = 1
hello.|

Autocompletion works fine here because, I think, when the completion happens right after a dot, the dot and the rest of the line are cleared so that the code has a better chance of parsing. Like this:

# No errors anymore
hello = 1
hello
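
A minimal sketch of that clearing step, to make the idea concrete. It is not Crystalline's actual code: the helper name `clear_after_dot` and the choice of a character-index cursor are assumptions made for illustration.

```crystal
# Hypothetical helper (not Crystalline's implementation): drop the dot right
# before the cursor, plus everything up to the end of that line.
def clear_after_dot(source : String, cursor : Int32) : String
  # Only act when the character right before the cursor is a dot.
  return source unless cursor > 0 && source[cursor - 1]? == '.'

  # End of the cursor's line, or end of the source if no newline follows.
  line_end = source.index('\n', cursor) || source.size

  # Keep the text before the dot and the text from the line end onwards.
  source[0, cursor - 1] + source[line_end..]
end

source = "hello = 1\nhello."
puts clear_after_dot(source, source.size)
```

With the cursor at the very end of the source, this prints the two lines of the "no errors" version above.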

Instead of clearing the line, we could maybe transform the code to this:

hello = 1
hello.itself

Or maybe

hello = 1
hello.`the empty string`

that is, parse it to a call where the method name is empty, if the cursor happens to be right after the dot and nothing else is coming (or a newline is coming, etc.)
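
As a text-level approximation of that idea, the source could be patched before parsing. The sketch below is only an illustration: `fill_empty_call` and its `placeholder` parameter are hypothetical names, and a real fix would more likely live inside the parser than in a string rewrite.

```crystal
# Hypothetical helper: if the cursor sits right after a dot and nothing but
# whitespace follows on that line, insert a placeholder method name so the
# text reads as an ordinary call.
def fill_empty_call(source : String, cursor : Int32, placeholder = "itself") : String
  return source unless cursor > 0 && source[cursor - 1]? == '.'

  # Text between the cursor and the end of the cursor's line.
  line_end = source.index('\n', cursor) || source.size
  rest_of_line = source[cursor...line_end]
  return source unless rest_of_line.blank?

  source[0, cursor] + placeholder + source[cursor..]
end

source = "hello = 1\nhello."
puts fill_empty_call(source, source.size)
```

Using `itself` as the placeholder keeps the receiver's type unchanged, which is the information autocompletion needs at that position.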

Here's another example:

hello = 1

if hello.|

In this case autocompletion doesn't work: the "remove everything after the dot" trick doesn't help because the code is missing an end. One idea here is to first rewrite the code to this:

hello = 1

if hello.|; end

That is, we add a ; end when appropriate to make it parse (combined with the previous fix for the "dot empty" sequence, it will parse correctly).
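
A rough sketch of the "; end" rewrite, under loud assumptions: the `OPENERS` list is incomplete, the balance check is a crude token count, and `close_dangling_block` with its 0-based `cursor_line` argument is a hypothetical helper, not existing Crystalline code.

```crystal
# Keywords that open a block needing a matching `end`.
# Assumption: this list is illustrative and far from complete.
OPENERS = %w(if unless while until case begin)

# Hypothetical helper: append "; end" to the cursor's line when that line
# opens a block and the source has fewer `end` tokens than block openers.
def close_dangling_block(source : String, cursor_line : Int32) : String
  lines = source.lines
  line = lines[cursor_line]?
  return source unless line

  first_word = line.split.first?
  return source unless first_word && OPENERS.includes?(first_word)

  # Crude balance check based on whitespace-separated tokens.
  opened = lines.count { |l| OPENERS.includes?(l.split.first? || "") }
  closed = source.split.count("end")
  return source if closed >= opened

  lines[cursor_line] = line + "; end"
  lines.join("\n")
end

puts close_dangling_block("hello = 1\n\nif hello.", 2)
```

This only patches the text. The result still needs the "dot empty" handling before it has a chance of parsing.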

Some of these ideas are also expressed here: https://bugs.ruby-lang.org/issues/19013

I think this will be mostly about heuristics, finding cases that break and trying to fix them, so it's important to have tests for this to make sure we don't introduce regressions.

If it's fine with you I can start working on this. My idea is to first have a way to add the missing ends or curly braces based only on the indentation of the code. Then have a fault-tolerant parser, or a cursor-aware parser, that produces a valid AST even when something is missing (like in the first example here).
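
To illustrate the indentation-based step, here is a minimal sketch that only handles missing ends (not curly braces). Everything in it is an assumption for illustration: the keyword lists are incomplete, `do` blocks are ignored, and `add_missing_ends` is a hypothetical name.

```crystal
# Assumption: incomplete keyword lists, good enough for a sketch.
BLOCK_OPENERS = %w(if unless while until case begin def class module struct)
BLOCK_MIDDLES = %w(else elsif when in rescue ensure)

# Hypothetical helper: insert an `end` wherever the indentation shows that a
# block was left open.
def add_missing_ends(source : String) : String
  output = [] of String
  open_indents = [] of Int32 # indentation of each block that is still open

  source.each_line do |line|
    if line.blank?
      output << line
      next
    end

    indent = line.size - line.lstrip.size
    words = line.split
    first_word = words.first

    # A line at or to the left of an open block's indentation ends that block.
    while top = open_indents.last?
      break if indent > top
      if indent == top
        if first_word == "end"
          open_indents.pop # this `end` closes the block itself
          break
        end
        break if BLOCK_MIDDLES.includes?(first_word) # e.g. `else` keeps it open
      end
      output << (" " * top) + "end"
      open_indents.pop
    end

    output << line
    # One-liners such as `if x; end` are already closed.
    if BLOCK_OPENERS.includes?(first_word) && words.last != "end"
      open_indents << indent
    end
  end

  # Close whatever is still open at the end of the source.
  while top = open_indents.pop?
    output << (" " * top) + "end"
  end

  output.join("\n")
end

puts add_missing_ends("def foo\n  if hello\n    puts 1\nend")
```

In this example the inner `if` gets an `end` at its own indentation before the existing `end` closes `def foo`. A heuristic like this will misfire on unusually indented code, which is one more reason to back it with tests.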

elbywan commented 1 year ago

> If it's fine with you I can start working on this.

I think there is no one better suited for this 😉.

The current regexes are kind of a hack I put in place as a proof of concept; they were never really meant to be supported in the long run.

asterite commented 1 year ago

By the way, soon after I decided to start working on this, Advent of Code started, so I'm spending a bit of time on that. Then there's work, family, etc., so I don't think I'll have much time to work on this in December, but it's still on my radar :-)

elbywan commented 1 year ago

No worries! I'm in the same boat 🚢
