fmease / lushui

The reference compiler of the Lushui programming language
Apache License 2.0

Improve parsing errors of expressions and patterns #76

Open fmease opened 3 years ago

fmease commented 3 years ago

Currently, some parts of the parser use a technique I call "reflecting", which is essentially a form of backtracking.

It is used, for example, when parsing pi type literals and (de-)applications, that is, complex terms wrapped inside a Kleene star. Wherever we allow a complex grammar term to repeat, we currently use a loop or recursion (depending on the associativity) and try to parse as many terms as possible. The biggest issue with this approach is the crude error messages it emits. They result from the fact that it conflates the end of a repetition with a syntax error inside the repeated term. In other words, when we encounter an unexpected token, we don't know which of two things happened: either the repetition already ended earlier, and we should discard our attempt at parsing the "last" repetition and hand control back to the superparser, or the user made an error inside the complex structure, and we should emit a fatal error.
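The conflation described above can be reproduced with a toy parser. The following is only a minimal sketch, not the actual Lushui parser: `Token`, `Error`, `Parser`, `reflect`, `parse_item`, `parse_items` and `parse_top_level` are all hypothetical names, and the grammar (a bare word or `( word : word )`, repeated) is invented for illustration.

```rust
// Toy tokens; every name in this sketch is hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Token { Word, OpenParen, CloseParen, Colon, At }

#[derive(Debug, PartialEq)]
struct Error { position: usize, message: &'static str }

struct Parser { tokens: Vec<Token>, index: usize }

impl Parser {
    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.index).copied()
    }

    fn expect(&mut self, token: Token, message: &'static str) -> Result<(), Error> {
        if self.peek() == Some(token) {
            self.index += 1;
            Ok(())
        } else {
            Err(Error { position: self.index, message })
        }
    }

    // The "complex term": a bare word or `( word : word )`.
    fn parse_item(&mut self) -> Result<(), Error> {
        match self.peek() {
            Some(Token::Word) => { self.index += 1; Ok(()) }
            Some(Token::OpenParen) => {
                self.index += 1;
                self.expect(Token::Word, "expected a name")?;
                self.expect(Token::Colon, "expected `:`")?;
                self.expect(Token::Word, "expected a type")?;
                self.expect(Token::CloseParen, "expected `)`")
            }
            _ => Err(Error { position: self.index, message: "expected an item" }),
        }
    }

    // Backtracking: on failure, rewind to where the attempt started.
    fn reflect<T>(
        &mut self,
        parse: impl FnOnce(&mut Self) -> Result<T, Error>,
    ) -> Result<T, Error> {
        let saved = self.index;
        let result = parse(self);
        if result.is_err() { self.index = saved; }
        result
    }

    // Kleene star: any failure is treated as "the repetition ended",
    // so the error from inside the item is thrown away.
    fn parse_items(&mut self) -> usize {
        let mut count = 0;
        while self.reflect(Self::parse_item).is_ok() { count += 1; }
        count
    }
}

// The superparser: a run of items followed by the end of input.
fn parse_top_level(tokens: Vec<Token>) -> Result<usize, Error> {
    let mut parser = Parser { tokens, index: 0 };
    let count = parser.parse_items();
    match parser.peek() {
        None => Ok(count),
        Some(_) => Err(Error { position: parser.index, message: "expected end of input" }),
    }
}
```

For the token sequence `Word ( Word : @ )`, the real mistake is the `@` at index 4, but the sketch reports "expected end of input" at index 1, the opening parenthesis. This has the same shape as the "found `(`, but expected line break" message in the second example of this issue.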

Consider the following examples:

Input (pi type literal with an error inside the domain):

data X: (A: @0) -> Type

Current, confusing error message:

error[E010]: found `:`, but expected `)`
 --> tests/parsing/plain-pi-as-argument.lushui:3:11
  |
3 | data X: (A: @0) -> Type =
  |           ^ unexpected token
  |

Input (application where the named argument is malformed, followed by an illegal thin arrow):

main = function (A = @0) -> Type

Confusing error message:

error[E010]: found `(`, but expected line break
 --> tests/parsing/plain-pi-as-argument.lushui:6:17
  |
6 | main = function (A = @0) -> Type
  |                 ^ unexpected token
  |

In the past, we fixed a lot of bad error messages either by using match over successive tests (using reflection and Result::or_else) or by passing a list of delimiters to the subparser. The latter is now used when parsing parameter lists and produces the best parsing errors in this compiler to date.

We should try passing delimiters to parse_application_like_or_lower and parse_pi_type_literal_or_lower and see where this gets us.
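To make the delimiter idea concrete, here is a rough sketch on a toy grammar. It does not reflect the real signatures of parse_application_like_or_lower or parse_pi_type_literal_or_lower; `Token`, `Error`, `Parser`, `parse_item`, `parse_items` and `parse_line` are hypothetical names, and the sketch assumes the caller knows which tokens may legitimately follow the repetition.

```rust
// Toy tokens; every name in this sketch is hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Token { Word, OpenParen, CloseParen, Colon, At, LineBreak }

#[derive(Debug, PartialEq)]
struct Error { position: usize, message: &'static str }

struct Parser { tokens: Vec<Token>, index: usize }

impl Parser {
    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.index).copied()
    }

    fn expect(&mut self, token: Token, message: &'static str) -> Result<(), Error> {
        if self.peek() == Some(token) {
            self.index += 1;
            Ok(())
        } else {
            Err(Error { position: self.index, message })
        }
    }

    // The repeated term: a bare word or `( word : word )`.
    fn parse_item(&mut self) -> Result<(), Error> {
        match self.peek() {
            Some(Token::Word) => { self.index += 1; Ok(()) }
            Some(Token::OpenParen) => {
                self.index += 1;
                self.expect(Token::Word, "expected a name")?;
                self.expect(Token::Colon, "expected `:`")?;
                self.expect(Token::Word, "expected a type")?;
                self.expect(Token::CloseParen, "expected `)`")
            }
            _ => Err(Error { position: self.index, message: "expected an item" }),
        }
    }

    // Repetition without backtracking: only a delimiter (or the end of
    // the input) ends the loop; any other failure is a real error and
    // is propagated from where it occurred.
    fn parse_items(&mut self, delimiters: &[Token]) -> Result<usize, Error> {
        let mut count = 0;
        loop {
            match self.peek() {
                None => return Ok(count),
                Some(token) if delimiters.contains(&token) => return Ok(count),
                Some(_) => {
                    self.parse_item()?;
                    count += 1;
                }
            }
        }
    }
}

// A superparser that knows its repetition is followed by a line break.
fn parse_line(tokens: Vec<Token>) -> Result<usize, Error> {
    let mut parser = Parser { tokens, index: 0 };
    parser.parse_items(&[Token::LineBreak])
}
```

With this shape, the token sequence `Word ( Word : @ ) LineBreak` yields "expected a type" at index 4, the position of the `@`, instead of an error at the opening parenthesis. The trade-off is that every caller has to supply the correct set of follow tokens for its context.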

Another example:

x=lengthy-space-filler (case 0 of
    \n => n
<-)

Current output:

error[E010]: found `(`, but expected terminator
 --> bad-error-message-parsing-application.lushui:1:24
  |
1 | x=lengthy-space-filler (case 0 of
  |                        ^ unexpected token
  |