dhall-lang / dhall-haskell

Maintainable configuration files
https://dhall-lang.org/
BSD 3-Clause "New" or "Revised" License
918 stars 214 forks source link

Poor parse error message with parenthesis-enclosed function application #1592

Open SiriusStarr opened 4 years ago

SiriusStarr commented 4 years ago

If a parse error occurs within parentheses in function application, instead of an error at the point of the parse error, you instead get unexpected '(' at the open parenthesis. For example:

> dhall <<< '(3f)'
Error: Invalid input

(stdin):1:3:
1 | (3f)
  |   ^
unexpected 'f'

vs

> dhall <<< 'a (3f)'
Error: Invalid input

(stdin):1:3:
1 | a (3f)
  |   ^
unexpected '('

This does not itself seem particularly problematic, until the expression within the parentheses is complex and the parse error difficult to see at a glance. For example:

1 | Optional/fold Text optional Text (λ(text : Text) → "\latexcommand{${text}}") ""
  |                                  ^
unexpected '('

You can stare at the above for quite a long time before you notice the real problem, namely that \l is not a valid escape sequence, and it should be "\\latexcommand{${text}}". (Don't ask me how I know this...)

Gabriella439 commented 4 years ago

As far as I can tell, the only way to fix this is to make two changes:

I don't think there are any shortcuts left at this point for fixing this particular issue

Profpatsch commented 4 years ago

This is a particularly aggravating issue when you have a long file, example I had this typo:

screenshot

on line 1238, and it caused this error message:

> dhall --file ./log2.dhall
dhall: 
Error: Invalid input

./log2.dhall:1217:11:
     |
1217 |           [ tweag
     |           ^
unexpected '['
expecting ',', ->, :, ], operator, or whitespace

That’s two screen heights above the actual error at the beginning of the nested list!

sjakobi commented 4 years ago

For the record: https://github.com/dhall-lang/dhall-haskell/pull/1740 was the first step to implement the plan in https://github.com/dhall-lang/dhall-haskell/issues/1592#issuecomment-562965613.

sjakobi commented 4 years ago

I recently had a run-in with this issue. I think I would have torn my hair out, if it was long enough to do so! ;)

@Gabriel439 I see that you've mentioned generating the parser with happy above. I've done small edits on GHC's happy code, but I don't have a good understanding of it. Would we generate the parser from dhall.abnf, or a modified version of it?

german1608 commented 4 years ago

Would we generate the parser from dhall.abnf, or a modified version of it?

Happy doesn't directly work with that specific format so we need to port it to happy syntax. The only problem that we might encounter by porting dhall.abnf to happy syntax is that happy works much better on a right-recursive grammar and dhall.abnf spec is mostly left-recursive (indirectly).

Gabriella439 commented 4 years ago

@sjakobi: Here's an example of what an alex lexer and happy parser might look like, from an older project of mine:

However, I later switched to Earley for the reasons outlined in this commit description:

https://github.com/Gabriel439/Haskell-Morte-Library/commit/f6820953765e5d518469fbd47834215394626068

sjakobi commented 4 years ago

@Gabriel439 Thanks! With Earley we'd still have a hand-written parser, right? It can't be generated from dhall.abnf as some other implementations seem to do?

Gabriella439 commented 4 years ago

@sjakobi: Yeah, it would still be hand-written, but it would no longer require backtracking (which is the main reason for poor error messages) or tricks to avoid pathological performance. The main downside of Earley is that it might have slower constant factors because it doesn't support fast bulk operations like megaparsec does

Profpatsch commented 4 years ago

Morte … I just now realized that the Planescape naming scheme is older than Dhall :)

Gabriella439 commented 4 years ago

@Profpatsch: Yep! That's correct :slightly_smiling_face:

sjakobi commented 3 years ago

@ocharles just shared his Earley-based parser in https://hub.darcs.net/ocharles/dhalli/browse/lib/Dhalli/Syntax.hs with me. Might be a good resource when we finally make the switch! :)

TheMatten commented 3 years ago

BTW, Happy let's you export multiple parsers these days, and it can report expected tokens (output may need some tidying, but it is doable).

sjakobi commented 2 years ago

By chance I've come across headed-megaparsec, which seems to have the purpose of providing better error messages (see https://hackage.haskell.org/package/headed-megaparsec-0.2.0.2/docs/HeadedMegaparsec.html#t:HeadedParsec).

I'm wondering whether it might be useful for resolving this issue without abandoning the current parser code entirely. Currently I don't properly understand what it does and how it works though.

Profpatsch commented 2 years ago

It seems the parser is backtracking too much (I’m bitten by this a lot when writing any “serious” dhall expressions).

Can the earley parser handle that correctly? If so, what’s hindering us from using @ocharles version?

Gabriella439 commented 2 years ago

It's probably possible to switch to an Earley parser. It just requires a lot of work. We can't use @ocharles's version exactly as written because there are too many subtle differences, but we could consult it for writing the equivalent Dhall parser.