haskell / error-messages

Type error message overhaul #541

Open noughtmare opened 1 year ago

noughtmare commented 1 year ago

I got inspired to write this after watching this video about teaching Haskell to kids.

I am now convinced Haskell's error messages could be improved greatly and I have very specific suggestions.

But let's first look at the example error message:

Hangman.hs:46:32: error:
    * Couldn't match type '[Char]' with 'Char'
      Expected type: Char
        Actual type: String
    * In the first argument of 'makeGuess', namely 'letterInput'
      In the first argument of 'gameLoop', namely
        '(makeGuess letterInput gs)'
      In the expression: gameLoop (makeGuess letterInput gs)
   |
46 |       else gameLoop (makeGuess letterInput gs)
   |                                ^^^^^^^^^^^

The speaker notes these problems:

And it is compared with this roughly equivalent Elm error message:

-- TYPE MISMATCH --------------------------------------------------- Hangman.elm

The 1st argument to `makeGuess` is not what I expect:

6| main = Html.text (makeGuess "Five")
                               ^^^^^^
This argument is a string of type:

    String

But `makeGuess` needs the 1st argument to be:

    Char

The speaker encourages GHC contributors to make Haskell's error messages look more like Elm's.

I think it is worth looking at this example in more detail, to work out what exactly makes Elm's message so much easier to read, and which parts would be harder to change in Haskell because of fundamental differences in language design.

First I'll list a few obstacles:

  1. Due to currying we cannot easily know how many arguments the programmer intends a function to take.
  2. Type synonyms (and more advanced things like type families) can obfuscate types in error messages.
  3. Specifically, the type synonym String = [Char] is confusing, but impossible to change because it would break too much code.
  4. Polymorphism causes long-distance type conflicts.
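
The first three obstacles can be made concrete with a small sketch. It assumes that makeGuess has the type Char -> String -> String; the function body and the names guessA, asList and asString are made up for illustration and are not taken from the Hangman program.

```haskell
-- Assumed signature; the body is a stand-in, not the real Hangman code.
makeGuess :: Char -> String -> String
makeGuess c guessed = c : guessed

-- Obstacle 1: because of currying, supplying only one argument is perfectly
-- valid, so the compiler cannot tell whether an argument was forgotten.
guessA :: String -> String
guessA = makeGuess 'a'

-- Obstacles 2 and 3: String is only a synonym for [Char], so both spellings
-- denote the same type and either one may show up in a message.
asList :: [Char]
asList = "hi"

asString :: String
asString = ['h', 'i']
```

Both asList and asString type check even though their signatures are spelled differently, which is why a message can mention '[Char]' when the programmer wrote String.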

Here are some general principles I can extract from Elm's approach:

  1. Use as little text as possible
  2. Use as little formatting (indentation, bullet lists) as possible.
  3. Make space
  4. Make the subject clear before explaining the details of the problem
  5. Write full sentences
  6. Write special cases for common patterns
  7. Start with the actual code the programmer has written (first actual then expected)
  8. Move the file name to the right to make it easy to identify when scrolling through all messages

Some low hanging fruit:

  1. Remove the "In the ... of" lines
  2. Remove the bullets and indentation
  3. Move the location information up
  4. Shorten the filename:line:column: error: line to just the type of error and the filename
  5. Swap the order of actual and expected
  6. Make space

Then we can already get something like this:

-- Type mismatch ---------------------------------------------------- Hangman.hs 

46 |       else gameLoop (makeGuess letterInput gs)
                                    ^^^^^^^^^^^
Couldn't match type '[Char]' with 'Char'

Actual type: 

    String

Expected type: 

    Char

I think this can still be improved further, but this is a good start.
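
To show that this layout is mechanical to produce, here is a rough sketch of a renderer. Everything in it is an assumption made for illustration: the Mismatch record, its field names, renderMismatch and the 80-column header width do not exist in GHC. It also prints the actual type as given instead of expanding synonyms the way the '[Char]' line above does.

```haskell
-- Hypothetical record; none of these names come from GHC.
data Mismatch = Mismatch
  { mmFile     :: FilePath
  , mmLine     :: Int
  , mmSource   :: String  -- the offending source line, verbatim
  , mmColumn   :: Int     -- 1-based column where the offending span starts
  , mmLength   :: Int     -- length of the offending span
  , mmActual   :: String
  , mmExpected :: String
  }

-- Render in the proposed layout: a header with the file name on the right,
-- then the code, then the actual type before the expected type.
renderMismatch :: Mismatch -> String
renderMismatch m = unlines
  [ header
  , ""
  , gutter ++ mmSource m
  , replicate (length gutter + mmColumn m - 1) ' ' ++ replicate (mmLength m) '^'
  , "Couldn't match type '" ++ mmActual m ++ "' with '" ++ mmExpected m ++ "'"
  , ""
  , "Actual type:"
  , ""
  , "    " ++ mmActual m
  , ""
  , "Expected type:"
  , ""
  , "    " ++ mmExpected m
  ]
  where
    title  = "-- Type mismatch "
    width  = 80  -- assumed terminal width
    dashes = replicate (width - length title - length (mmFile m) - 1) '-'
    header = title ++ dashes ++ " " ++ mmFile m
    gutter = show (mmLine m) ++ " | "
```

Feeding it line 46, column 32 and a span length of 11 for Hangman.hs places the carets under letterInput, as in the mock-up above.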

Do you think this is an improvement? Do you agree with my principles for better error messages?

goldfirere commented 1 year ago

Thanks for starting this conversation!

I'm a huge proponent of being guided by high-level principles. I think it's good to do so here, but I'd encourage us to not cleave to that idea too closely. In the end, we're trying to improve our communication to humans, and humans can be squishy.

Specific responses:

I hope these points (especially the last one) don't slow you down! I think revamping these messages can have a drastic impact on the learnability of Haskell. Thanks again for starting the process!

scarf005 commented 1 year ago

"Write full sentences" and "write as little as possible" are in conflict. I don't see a great principle in this space -- we just need to use good judgement.

Maybe we could add a verbosity flag to control this behavior. For example:

Without the --verbose flag:

-- Type mismatch ---------------------------------------------------- Hangman.hs 

46 |       else gameLoop (makeGuess letterInput gs)
                                    ^^^^^^^^^^^
Couldn't match type 'String' with 'Char'

Actual type:

    String

Expected type:

    Char

To get a more detailed error message, run the same command with the --verbose flag.
For example: "runghc --verbose Hangman.hs"

With the --verbose flag:

-- Type mismatch ---------------------------------------------------- Hangman.hs 

46 |       else gameLoop (makeGuess letterInput gs)
                          ~~~~~~~~~ ^^^^^^^^^^^

error:
    'makeGuess' has the following type signature:

    makeGuess :: Char -> String -> String
        -- Defined in ‘./Hangman.hs:1’

    It expects type 'Char' as its 1st argument.

    However, 'letterInput' has the following type signature:

    letterInput :: String
        -- Type inferred from ‘./Hangman.hs:2’

    Therefore, 'makeGuess' cannot take 'letterInput' as its 1st argument,
    because it expected 'Char' but got 'String'.

hint:
    Perhaps you mistakenly changed the type of 'letterInput' to 'String' instead of 'Char'.

    'Char' is a single character, and defined using single quotes:

        thisIsAChar :: Char
        thisIsAChar = 'a'

    'String' is a list of 'Char', and can be defined in the following ways:

    With double quotes:

        thisIsAString :: String
        thisIsAString = "a"

    With a list of 'Char's:

        thisTooIsAString :: String
        thisTooIsAString = ['c']

    This is because 'String' is an alias for a list of 'Char's.
    In Haskell, a list of 'Type' is written as [Type].
    As 'String' is a list of 'Char', it is defined as:

        type String :: Type
        type String = [Char]
                -- Defined in ‘GHC.Base’
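
The switch between the two outputs could be modelled roughly as follows. This is only a sketch under assumptions: Verbosity, Diagnostic, render and section are hypothetical names chosen to mirror the proposed --verbose flag, and nothing here reflects how GHC structures its diagnostics.

```haskell
-- Hypothetical types modelling the proposed flag.
data Verbosity = Terse | Verbose
  deriving (Eq, Show)

data Diagnostic = Diagnostic
  { summary :: [String]  -- always shown
  , details :: [String]  -- longer explanation, shown only when asked for
  , hints   :: [String]  -- beginner-oriented hints, shown only when asked for
  }

-- The short form ends with a pointer to the flag; the long form appends
-- the "error:" and "hint:" sections instead.
render :: Verbosity -> Diagnostic -> String
render v d = unlines (summary d ++ extra)
  where
    extra = case v of
      Terse   -> [ ""
                 , "To get a more detailed error message, run the same command with the --verbose flag."
                 ]
      Verbose -> section "error:" (details d) ++ section "hint:" (hints d)

-- Emit a labelled, indented section, or nothing when the section is empty.
section :: String -> [String] -> [String]
section _     []   = []
section label body = "" : label : map ("    " ++) body
```

Keeping the summary identical in both modes means the short message is always a prefix of the long one, so switching the flag on never changes what was already shown.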

noughtmare commented 1 year ago

@scarf005 My initial reaction was that such extended messages seem more suitable for the error message index. One disadvantage, however, is that the index only knows the error codes, while the compiler knows, for example, that the String type is involved in the error.

I do still think this is not a responsibility of GHC itself, though. Maybe the best place to put those kinds of contextual hints is in an HLS plugin? I think that is a feasible approach. HLS could even suggest a code action to apply the suggestion.

scarf005 commented 1 year ago

> I do still think this is not a responsibility of GHC itself, though. Maybe the best place to put those kinds of contextual hints is in a HLS plugin?

I agree; however, it would be great to be able to get contextual hints on the CLI, as with cargo check or cargo clippy.

Type checking can be done manually with haskell-language-server-wrapper typecheck test.hs, but the output is rather verbose and installing HLS is not easy for beginners.

noughtmare commented 1 year ago

That's a good point. Still, I'd be tempted to say the best solution to that problem is to improve how the HLS typecheck command works.

blamario commented 1 year ago

One unmentioned opportunity would be to shorten the first three lines

    * Couldn't match type '[Char]' with 'Char'
      Expected type: Char
        Actual type: String

to

    * Couldn't match expected 'Char' with the actual type 'String' = '[Char]'

This seems to be the only line reported in simple cases, when there are no type synonyms or similar complications.

I'd appreciate having the actual/expected qualifiers in that first line even if the following two lines are kept. The reason is that in more complex type errors the first line will give you only the mismatching part of the two types, while the following two lines will dump the two full types which may be much longer. It's not obvious which is which in the first line.

Here's a relatively benign example I just produced:

    • Couldn't match type ‘TH.Type’ with ‘FunDep’
      Expected: ExtAST.Type Language Language f f -> FunDep
        Actual: ExtAST.Type Language Language f f -> TH.Type

Sometimes I get types spanning multiple lines and it's not at all trivial to visually locate the mismatch. Ideally the mismatched type parts would be marked with an underline like we already get for the expression:

    • Couldn't match type ‘TH.Type’ with ‘FunDep’
      Expected: ExtAST.Type Language Language f f -> FunDep
                                                     ^^^^^^
        Actual: ExtAST.Type Language Language f f -> TH.Type
                                                     ^^^^^^^
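
A naive way to find where to put those markers is sketched below. It compares the two printed types word by word and underlines everything after the shared prefix. The names commonPrefixWidth and underlineMismatch are made up, the sketch assumes tokens are separated by single spaces, and a real implementation would compare the type structure instead of the printed text.

```haskell
-- Width, in characters, of the longest common prefix of two token lists,
-- counting one separating space after each shared token.
commonPrefixWidth :: [String] -> [String] -> Int
commonPrefixWidth (x:xs) (y:ys)
  | x == y = length x + 1 + commonPrefixWidth xs ys
commonPrefixWidth _ _ = 0

-- Print both types with a caret line under the part that differs.
underlineMismatch :: String -> String -> [String]
underlineMismatch expected actual =
  [ "Expected: " ++ expected
  , marker expected
  , "  Actual: " ++ actual
  , marker actual
  ]
  where
    labelWidth = length "Expected: "
    shared     = commonPrefixWidth (words expected) (words actual)
    marker ty  = replicate (labelWidth + shared) ' '
              ++ replicate (length ty - shared) '^'
```

For the two types above, the shared prefix ends just after the arrow, so only FunDep and TH.Type are underlined. A mismatch in the middle of a type would be over-marked, because everything after the first differing word gets carets.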

googleson78 commented 1 year ago

Would be awesome to have "big expression with underlined relevant mismatching subexpression"