nim-lang / RFCs

A repository for your Nim proposals.
136 stars 23 forks source link

Better error messages #322

Closed haxscramper closed 3 years ago

haxscramper commented 3 years ago

This RFC itself does not provide a singular actionable goal - I decided to create this issue to have a more structured list of ideas and pain points - in the last several months similar discussion happened on IRC, when someone would complain about errors, then some discussion that ends up on "current errors are suboptimal", "structured errors are good".

RFC mostly serves as a prompt for more structured discussion related to error messages, and formulation of more actionable list of requirements (potentially in other RFC).

Type mismatch errors are horrible

Current error outputs just dumps all possible overloads, without determining their validity in given context. When working on some high-level code, seeing 40+ overloads for add, when you just tried to append to let variable is insanely annoying.

By default, some mismatches are hidden, but particular logic is not entirely clear - it is possible to have one out of four overloads hidden in one case, but for mismatch on + you can get several screens worth of text.

Possible solution would be to introduce score-based system for determining validity of each overload. Possible rules might include:

Any other score rules - I just listed more obvious ones.

If all overload alternatives have very low score, enable additional heuristics for error suggestion.

Errors are not machine-readable

Currently only file(line, column) message start can be reliably retrieved from compilation error - everything else is constructed based on string interpolation.

Providing json output for compilation errors would make it easier to build additional heuristics on top of compilation errors (for example hooking up type mismatch to documentation search to provide import suggestions), simplify IDE integration, and mitigate pointless debate for changing error formatting - if someone wants to have all-colorful arrows everywhere, they will have access to all information needed.

Not categorized, not explained, based on string interpolation

Having something like rust compiler error index would be helpful for beginners, makes it less necessary to explain particular details for errors in the manual/tutorial (or worse - leaving users to figure out cause and solution themselves).

Context is not shown, no smart suggestions

Context, original source code

When compilation error happens, it is shows generated code

import sequtils

proc acceptsInt(a: int) = discard

acceptsInt(@[1,2,3].mapIt(it + 1))

Current message

Hint: used config file '/playground/nim/config/nim.cfg' [Conf]
Hint: used config file '/playground/nim/config/config.nims' [Conf]
......
/usercode/in.nim(5, 11) Error: type mismatch: got <seq[int]>
but expected one of:
proc acceptsInt(a: int)
  first type mismatch at position: 1
  required type for a: int
  but expression '
type
  OutType`gensym0 = typeof(
    block:
      var it: typeof(items(@[1, 2, 3]), typeOfIter)
      it + 1, typeOfProc)
block:
  let :tmp_4915652 = @[1, 2, 3]
  template s2_4915653(): untyped =
    :tmp_4915652

  var i`gensym0 = 0
  var result`gensym0 = newSeq(len(:tmp_4915652))
  for it in items(:tmp_4915652):
    result`gensym0[i`gensym0] = it + 1
    i`gensym0 += 1
  result`gensym0' is of type: seq[int]

expression: acceptsInt:
  type
    OutType`gensym0 = typeof(
      block:
        var it: typeof(items(@[1, 2, 3]), typeOfIter)
        it + 1, typeOfProc)
  block:
    let :tmp_4915652 = @[1, 2, 3]
    template s2_4915653(): untyped =
      :tmp_4915652

    var i`gensym0 = 0
    var result`gensym0 = newSeq(len(:tmp_4915652))
    for it in items(:tmp_4915652):
      result`gensym0[i`gensym0] = it + 1
      i`gensym0 += 1
    result`gensym0

Expected message (at least)

/usercode/in.nim(5, 11) Error: type mismatch: got <seq[int]>
but expected one of:
proc acceptsInt(a: int)
  first type mismatch at position: 1
  required type for a: int
  but expression '@[1,2,3].mapIt(it + 1)'

This is relatively mild case, but it doesn't take too much to make it completely unreadable. For example if I had something like add @[1,2,3].mapIt(it + 1) (for example), total compilation error would have 123 lines total - approximately 60x times more than code I wrote.

NOTE: if compiled with --verbosity:2 error location is show - together with 600 lines of extra code and all auto-generated garbage. Using --hints:off removes most of the noise together with original source code. Related - https://forum.nim-lang.org/t/932

Typos, MCS misuses, beginner-unfriendly errors

Beginner-unfriendly

  1. Method call syntax

    • MCS-related type mismatch: https://forum.nim-lang.org/t/7053 - special case of undefined routine - if there is a variable with the same name, user needs to know how things got interpreted. Or of there is no parenthesis around arguments, which is also indicator of possible errors.
    • confusing procs and missing fields: Nim procedure names can be easily considered an argument for a function - all it takes is write procedure instead of procedure() when passing parameter to a function. For example if you write kind notin {nnkStrLit} and don't have variable kind defined procedure kind(n: NimNode): NimNodeKind will be considered for resolution, leading to quite misleading error message.

    If you have something like ident.name 12, but ident is not defined, you will see all overloads for name in the world, and compiler will be telling you that you can't add use name with arguments ident and 12, because name is actually a proc. For example .kind causes this quite often, due to overload on NimNode

    If proc name() exists, it is not possible to get adeqate error message for missing name variable, as compiler will try to match all overloads with procvar instead.

    This can be fixed by scored mismatches - if none of the overloads make any sense (for example not a single one accepts this type for any parameter), enable additional pass and check for any identifiers that might've been misspelled.

  2. Overly verbose compilation by default

    By default compilation outputs all sorts if unnecesary information that is not particularly useful, especially for regular tasks - number of compiled C lines, compilation time, configuration file, verbose hints for compilation of C code and so on. It might be useful in some cases, but making compilation less verbose by default (and subsequently not spending each user's time on figuring out that you need -hints:off, --verbosity:0)

  3. assigning to missing field

    file: f01_assign.nim

    type
      Val* = object
        fld: string

    file: f01_main.nim

    import f01_assign
    
    var hello: Val
    hello.fld = "123"

    error output

    Error: attempting to call undeclared routine: 'fld='

    While it is certainly possible to declare `fld=` proc to overload field assign, this is not commonly used and has nothing to do with particular issue at hand - field is not exported.

  4. Special-casing some errors

    Some errors can be special-cased - missing hash()/`==` implementation, to avoid sutiations like this - yes, it might be obvious for someone that using custom types with hashtable require defining hash and equality functions, but not for everyone, especially beginners who come from dynamically typed languages.

    Hash it the most notable example - others include, (but of course not limited to)

    1. Use of function with var argument on immutable object
    2. Using addr on let variable
    3. Option[T] with missing get()

      If Option[T] does not have particular field, but T itself does qualify, provide possible way to fixing code ("Consider adding get() ?"). In general - some kind of heuristics for searching all possible solutions and providing user with common error types.

  5. Fix suggestions

    Providing fix suggestion:

    • typo corrections
    • mutability annotations
    • MCS-related error fixes

Side effect annotation - good idea, bad ergonomics

Compile-time side effect detection is an extremely useful feature for those who cares about side-effect-free code, but currently it's usability is affected by lack of any additional information about what kind of side effect was introduced, and where it was introduced.

More informative text for some messages, additional heuristics

User-defined error messages

While it is possible to create greate DSLs in nim - customized error messages are certainly lacking. There is an std/macros/error(), but it could be improved.

Some pain points for writing custom DSLs (from my experience)

Related

Side note

Some suggestions are not fully serious, or I don't have a particular opinion that can be expressed in more actionable form, but I decided to list them:

haxscramper commented 3 years ago

Closed and split into three separate RFCs per @Clyybber 's suggestion to make more focused discussion possible.