dlang / project-ideas

Collection of impactful projects in the D ecosystem

Improve D error messages #82

Closed maxhaton closed 4 months ago

maxhaton commented 3 years ago

Description

D's error messages are good enough to get by, but they are fairly uninspiring compared to what is available in other languages with similar aims (C++, Rust, etc.).

A short example

Let's compare the error messages due to this nonsensical function in D, and then when transliterated into forms other compilers understand.

int square(int num)
{
    return num * num + "chimp"; 
    //This exact operation was chosen so C++ would also give an error 
    //e.g. num + 0.0 is relatively kosher in the eyes of even clang
}

D says: <source>(3): Error: incompatible types for `(num * num) + ("chimp")`: `int` and `string`

Rust says:

error[E0277]: cannot add `&str` to `i32`
 --> <source>:3:15
  |
3 |     num * num + "chimp";
  |               ^ no implementation for `i32 + &str`
  |
  = help: the trait `Add<&str>` is not implemented for `i32`

GCC says:

<source>: In function 'int square(int)':
<source>:3:22: error: invalid conversion from 'const char*' to 'int' [-fpermissive]
    3 |     return num * num + "chimp";
      |            ~~~~~~~~~~^~~~~~~~~
      |                      |
      |                      const char*

From these examples we can identify three things that can be improved:

  1. D should provide an annotated quotation of the offending code.
  2. D should attempt to be a little more prosaic in the text of its messages. That is, we find that we can't do x because of y, then we proceed to print y but not x.
  3. D should get into the habit (more on this further down) of providing information in a structured manner. This extends beyond aesthetics - to take an example directly from programming, good code doesn't need indentation because it's pretty but rather because it communicates at an almost subconscious level what the flow of the code does.

These problems get worse as the code gets deeper - i.e. it's easy for template error messages like

<source>(3): Error: incompatible types for `(num * num) + ("chimp")`: `int` and `string`
<source>(6): Error: template instance `example.square!int` error instantiating

to get lost in a soup of error message output.

In this case, the compiler needs a mechanism to represent an Error (message) in the abstract as it goes through the compiler. Currently, the basic currency the frontend deals with is just the string:

nothrow void error(ref const Loc loc, const(char)* format, ...);

This is clearly not good enough in 2021. There are areas of the compiler which effectively build their own formatting code (like template constraints) - while greatly appreciated, these changes are not an effective model for the future.

What are rough milestones of this project?

Introduce a compiler flag to turn all new features on or off

Introducing changes in this area will likely mean updating a very large number of dmd tests - this can be done incrementally if the new functionality can be turned on and off.

Decide on a simple data structure to handle these error messages while they are being constructed

And an API. This doesn't have to be complicated, just general enough to be passed around safely and supporting having information added to it (notes, references to the docs etc.) in a structured manner.
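As a rough illustration of what such a data structure and API could look like, here is a minimal sketch. Every name in it (`Severity`, `SourceLoc`, `Diagnostic`, `addNote`, `render`, `docLink`) is hypothetical and made up for this example; none of them are taken from dmd, and `SourceLoc` is only a stand-in for dmd's `Loc`.

```d
import std.format : format;
import std.stdio : writeln;

/// Hypothetical severity levels, not dmd's actual classification.
enum Severity { error, warning }

/// Hypothetical stand-in for dmd's `Loc`.
struct SourceLoc
{
    string file;
    uint line;
}

/// A diagnostic kept as data until it is finally rendered.
struct Diagnostic
{
    Severity severity;
    SourceLoc loc;
    string message;
    string[] notes;   // extra context added while the error travels
    string docLink;   // optional reference to the documentation

    /// Attach one more note to the diagnostic.
    void addNote(string text)
    {
        notes ~= text;
    }

    /// Render in the familiar `file(line): Error: message` shape.
    string render() const
    {
        string label = severity == Severity.error ? "Error" : "Warning";
        string text = format("%s(%s): %s: %s", loc.file, loc.line, label, message);
        foreach (note; notes)
            text ~= "\n    note: " ~ note;
        if (docLink.length)
            text ~= "\n    see: " ~ docLink;
        return text;
    }
}

void main()
{
    auto diag = Diagnostic(Severity.error, SourceLoc("example.d", 3),
        "incompatible types for `(num * num) + (\"chimp\")`: `int` and `string`");
    diag.addNote("`+` is not defined between `int` and `string`");
    writeln(diag.render());
}
```

The point of the sketch is only that the message stays structured until the last moment, so notes and doc references can be attached by whichever part of the compiler knows about them.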

Implement the source quotations (wiggles and carets)

When the previous step is done, this feature can be implemented. The existing error message implementation can be rewritten to silently forward to nu-error-messages.

This should be fairly generic code, so it can be reused in less common paths like having two quotations in one error.
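To make the "wiggles and carets" idea concrete, here is a small sketch of a generic quotation renderer in the GCC style. The function name `quoteSource` and its 0-based column parameters are assumptions for this example only; dmd's real location tracking may not expose ranges in this form.

```d
import std.array : replicate;
import std.stdio : writeln;

/// Hypothetical helper: quote one source line and underline the range
/// [startCol, endCol) with tildes, putting a caret at `caretCol`.
/// All columns are 0-based byte offsets into `sourceLine`.
string quoteSource(string sourceLine, size_t startCol, size_t caretCol, size_t endCol)
{
    assert(startCol <= caretCol && caretCol < endCol && endCol <= sourceLine.length);

    string underline = replicate(" ", startCol)      // pad up to the range
        ~ replicate("~", caretCol - startCol)        // wiggles before the caret
        ~ "^"                                        // the caret itself
        ~ replicate("~", endCol - caretCol - 1);     // wiggles after the caret
    return sourceLine ~ "\n" ~ underline;
}

void main()
{
    // Underline `num * num + "chimp"` and point the caret at the `+`.
    writeln(quoteSource(`return num * num + "chimp";`, 7, 17, 26));
}
```

Because the function only takes a line and three columns, the same code could be called twice for an error that needs two quotations.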

Rewrite existing logic to understand these new error messages

For example, template instantiations should be able to effortlessly collect the errors that they cause - i.e. so they can be presented in a more useful manner.

An arbitrary target: No dmd code should use explicit formatting for errors (e.g. \n) eventually.
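One way to picture errors being collected by the template instantiation that caused them is a small tree, where the layout is decided by the renderer rather than by explicit `\n` formatting at each call site. This is only a sketch under that assumption; `DiagNode` and `renderTree` are hypothetical names, not dmd code.

```d
import std.array : replicate;
import std.stdio : writeln;

/// Hypothetical tree node: one message plus the errors nested under it.
struct DiagNode
{
    string message;
    DiagNode[] children;
}

/// Render the tree, indenting each nested level by two spaces.
string renderTree(const DiagNode node, size_t depth = 0)
{
    string text = replicate("  ", depth) ~ node.message;
    foreach (child; node.children)
        text ~= "\n" ~ renderTree(child, depth + 1);
    return text;
}

void main()
{
    // The instantiation owns the error raised inside it.
    auto root = DiagNode("template instance `example.square!int` error instantiating",
        [DiagNode("incompatible types for `(num * num) + (\"chimp\")`: `int` and `string`")]);
    writeln(renderTree(root));
}
```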

Think about the feasibility of adding notes and tips

If we collect some statistics on the most common error messages (this may be somewhat challenging logistically, but we'll see), either directly or by looking at the forums, we can add some helpful notes for the programmer or simply a reference to an explanation on dlang.org.

These later steps are more for laying down a framework that can easily be added to than full completion.

How does this project help the D community?

More informative error messages serve two main purposes: the first is to make explicit what the compiler considers to be an error and where, and the second is to aid newcomers to the D programming language by pointing them in the right direction as to how to fix their code.

Recommended skills

What can students expect to get out of doing this project?

Rating

Medium

Project Type

Core development

Point of Contact

@maxhaton @RazvanN7

Geod24 commented 3 years ago

I would say that this is a rather subjective task. I, for one, prefer DMD's conciseness over what GCC and Rust produce for this specific message. Some other messages need to be improved, but I think this isn't the best example.

RazvanN7 commented 3 years ago

@Geod24 I think that it would be useful to implement a more advanced means of passing errors between different contexts. For example, when you compile with gagged errors, it would be extremely useful to have a list of errors that were issued and take specific actions depending on what failed. Currently, that is impossible to do with the archaic machinery that is implemented today.
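A minimal sketch of that idea, assuming errors carry a machine-readable kind: while gagged, errors are stored in a list instead of being printed, and the caller can branch on what failed. `ErrorKind`, `GaggedError` and `DiagnosticCollector` are hypothetical names for this example and do not describe how dmd gags errors today.

```d
import std.stdio : stderr, writeln;

/// Hypothetical error kinds; assumed for illustration only.
enum ErrorKind { typeMismatch, undefinedIdentifier, templateInstantiation }

/// One error recorded while output was gagged.
struct GaggedError
{
    ErrorKind kind;
    string message;
}

/// Collects errors while gagged instead of printing them.
struct DiagnosticCollector
{
    bool gagged;
    GaggedError[] collected;

    /// Print the error immediately, or store it if gagged.
    void report(ErrorKind kind, string message)
    {
        if (gagged)
            collected ~= GaggedError(kind, message);
        else
            stderr.writeln("Error: ", message);
    }

    /// True if any collected error is of the given kind.
    bool has(ErrorKind kind) const
    {
        foreach (e; collected)
            if (e.kind == kind)
                return true;
        return false;
    }
}

void main()
{
    DiagnosticCollector collector;
    collector.gagged = true; // e.g. while speculatively compiling something
    collector.report(ErrorKind.typeMismatch, "incompatible types");

    // The caller can now take a specific action depending on what failed.
    if (collector.has(ErrorKind.typeMismatch))
        writeln("type mismatch while gagged; errors collected: ", collector.collected.length);
}
```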

RazvanN7 commented 3 years ago

@maxhaton I think that the project should simply be to update the error reporting mechanism and later on, on a case-by-case basis, see if we could improve errors (if there's any time left).

maxhaton commented 3 years ago

> I would say that this is a rather subjective task. I, for one, prefer DMD's conciseness over what GCC and Rust produce for this specific message. Some other messages need to be improved, but I think this isn't the best example.

It's more about infrastructure than these specific examples. For example, the error messages the memory safety analyses (specifically OB) give are diabolically awful at their worst, because they basically get printed out in a non-deterministic and flat manner, which not only requires jumping around for the user but also discourages structure because this has to be implemented by the compiler writer.

D's type system isn't complicated enough to need a huge amount of context for those kinds of errors, but there are some cases involving either gagging or (semantically) nested errors that are fairly unhelpful to both the user and the code in the compiler.

On the subject of dmd in particular, this has the potential to hopefully get us into the habit of getting functionality under one roof inside the codebase rather than the death-by-a-thousand-special-cases that can be found in dmd.

And anyone commenting on this thread probably has enough D experience to have actually seen the code in dmd that gives the messages, let alone actually need them, so we probably aren't the best judges of the aesthetics.

maxhaton commented 3 years ago

> @maxhaton I think that the project should simply be to update the error reporting mechanism and later on, on a case-by-case basis, see if we could improve errors (if there's any time left).

The key part is the actual code for building the errors and outputting them, so the actual text isn't a huge problem. However, we get some of this for free by making it easy to pass information down the compiler, which (when an error is reached) encourages a more descriptive error while also avoiding boilerplate everywhere.

Updating the test cases alone will probably take a while, so there shouldn't be any (say) % coverage target.

burner commented 3 years ago

yes please, this should also allow:

chances commented 2 years ago

> These problems get worse as the code gets deeper - i.e. it's easy for template error messages like
>
> <source>(3): Error: incompatible types for `(num * num) + ("chimp")`: `int` and `string`
> <source>(6): Error: template instance `example.square!int` error instantiating
>
> to get lost in a soup of error message output.

I for one would very much appreciate improvements in this area, especially with regard to debugging complex templates.

For example, this template of mine bends over backwards to emit helpful diagnostics when it's misused. See also these templates:

ichordev commented 2 years ago

> I would say that this is a rather subjective task. I, for one, prefer DMD's conciseness over what GCC and Rust produce for this specific message. Some other messages need to be improved, but I think this isn't the best example.

I second this. Improved error messages would be great, as long as they're not as tall as GCC's, for instance. GCC's error messages are very hard to read when there's more than 2 or 3.

ntrel commented 1 year ago

@maxhaton

> D should provide an annotated quotation of the offending code.

dmd -verrors=context already does this:

../old/sample/enuminit.d(5): Error: cannot implicitly convert expression `E.a` of type `E` to `void*`
    void* a = E.init;  // L5
              ^

maxhaton commented 1 year ago

> @maxhaton
>
> > D should provide an annotated quotation of the offending code.
>
> dmd -verrors=context already does this:
>
> ../old/sample/enuminit.d(5): Error: cannot implicitly convert expression `E.a` of type `E` to `void*`
>     void* a = E.init;  // L5
>               ^

I know. "It's under a flag" doesn't really cut it. Also it has been tried (unsuccessfully) to make it on by default.

mdparker commented 4 months ago

This was added to the root folder, so we can close the issue.