Error messages and stacktraces #7

Open ericnormand opened 6 years ago

ericnormand commented 6 years ago

Stacktraces

The big problems

Stacktraces are a problem in Clojure. They are too long (tall), they are printed in the reverse of the order they should be printed in, and they lack any kind of formatting.

Clojure stacktraces include a lot of stack frames that are not relevant to the programmer. Most of the frames are implementation details of the language. They need filtering to get rid of lots of the irrelevant noise so that it's easier to see what code you need to look at to fix the problem.
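To make the filtering idea concrete, here is a minimal sketch. The prefix list and the names `noise-prefixes`, `noise-frame?`, and `relevant-frames` are made up for illustration; this is not how Pretty does it, and a real filter would need a configurable list.

```clojure
(require '[clojure.string :as string])

;; Hypothetical list of class-name prefixes treated as implementation noise.
(def noise-prefixes
  ["clojure.lang." "clojure.core$" "java.lang.reflect." "sun.reflect."])

(defn noise-frame?
  "True when the frame's class name starts with one of the noise prefixes."
  [^StackTraceElement frame]
  (let [class-name (.getClassName frame)]
    (boolean (some #(string/starts-with? class-name %) noise-prefixes))))

(defn relevant-frames
  "Returns the frames of throwable `t` that are not implementation noise."
  [^Throwable t]
  (remove noise-frame? (.getStackTrace t)))
```

Calling `(relevant-frames e)` on a caught exception should leave mostly frames from your own namespaces and libraries.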

Errors can trigger other errors, and the chain of causes is nicely tracked by the platform. However, the most important and informative error is the most recent, which is printed first. In a terminal, that means that you have to scroll up through many pages of stacktrace to find the most important thing. The most recent error should be printed last.
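A rough sketch of the reordering, assuming we only care about the exception class and message of each link in the chain. `cause-chain` and `chain-lines` are hypothetical helper names; only `.getCause` and `.getMessage` come from the platform.

```clojure
(defn cause-chain
  "Returns the throwable followed by its causes, in the order the platform prints them."
  [^Throwable t]
  (take-while some? (iterate #(.getCause ^Throwable %) t)))

(defn chain-lines
  "Returns one summary line per exception, reversed so that the exception
  the platform prints first comes last."
  [^Throwable t]
  (mapv (fn [^Throwable e] (str (.getName (class e)) ": " (.getMessage e)))
        (reverse (cause-chain t))))

(defn print-chain
  "Prints the chain so the last line on the terminal is the one the platform prints first."
  [^Throwable t]
  (run! println (chain-lines t)))
```

With this ordering there is nothing to scroll back for: whatever is printed last is what sits right above the prompt.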

Exception messages and stacktraces should also be formatted nicely to make them easier to read.

Pretty solves a lot of these problems and is configurable in case we need it. I would like to submit a patch to make error messages more customized, which I'll talk about below.

Enhancements

We can make stacktraces even better by adding more information that you might want to see. Two things that I often want are:

  1. Source code

The stacktrace includes information about the file and line number and column number. In theory this can be translated into a snippet of code that shows the context of the problem and highlights the area where the exception occurred. Pyro seems to do this.
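I don't know how Pyro implements this, but the basic translation from a line number to a snippet can be sketched like this. `source-snippet` is a hypothetical helper that takes the source text as a string; in practice the text would come from something like `(slurp file)`, with the file name and line taken from the stack frame.

```clojure
(require '[clojure.string :as string])

(defn source-snippet
  "Returns the lines of `source` around the 1-based `line`, each prefixed with
  its line number, with a `>` marker on the failing line. `context` is the
  number of lines to show before and after."
  [source line context]
  (let [lines (string/split-lines source)
        start (max 1 (- line context))
        end   (min (count lines) (+ line context))]
    (string/join
     "\n"
     (for [n (range start (inc end))]
       (format "%s %3d | %s" (if (= n line) ">" " ") n (nth lines (dec n)))))))

;; Example: show one line of context around line 2.
(println (source-snippet "(defn f [x]\n  (inc x))\n\n(f \"a\")" 2 1))
```

Highlighting the exact expression inside the line would additionally need column information, which this sketch ignores.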

  2. Values of local variables

The stacktrace does not include local variables. Sometimes you get an exception that states that something is not a String but you don't know what it was. That information is thrown away and it would be really nice to know. StackParam claims to add information about all locals to the stack frames using external tools that connect to the JVM agent interface. It comes with prebuilt Windows and Linux binaries and instructions for building on Mac. Can we package this up for Clojure? How do we print out the information in the stacktrace?

Error messages

Error explanations

Many error messages in Clojure do not help the programmer understand what they can do about what went wrong, or even identify the actual problem. Many are outright misleading because they expose implementation details instead of describing the mistake. For instance, derefing a number produces this error message: Long cannot be cast to Future.

user> @1
ClassCastException java.lang.Long cannot be cast to java.util.concurrent.Future  clojure.core/deref-future (core.clj:2290)

One must make the leap that Futures have something to do with derefing, and that @ means deref. In fact, Futures happen to be the last branch of a complex conditional inside of the implementation of deref. This error message does not try to help the programmer. It merely reveals the failure of the last branch.
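To illustrate what a message written for the programmer could look like, here is a sketch of a wrapper that checks the argument before delegating to `deref`. `checked-deref` is a hypothetical function, not a proposed change to `clojure.core`, and the wording of the message is only an example.

```clojure
(defn checked-deref
  "Like `deref`, but rejects values that cannot be dereferenced with a
  message that names the value and its type."
  [x]
  (if (or (instance? clojure.lang.IDeref x)
          (instance? java.util.concurrent.Future x))
    (deref x)
    (throw (ex-info (str "Cannot dereference " (pr-str x)
                         " (type: " (pr-str (class x)) "). "
                         "`@` and `deref` work on reference types such as "
                         "atoms, refs, agents, vars, delays, promises, and futures.")
                    {:value x}))))
```

The check happens before any branch of the real implementation runs, so the message describes the mistake instead of the last failed cast.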

Better error messages would do some of the following:

Hidden errors

Further, there are some errors that are not even caught. For instance, the function clojure.set/difference assumes that both its arguments are sets. It makes no checks or coercions, so if you pass a non-set, the result is undefined. Any kind of exception would be welcome, especially for beginners. The same goes for passing the wrong argument types to many functions. Try (keyword 1) some time.
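As a sketch of turning a hidden error into a caught one, this wrapper validates its arguments before calling the real function. `checked-difference` is a hypothetical name and the message format is an assumption; instrumented specs would be another way to get the same check.

```clojure
(require '[clojure.set :as set])

(defn checked-difference
  "Like `clojure.set/difference` for two arguments, but throws when either
  argument is not a set instead of returning an undefined result."
  [s1 s2]
  (doseq [[position arg] [["first" s1] ["second" s2]]]
    (when-not (set? arg)
      (throw (ex-info (str "The " position " argument to difference must be a set, "
                           "but it was " (pr-str arg) ".")
                      {:argument position :value arg}))))
  (set/difference s1 s2))
```

The exception data keeps the offending value, so a printer can show it even if the message is later reworded.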

Steps toward a solution

Spec promises to make some of this problem easier. core.specs intends to have specs for all core functions. This should make many of the hidden errors into caught errors. We should help in this effort where we can. I have reached out to Alex Miller and he seems to want help. He asked if I could make a spreadsheet with the core functions in it, organized in some way. There are a lot! You can see what I've done here. Note that these specs will sometimes enforce semantics during instrumentation that are not enforced at runtime.

Further, the error information in the spec output may be more readable than the current errors that are thrown, though that's questionable. Expound is one attempt to make the error messages more readable by formatting the error output. It's okay, but leaves a lot to be desired. One thing that is good is that you have the values of all of the arguments, which is very useful for debugging. You also know something about what they should have been but didn't live up to. What's missing is why they should have been different. And I have no hope for making that a general process--that needs human effort. But what could definitely be done in the general case is expand on the explanation. For example, if we have a spec for clojure.core/map, we can automatically generate information about each arity, like so:

(map f coll)

f: a function
coll: any collection

This can be used to great effect. Let's say someone calls (map a b), where a is 1 and b is a sequence of numbers. Here's an error message we can imagine. I'm adding comments for explanation.

There's a problem with the first argument to `map`. ;; a nice English sentence

(map  a  b)   ;; take the source code with Pyro above
      ^       ;; identify the part of the code with the problem

The first argument you provided was 1. map is expecting a function as its first argument.
;; we have the information to do this!!

Use clojure.core/map to transform a collection of something into a sequence of something else. 
;; from the [dream docstring](https://github.com/ericnormand/ultra-docstrings/blob/master/FORMAT.md): explains the purpose

Here are the expected arguments:

(map f coll)

f: a function
coll: any collection

Maybe that's too long but you get the idea.
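Here is a rough sketch of how the first lines of such a message could be derived from spec's explain data. The spec `map-args` is a simplified stand-in for whatever core.specs ends up providing, and `explain-args` and `ordinals` are hypothetical names; it assumes Clojure 1.9+ with `clojure.spec.alpha`.

```clojure
(require '[clojure.spec.alpha :as s])

;; Assumed, simplified spec for the two-argument arity of map.
(def map-args (s/cat :f ifn? :coll seqable?))

(def ordinals ["first" "second" "third"])

(defn explain-args
  "Returns nil when `args` conform to `args-spec`, otherwise a short
  English description of the first problem spec reports."
  [fn-name args-spec args]
  (when-let [problem (first (:clojure.spec.alpha/problems
                             (s/explain-data args-spec args)))]
    (if-let [idx (first (:in problem))]
      (let [position (get ordinals idx (str "#" (inc idx)))]
        (str "There's a problem with the " position " argument to `" fn-name "`.\n"
             "The " position " argument you provided was " (pr-str (:val problem)) ".\n"
             "It is expected to satisfy " (:pred problem) "."))
      ;; Missing or extra arguments carry no argument position.
      (str "The arguments to `" fn-name "` do not match: " (pr-str args)))))

(println (explain-args "map" map-args [1 [1 2 3]]))
```

Turning the predicate into "map is expecting a function" and adding the purpose line still needs hand-written text per function, which is where the docstring format comes in.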

I think we should look to Elm for inspiration here. They've done a lot of good work coming up with Error Message formats and helpful hints.

Finally, I would like to note that spec can only provide error messages for argument type errors. It won't do anything for a FileNotFoundException, for example.

I have an idea for a general, human-centered printing of error messages. It involves using Pretty (above) which prints out the message from exceptions. However, if we convert Pretty's print function to a multimethod, we can switch on the type of the Exception. That way, we have a chance to define custom printers for different exception types. We could then print something more appropriate. I'm thinking of compiler errors, etc, that are known in the Clojure world but are not going to be caught by specs.
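A minimal sketch of the multimethod idea, independent of Pretty's actual code. `explain-exception` is a hypothetical name and the messages are placeholders. Dispatching on `class` means a method registered for a superclass also covers its subclasses.

```clojure
(defmulti explain-exception
  "Returns a human-centered description of a throwable, chosen by its class."
  class)

;; Fallback: the class name and the platform message.
(defmethod explain-exception :default
  [^Throwable e]
  (str (.getName (class e)) ": " (.getMessage e)))

(defmethod explain-exception java.io.FileNotFoundException
  [^Throwable e]
  (str "Could not find or open a file: " (.getMessage e) "\n"
       "Check that the path exists and that it is relative to the directory "
       "the JVM was started in."))

(defmethod explain-exception ClassCastException
  [^Throwable e]
  (str "A value had an unexpected type: " (.getMessage e)))
```

Libraries could then add their own `defmethod` for the exception types they throw without touching the printer.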

Next steps

  1. Define a standard error message format, based on Elm's
  2. Help develop core.specs
  3. Write an Expound-like printer that turns instrumented function error messages into standard error messages
  4. Submit patch to Pretty so that it uses a multimethod
  5. Develop custom error printers
  6. Get Pretty, Pyro, and StackParam working and printing well.

alex-dixon commented 6 years ago

This week’s thread on error messages https://clojureverse.org/t/improving-error-messages-in-clojure-as-a-library/1765

bhb commented 6 years ago

Thanks for writing this up and organizing this effort! I wholeheartedly agree that this is an area where Clojure could be improved (and I'm working on doing so in a small way with Expound). I also did not know that Pretty can be configured. That's a useful direction to explore!

> Define a standard error message format, based on Elm's

If it helps, I've compiled a list of resources on error messages along with my short notes about them. Elm's error messages are fantastic, but there are also some really interesting examples from Urn, Racket, Rust, and ReasonML in the links.

> a nice English sentence

Based on my admittedly anecdotal experience of talking with a few Elm programmers, I think it's worth carefully considering the tradeoffs of verbose error messages like Elm has (especially the hints). I've heard that this verbose prose is quickly ignored in the best case, and annoying or misleading in the worst case.

I suspect we can get most of the benefit with succinct, but precise messages and by showing the exact code that is wrong.

Nonetheless, I agree detailed descriptions can help beginners. Perhaps an approach like Rust's would be best here, where beginners can look up an error code in the REPL?

I think your "Next Steps" are excellent in general, thanks for this! My only suggestion would be to consider the impact of Java-only solutions. The latest "State of Clojure" results state that "Interest surging from JavaScript programmers", and I would guess this trend will continue. Perhaps it's not possible to have across-the-board solutions for CLJ/CLJS, but I'd like to make CLJS feel first-class as much as possible.

IMHO, there are a few open areas of research that would help this effort, even if they are just experience reports or a proof of concept in a gist (i.e. not yet a library).

  1. In CLJ/CLJS, what's the best way to capture and print exceptions? I believe Maria does this with ClojureScript, and Pretty and Pyro do this with Clojure, but I haven't had time to dig in and see what is possible.

  2. The best error messages show the original source with line numbers. I believe Pyro demonstrates this is possible in Clojure, but is this possible in ClojureScript generically? Figwheel does it with files on disk. Is this possible in a REPL context where forms are evaled inline?

  3. A related investigation: is the core team accepting patches for core.specs and clojure.spec? I understand a new version of Spec is underway, so I'm not sure if it's a good time to submit patches to something like this bug, which prevents a spec pretty-printer from printing the function or macro name.

> Write an Expound-like printer that turns instrumented function error messages into standard error messages

I'm biased of course 😄, but in my experience with Expound, writing an error printer for Spec is non-trivial (at least with the current implementation of Spec). I'm happy to improve Expound and add configuration to make it more seamless with other libs if that's helpful.

Thanks again for the write up!

alex-dixon commented 6 years ago

> I think it's worth carefully considering the tradeoffs of verbose error messages like Elm has (especially the hints). I've heard that this verbose prose is quickly ignored in the best case, and annoying or misleading in the worst case. I suspect we can get most of the benefit with succinct, but precise messages and by showing the exact code that is wrong.

I agree! Words can be slower than code. A lot of spec errors expressed as predicates in Clojure are fairly friendly -- close to what the words would be and less verbose.

This approach might still be able to support multiple languages: https://github.com/timothypratley/cban

> Nonetheless, I agree detailed descriptions can help beginners. Perhaps an approach like Rust's would be best here, where beginners can look up an error code in the REPL?

I've been curious about this approach too. TypeScript has codes like TS-XXXX that are easy to Google. https://github.com/Microsoft/TypeScript/blob/v2.7.2/src/compiler/diagnosticMessages.json

Thanks for weighing in and bringing up these issues @bhb!

j-cr commented 6 years ago

I feel this link should be here:

https://dev.clojure.org/jira/browse/CLJ-2373

I've asked about error codes too (though I was thinking about something more readable than "TS-XXXX"-style error codes; the main idea is to allow tooling to rely on the error messages to identify specific errors), and it seems like the core team is not against this idea in general.
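To make "readable error codes that tooling can rely on" a bit more concrete, here is one possible shape, sketched with namespaced keywords in `ex-data`. Everything here is hypothetical (the `:error/code` key, the code names, the registry); nothing like it exists in Clojure today.

```clojure
;; Hypothetical registry mapping readable codes to longer explanations.
(def error-explanations
  {:error/not-derefable
   "deref (@) was called on a value that is not a reference type."
   :error/not-a-set
   "A clojure.set function received an argument that is not a set."})

(defn coded-error
  "Builds an exception whose data carries a stable, readable error code."
  [code message data]
  (ex-info message (assoc data :error/code code)))

(defn error-code
  "Returns the code of a throwable, or nil when it carries none."
  [t]
  (:error/code (ex-data t)))

(defn explain-code
  "Looks up the long explanation for a code, e.g. from the REPL."
  [code]
  (get error-explanations code (str "No explanation registered for " code)))
```

Tooling would match on `(error-code e)` rather than on message text, and a beginner could call `(explain-code :error/not-a-set)` at the REPL.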

Do you think it'd be useful to go through RT and Compiler and collect all the places where exceptions are thrown so they can be categorised and described in detail?