jeremyjh / dialyxir

Mix tasks to simplify use of Dialyzer in Elixir projects.
Apache License 2.0
1.7k stars 139 forks source link

Provide better error messages #103

Closed josevalim closed 6 years ago

josevalim commented 6 years ago

Hi folks, first off, thanks for Dialyxir! I would like to request a feature. :)

Dialyzer error messages are not good. They are very cryptic and hard to understand. When integrating via Elixir is even worse, as terms are formatted in Erlang syntax. Good news is: it is not our fault and we can do better!

We already receive all of the warnings from dialyzer here. So we can take the liberty of formatting them in better ways. Rebar3 does that. Just don't forget the catch all clause with a behaviour that is similar to today's.

How should the error messages look like? This Elm article is an excellent guideline but we need to say what went wrong, why it went wrong and propose next steps.

The Credo folks are also good on their error messages so it can also be a good starting point.

FWIW, Elixir does a very similar thing to what I am proposing here, except we are handling the warnings coming from compiler.

Although I cannot directly work on this feature, I will be glad to provide guidance. Folks have mentioned the difficulty in handling those error messages in the past so I thought I would pass the massage forward.

jeremyjh commented 6 years ago

@josevalim I agree this is the next step for dialyxir and what the community needs to really begin to make use of success typing analysis at scale. Thanks for opening this dialog and providing these suggestions as a starting point.

This is something I will contrib too but help from the community is definitely wanted; particularly in interpreting the Erlang terms into something usable. I agree Credo does provide a great model for how errors can be communicated and explained and it is not hard to provide better phrasing for the dialyzer warnings.

asummers commented 6 years ago

Okay so I'm taking steps towards this. Thoughts:

https://github.com/jeremyjh/dialyxir/blob/master/lib/dialyxir/dialyzer.ex#L23 seems to be where we want to intercept control flow (as said in original post) and If we dig into the Erlang code there, we find https://github.com/simplegeo/erlang/blob/15eda8de27ba73d176c7eeb3a70a64167f50e2c4/lib/dialyzer/src/dialyzer.erl#L270 which seems to be just taking in a tuple, pattern matching on the warning type, and then creating some strings to output the error. My initial stab is taking that Erlang logic and reproducing it in Elixir, with an easier barrier to entry here than upstream Erlang. From there, we can do things like parsing and reformatting the maps, structs, doing diffs, or any number of things we'd like, such that whenever a message is unclear or difficult to parse visually (lack of blank lines, e.g.) it should be considered a bug. My thought is that we need to elevate options of which errors you'd like to see (Dialyzer, Dialyxir, short, both), but also have to consider that people have .dialyzer-ignore-warnings files, so maybe we just always print both and continue using the old warnings for this as in (I believe) https://github.com/jeremyjh/dialyxir/blob/master/lib/dialyxir/dialyzer.ex#L20.

Proposed example:

If we look at https://github.com/simplegeo/erlang/blob/15eda8de27ba73d176c7eeb3a70a64167f50e2c4/lib/dialyzer/src/dialyzer.erl#L355 we might imagine it in Elixir as

  defp message_to_string({:contract_diff, [module, function, _args, contract, signature]}) do
    "Type specification #{module}:#{function}#{contract} is not equal to the success typing: #{module}:#{function}#{signature}."
  end

which we might change to:

  defp message_to_string({:contract_diff, [module, function, _args, contract, signature]}) do
    """
    Type specification is not equal to the success typing.

    This happens when either the function is not returning the proper
    value, or the `@spec` is incorrect.

    Function:
    #{module}:#{function}

    Provided typing:
    #{contract}

    Success typing:
    #{signature}
    """
  end

eventually adding tools for diffs (as in ExUnit), and more Elixir-ey formatting of the various output.

Please let me know your all's thoughts on any of the above, particularly with respect to the ignore warnings question, or if this seems entirely off the mark.

I've already ported over a good deal of the Dialyzer logic in the above fashion, and am going to continue with that until I can get it functionally integrated with Dialyxir at large in a sane way.

Other questions: Do we have thoughts on the best way to format the Erlang terms themselves to be more Elixir friendly?

You can see my progress here: https://github.com/jeremyjh/dialyxir/compare/master...asummers:elixir-formatter

jeremyjh commented 6 years ago

@asummers I agree that looks like the right approach so you've got a good start on it.

In my opinion, the worst part about dialyzer errors in Elixir is that it is very difficult to read the erlangified rendition of the struct types in Elixir. For a newcomer, it is completely hopeless that they look at the below example, and see %Plug.Conn{} is the first arg; even for me this is a nightmare because while I can easily tell what the first arg is, how do I tell where the second even starts? I think at least resolving this much is an MVP for this feature. I'm not sure that's easy, as it looks like the args returned to us by dialyzer are just a string; we'll have to parse them.

The call Elixir.Phoenix.Controller:render(#{'__struct__':='Elixir.Plug.Conn', 'adapter':={atom(),_}, 'assigns':=#{atom()=>_}, 'before_send':=[fun((#{'__struct__':='Elixir.Plug.Conn', 'adapter':={_,_}, 'assigns':=map(), 'before_send':=[fun((_) -> any())], _=>_}) -> #{'__struct__':='Elixir.Plug.Conn', 'adapter':={_,_}, 'assigns':=map(), 'before_send':=[fun((_) -> any())], _=>_})], 'body_params':=#{'__struct__'=>'Elixir.Plug.Conn.Unfetched', 'aspect'=>atom(), binary()=>binary() | [binary() | [any()] | #{binary()=>_}] | #{binary()=>binary() | [any()] | #{binary()=>_}}}, 'cookies':=#{'__struct__'=>'Elixir.Plug.Conn.Unfetched', 'aspect'=>atom(), binary()=>binary()}, 'halted':=_, 'host':=binary(), 'method':=binary(), 'owner':=pid(), 'params':=#{'__struct__'=>'Elixir.Plug.Conn.Unfetched', 'aspect'=>atom(), binary()=>binary() | [binary() | [any()] | #{binary()=>_}] | #{binary()=>binary() | [any()] | #{binary()=>_}}}, 'path_info':=[binary()], 'path_params':=#{binary()=>binary() | [binary() | [any()] | #{binary()=>_}] | #{binary()=>binary() | [any()] | #{binary()=>_}}}, 'peer':={{byte(),byte(),byte(),byte()} | {char(),char(),char(),char(),char(),char(),char(),char()},char()}, 'port':=char(), 'private':=#{atom()=>_}, 'query_params':=#{'__struct__'=>'Elixir.Plug.Conn.Unfetched', 'aspect'=>atom(), binary()=>binary() | [binary() | [any()] | #{binary()=>_}] | #{binary()=>binary() | [any()] | #{binary()=>_}}}, 'query_string':=binary(), 'remote_ip':={byte(),byte(),byte(),byte()} | {char(),char(),char(),char(),char(),char(),char(),char()}, 'req_cookies':=#{'__struct__'=>'Elixir.Plug.Conn.Unfetched', 'aspect'=>atom(), binary()=>binary()}, 'req_headers':=[{binary(),binary()}], 'request_path':=binary(), 'resp_body':='nil' | binary() | maybe_improper_list(binary() | maybe_improper_list(any(),binary() | []) | byte(),binary() | []), 'resp_cookies':=#{binary()=>#{}}, 'resp_headers':=[{binary(),binary()}], 'scheme':='http' | 'https', 'script_name':=[binary()], 'secret_key_base':='nil' | binary(), 'state':='chunked' | 'file' | 'sent' | 'set' | 'set_chunked' | 'set_file' | 'unset', 'status':='nil' | 1..1114111},'Elixir.Hello13Web.ErrorView','404') breaks the contract ('Elixir.Plug.Conn':t(),binary() | atom(),'Elixir.Keyword':t() | map()) -> 'Elixir.Plug.Conn':t().

asummers commented 6 years ago

Update: I'm working on a parser/pretty printer. Turns out Mix generates .erl files if you include .xrl and .yrl files in the src directory corresponding to lexer and parser, respectively. I have it parsing the first part of that input appropriately and spitting out something decently close to right:

(%Plug.Conn{:adapter => {atom(), any()},:assigns => %{atom() => any()},:before_send => [((%Plug.Conn{:adapter => {any(), any()},:assigns => map(),:before_send => [((any()) -> any())],any() => any()}) -> %Plug.Conn{:adapter => {any(), any()},:assigns => map(),:before_send => [((any()) -> any())],any() => any()})],:body_params => %Plug.Conn.Unfetched{:aspect => atom(),binary() => binary() | [binary() | [any()] | %{binary() => any()}] | %{binary() => binary() | [any()] | %{binary() => any()}}},:cookies => %Plug.Conn.Unfetched{:aspect => atom(),binary() => binary()},:halted => any(),:host => binary(),:method => binary(),:owner => pid(),:params => %Plug.Conn.Unfetched{:aspect => atom(),binary() => binary() | [binary() | [any()] | %{binary() => any()}] | %{binary() => binary() | [any()] | %{binary() => any()}}},:path_info => [binary()],:path_params => %{binary() => binary() | [binary() | [any()] | %{binary() => any()}] | %{binary() => binary() | [any()] | %{binary() => any()}}},:peer => {{byte(), byte(), byte(), byte()} | {char(), char(), char(), char(), char(), char(), char(), char()}, char()},:port => char(),:private => %{atom() => any()},:query_params => %Plug.Conn.Unfetched{:aspect => atom(),binary() => binary() | [binary() | [any()] | %{binary() => any()}] | %{binary() => binary() | [any()] | %{binary() => any()}}},:query_string => binary(),:remote_ip => {byte(), byte(), byte(), byte()} | {char(), char(), char(), char(), char(), char(), char(), char()},:req_cookies => %Plug.Conn.Unfetched{:aspect => atom(),binary() => binary()},:req_headers => [{binary(), binary()}],:request_path => binary(),:resp_body => nil | binary() | maybe_improper_list(binary() | maybe_improper_list(any(), binary() | []) | byte(), binary() | []),:resp_cookies => %{binary() => %{}},:resp_headers => [{binary(), binary()}],:scheme => :http | :https,:script_name => [binary()],:secret_key_base => nil | binary(),:state => :chunked | :file | :sent | :set | :set_chunked | :set_file | :unset,:status => nil | :1..1114111}, Hello13Web.ErrorView, 404)

What I'd LOVE is a way to leverage the Elixir 1.6 formatter somehow. Maybe have it generate a fake module where it drops this code into an @spec or something, formats, then strips away the defmodule etc, or something? Not quite sure how I envision that working yet.

jeremyjh commented 6 years ago

@asummers That is certainly a lot better but honestly I'd prefer to just see %Plug.Conn{} rather than all its members. Its the inclusion of all the members that makes it hard to see where the next argument begins.

OvermindDL1 commented 6 years ago

I'd find it best if a diff could be done between both the expected and actual and to show and/or focus on the parts that are different, preferably with colors (see Elixir's 1.6+ matcher exceptions, should follow the same style).

asummers commented 6 years ago

Just a quick update on where things are at with my branch as I did not get to work on this very much over the holiday break.

Interception of error messages and reformatting into Elixir messages: done Basic (incomplete) lexing and parsing of structure of contract and struct messages: done Recast Erlang style output via pretty printer to Elixir: done Hook up Erlang->Elixir pretty printer for all places in error output: done

Here's where there's updates:

I wanted to take the resultant pretty printed Elixir contracts and such and pretty print them with indentation and such. To accomplish this, I make a minimal function to trick the Elixir 1.6 code formatter into pretty printing that, then strip away the function head. This feels atrociously hacky, but it seems to work reasonably okay. In parsing the error messages, (Elixir 1.6 gives new Dialyzer messages on production code at work, interestingly enough), I found that sometimes Dialyzer truncates the terms in lists with ... and maps/structs with _ representing key and value (I turn this into ...), which are both annoying limitations in the source tool. However the code formatter doesn't seem to be bothered by it, so it seems reasonable to leave in the Erlang->Elixir pretty printed output.

All this to say, I generally have tools to output these messages however we want now, within the parameters of what we're given by Dialyzer. As far as the comment from @OvermindDL1, I'm having trouble in ExUnit finding where those matchers happen, so if you could please point me in their direction, I'd be happy to look into integrating that functionality wherever it makes sense. At some point, the exercise is going to be improving the many warnings with reasonable output, which feels out of scope of the initial task besides hooking up the pretty printer.

Where I have questions is in how we want to introduce this to users. We need to keep around the dialyzer warnings and at least show them (which will be noisy!) because of having to deal with the interop of .ignore_warnings files. I can see hiding these by default because of the noise, but using them in the background being a good idea? Curious your thoughts. The second thing is if we would want to introduce this pretty printer by default. I think yes, but could be convinced in hiding that behind a flag. In any event, having the ability to have "classic" dialyzer should be exposed to the user. Additionally, since I'm using 1.6 features, I'm not sure how to gate this to earlier versions of Elixir while using the newer features. I could also use some help with the lexer/parser, as I'm sure missing many edge cases, but I've about exhausted my skills in that domain by getting it parsing the code I was throwing at it from e.g. Phoenix.

Having better output from Dialyzer itself would make this process MUCH easier and would alleviate many of the gymnastics I'm having to jump through with this, as I feel I'm in some sort of Rube Goldberg machine.

https://github.com/jeremyjh/dialyxir/compare/master...asummers:elixir-formatter

jeremyjh commented 6 years ago

@asummers thanks for the update! I'll take a deeper look this weekend but just to respond to some of the topics/questions you posed:

I think we'll want this on by default - I think a switch to keep older message format would be worthwhile to help people transition. I don't know its worth doing anything much more to remain compatible with existing ignore files. I could imagine running the old version of warnings through the ignore to see what should be filtered out, but I don't think most people would want to see that output on their screen at all, and it will be confusing if they can (or must) put in ignore lines that don't match what they see on the screen.

Once this is well shaken out its probably as good a time as any to call Dialyxir 1.0, so it will be more clear that there are compatibility issues and I think being 1.6 compatible only is fine as well. I'd take patches for the 0.5 branch for awhile.

OvermindDL1 commented 6 years ago

As I recall, colorizing the diff is done in ExUnit.Formatter.format_assertion_error, so might be able to use that, or at least look at it? :-)

josevalim commented 6 years ago

ExUnit diff uses String.myers_difference. However, I don't think this diff would be relevant to types because you are likely not comparing strings. You need something more semantically aware, such as what we do in Exception.blame_mfa but applied to types.

jeremyjh commented 6 years ago

@asummers I had another look at this today and I find it really encouraging; I'd like to work towards getting it into dialyxir soon; I think what you have already is very valuable. Maybe semantic aware differencing should be a future effort - but the struct parsing alone is already worth it to me.

One thing I noticed is there are a lot more that could be invoked on structs to parse - not just contract failures but also the type argument from pattern match failures may be struct.

Do you want to open a PR where we could collaborate on getting this release ready?

asummers commented 6 years ago

One thing I noticed is there are a lot more that could be invoked on structs to parse - not just contract failures but also the type argument from pattern match failures may be struct.

Do you mean finding the module from the name? That could be nice. I added a utility in there to find the module name from the error message, for use in pretty printing, so that could just do module lookup there. I'll open a PR this evening and try to enumerate the remaining work for what I'd consider a good V1 for this effort.

A question to get the ball rolling -- since the error messages are significantly more verbose now, we should have a "short error" that people can put in their ignore files and such. Should those look like Credo one line errors? Additionally, we should think about having a Flycheck compatible export for use with e.g. Emacs, which could use the same short error. Really there's no reason why there couldn't be an interface that takes in a list of Error structs and exports them as they wish.

jeremyjh commented 6 years ago

One thing I noticed is there are a lot more that could be invoked on structs to parse - not just contract failures but also the type argument from pattern match failures may be struct.

I didn't explain this very well, all I meant is that there are additional messages that may contain structs in the type argument, and so passing these to your parser looks nicer.

For example, I updated this formatter with good results in the case where we access a struct member that doesn't exist:

defp message_to_string({:pattern_match, [pattern, type]}) do
-    "The #{pattern} can never match the type #{type}."
+    "The #{pattern} can never match the type #{pretty_print_contract(type)}."
   end

Regarding ignore files - I think having an option to declare simple strings that match the strings you see on the screen is still a good thing, but yes it would be nice to match something shorter - that would also give us more freedom to improve messages over time without breaking people. I wonder though, if those shorter things should just be a list of predicate functions that are passed the dialyzer error message tuples and output a false to suppress them?

asummers commented 6 years ago

Oh got it! Yeah my issues was a lack of varying Dialyzer errors across different projects to know what the different messages would look like. I can certainly add that specific one, but I don't know a priori have a way to know which errors contain structs without getting real life error output, unfortunately ☹️ I'm glad that it seems to work well in places I didn't engineer it though 😄