elixir-lang / elixir

Elixir is a dynamic, functional language for building scalable and maintainable applications
https://elixir-lang.org/
Apache License 2.0
24.54k stars 3.38k forks source link

Revisit usage of diff with =~ in assertions #4862

Closed josevalim closed 7 years ago

matisojka commented 7 years ago

If possible, I'd like to help with this issue. For that I'd need to know a bit more about what needs to be done. Any inputs @josevalim or @lexmag?

lexmag commented 7 years ago

@yagooar this issue is about finding a way to improve difference highlighting when =~ is used, for example:

string1 = "\e[31m\n16:16:53.230 [error] GenStage consumer cannot dispatch events (an empty list must be returned): [:f, :g, :h]\n\n\e[0m"
string2 = "GenStage producer has discarded 3 events from buffer"

will give:

screen shot 2017-05-09 at 21 52 01

The output is the same as if we would use the == operator: it marks sections outside =~ matching as "deleted" (red), while they ideally should be regular color.

  1. Since =~ has left and right sides defined, we can possibly try to take a slice from left side to run difference with right on that slice. How to do that slice—that's the issue.
  2. Alternatively we should stop running difference when =~ is used.
  3. Or keep it as it is today.

:bowtie:

josevalim commented 7 years ago

The problem with running the difference is that it runs it over the whole left string and that will always be misleading as we never want to compare the whole two full strings. You can see this in your snippet from the fact it is picking up letter by letter in the diff.

One possible solution is to find the longest submatch and diff only that, but that's expensive in terms of implementation and time complexity. So right now I would propose 2.

OvermindDL1 commented 7 years ago

I actually quite like the difference it shows. It's caught some error messages where I was expecting one thing but got another that was off by only a single character (line number specifically), and it makes it obvious in those cases.

josevalim commented 7 years ago

@OvermindDL1 Are you saying that in particular to =~? If so, do you have any examples? In all errors reported because of =~ I can't recall a single situation where it actually helped. There is always too much red, as in @lexmag's example above.

OvermindDL1 commented 7 years ago

Sure, let me make my bug again and show the output (I'm using solarized dark as my color scheme so 'green' is more 'grayish blue'): image Which once the line number was fixed then it passed, without me needing to test the exception value (that is tested elsewhere as a value instead of the message).

Could always just make it an option to ExUnit too, default off or so?

josevalim commented 7 years ago

@lexmag are we consider the string distance before computing the diff for =~?

josevalim commented 7 years ago

@OvermindDL1 and thank you for the useful example! :D

lexmag commented 7 years ago

@josevalim yes, my example is on the edge of acceptable threshold. Perhaps we can have different thresholds for =~ and ==, having much lower for =~. But it still won't solve red coloring "problem".

josevalim commented 7 years ago

Closing this for now. We do not have a good answer and it seems discarding the diff means we have more to lose than to gain.