walles / riff

A diff filter highlighting which line parts have changed
MIT License

Unexpected highlight #14

Open walles opened 3 years ago

walles commented 3 years ago

The Problem

"black" is highlighted even though I think it shouldn't be. Either explain this or fix it!

[Screenshot: why-is-black-highlighted]

The diff

--- /tmp/pirate/pirate-ipsum-before.txt 2020-12-31 00:11:15.000000000 +0100
+++ /tmp/pirate/pirate-ipsum-after.txt  2020-12-31 00:11:46.000000000 +0100
@@ -1,6 +1,7 @@
-Hornswaggle knave coffer rum Nelsons folly bilge water lugger. Fire in the hole black
-spot knave come about jury mast coxswain rutters. Keelhaul hail-shot Jack Ketch no prey,
-no pay gunwalls gaff haul wind. Ho fire in the hole Sail ho booty rum trysail hail-shot.
-Knave Letter of Marque barkadeer league mizzen strike colors spike. Jack Ketch spirits
-hail-shot long clothes walk the plank gabion warp. Poop deck holystone black spot tackle
-long boat loot run a shot across the bow.
+Hornswaggle knave coffer rum Nelsons folly bilge water lugger. Fire in the hole
+black spot knave come about jury mast coxswain rutters. Keelhaul hail-shot Jack
+Ketch no prey, no pay gunwalls gaff haul wind. Ho fire in the hole Sail ho booty
+rum trysail hail-shot. Knave Letter of Marque barkadeer league mizzen strike
+colors spike. Jack Ketch spirits hail-shot long clothes walk the plank gabion
+warp. Poop deck holystone black spot tackle long boat loot run a shot across the
+bow.
walles commented 3 years ago

git bisect blames 6243516b7de4ba19157cb8a5ed4988539b0d339d:

6243516b7de4ba19157cb8a5ed4988539b0d339d is the first bad commit
commit 6243516b7de4ba19157cb8a5ed4988539b0d339d
Author: Johan Walles <johan.walles@gmail.com>
Date:   Wed Nov 11 23:46:04 2020 +0100

    Refine by word rather than by character

    Hello #7.

 README.md      | 10 ++++++----
 src/refiner.rs | 36 +++++++++++++++++++++---------------
 2 files changed, 27 insertions(+), 19 deletions(-)
walles commented 3 years ago

Smaller test case:

--- /tmp/pirate/pirate-ipsum-before.txt 2020-12-31 00:11:15.000000000 +0100
+++ /tmp/pirate/pirate-ipsum-after.txt  2020-12-31 00:11:46.000000000 +0100
@@ -1,2 +1,2 @@
-the hole black
-spot
+the hole
+black spot
walles commented 3 years ago

Explanation

  old: the _ hole  _ black \n spot \n
johan:             ^       ^^
 riff:             ^ ^^^^^

  new: the _ hole \n black  _ spot \n
johan:            ^^        ^
 riff:               ^^^^^  ^

So I see this as a space and a newline having switched places.

riff, on the other hand, thinks that " black" (note the leading space) has been removed from the first line, and that "black " (note the trailing space) has been added to the second line.

Given that riff works with words rather than characters (see #7), riff's solution contains just as many changes as mine, and both are valid.

I still don't like it though.
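To check the "just as many changes" claim, here is a minimal sketch that tokenizes the smaller test case and counts the fewest token removals plus additions. It is not riff's actual refiner code. The tokenization rule (alphanumeric runs are words, every other character is its own token) and the names tokenize, lcs_len and edit_count are assumptions made for illustration.

```rust
/// Split text into tokens: a run of alphanumeric characters becomes one word
/// token, every other character becomes its own single-character token.
/// (Assumed rule, for illustration only.)
fn tokenize(text: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    for c in text.chars() {
        if c.is_alphanumeric() {
            word.push(c);
        } else {
            if !word.is_empty() {
                tokens.push(std::mem::take(&mut word));
            }
            tokens.push(c.to_string());
        }
    }
    if !word.is_empty() {
        tokens.push(word);
    }
    tokens
}

/// Length of the longest common subsequence of two token lists (classic DP).
fn lcs_len(a: &[String], b: &[String]) -> usize {
    let mut table = vec![vec![0usize; b.len() + 1]; a.len() + 1];
    for i in 0..a.len() {
        for j in 0..b.len() {
            table[i + 1][j + 1] = if a[i] == b[j] {
                table[i][j] + 1
            } else {
                table[i][j + 1].max(table[i + 1][j])
            };
        }
    }
    table[a.len()][b.len()]
}

/// Fewest removed plus added tokens needed to turn `old` into `new`.
fn edit_count(old: &str, new: &str) -> usize {
    let (a, b) = (tokenize(old), tokenize(new));
    let common = lcs_len(&a, &b);
    (a.len() - common) + (b.len() - common)
}
```

Under these assumptions, edit_count("the hole black\nspot\n", "the hole\nblack spot\n") is 4. Both readings above reach that minimum with two removed and two added tokens, so a plain token count cannot prefer one over the other.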

walles commented 3 years ago

I think the correct solution to this would be to say that highlighting words is more expensive than highlighting non-words.

But that would require support in https://github.com/distil/diffus which is not there right now.
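The idea of making words more expensive can be sketched as a stand-alone weighted diff. This does not use diffus and is not a proposal for its API. The weights (2 for a word, 1 for anything else) and the names cost and removed_tokens are hypothetical.

```rust
/// Hypothetical cost model: changing a word costs more than changing
/// whitespace or punctuation. The weights 2 and 1 are arbitrary.
fn cost(token: &str) -> usize {
    if token.chars().all(char::is_alphanumeric) { 2 } else { 1 }
}

/// Return the old-side tokens that a minimum-cost diff marks as removed.
fn removed_tokens<'a>(old: &[&'a str], new: &[&str]) -> Vec<&'a str> {
    let (n, m) = (old.len(), new.len());
    // best[i][j] = cheapest cost of turning old[i..] into new[j..]
    let mut best = vec![vec![0usize; m + 1]; n + 1];
    for i in (0..n).rev() {
        best[i][m] = best[i + 1][m] + cost(old[i]);
    }
    for j in (0..m).rev() {
        best[n][j] = best[n][j + 1] + cost(new[j]);
    }
    for i in (0..n).rev() {
        for j in (0..m).rev() {
            // Either remove old[i] or add new[j] ...
            let mut c = (best[i + 1][j] + cost(old[i])).min(best[i][j + 1] + cost(new[j]));
            // ... or keep the token for free when both sides agree.
            if old[i] == new[j] {
                c = c.min(best[i + 1][j + 1]);
            }
            best[i][j] = c;
        }
    }
    // Walk the table from the start, collecting the tokens that get removed.
    let (mut i, mut j) = (0, 0);
    let mut removed = Vec::new();
    while i < n {
        if j < m && old[i] == new[j] && best[i][j] == best[i + 1][j + 1] {
            i += 1;
            j += 1;
        } else if best[i][j] == best[i + 1][j] + cost(old[i]) {
            removed.push(old[i]);
            i += 1;
        } else {
            j += 1; // new[j] was added
        }
    }
    removed
}
```

With the smaller test case split into the tokens "the", " ", "hole", " ", "black", "\n", "spot", "\n" (old) and "the", " ", "hole", "\n", "black", " ", "spot", "\n" (new), this sketch reports only the space and the newline as removed. Keeping "black" costs 4 in total, while either alternative that drops it costs 6.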