dandavison / delta

A syntax-highlighting pager for git, diff, grep, and blame output
https://dandavison.github.io/delta/
MIT License
21.34k stars 359 forks source link

Fix headers of modified binary files #1629

Closed imbrish closed 4 months ago

imbrish commented 4 months ago

All tests pass, but as this is my first time working with Rust and my first contribution here, it would be great if someone more familiar with the codebase could take a closer look 😉

The diff:

diff --git a/foo b/bar
similarity index 100%
rename from foo
rename to bar
diff --git a/other b/other
index 00de669..d47cd84 100644
Binary files a/other and b/other differ

Now:

image

When merged:

image

Fixes #1621.

imbrish commented 4 months ago

Hey @dandavison,

I have reviewed both #945 and #1502 which touched the same code, made slight modifications, and squashed the commits.

The main fix was to pre-fill the header fields (minus and plus files etc.) already when at the diff line in diff_header_diff.rs These fields are later updated precisely on actual header minus and header plus lines, but it fixes the header for modified binary files, which do not have the minus and plus lines. This is now also described in the code comment.

Another change was a simplification of diff_header_misc.rs, which now correctly adds the (binary file) suffix also for the modified binary file diffs. I presume the abovementioned omission was the reason it bailed when both filenames were empty, and this is no longer necessary.

I have also updated test_example_diffs.rs to include the handled cases and make the related tests more reliable.

One caveat is that the get_repeated_file_path_from_diff_line function returns the path only when the a/ and b/ are identical. But I do not think that will be a problem, because in my tests this is always the case apart from renames, which come with their own rename from and rename to lines.

imbrish commented 4 months ago

I have now realized the reason for a special treatment of a case with both minus and plus files empty in diff_header_misc.rs. It seems to be parsing of the output of standalone diff, but this was missing tests. I have decided to handle it slightly differently and pass through the Binary files ... differ line verbatim, to align output with the Files ... are identical line.

Before:

image

After:

image

This is now ready to merge! 🚀

imbrish commented 4 months ago

@dandavison noticed you started the checks, but I have found an omission for git diff --no-index, will fix it in a minute.

imbrish commented 4 months ago

Done! Here is what was fixed.

Calling git diff --no-index --no-prefix "foo bar" "sub dir/foo bar baz" generates a following diff:

diff --git foo bar sub dir/foo bar baz
index 329fbf5..481817c 100644
Binary files foo bar and sub dir/foo bar baz differ

There is no reliable way to determine the minus and plus files in such cases, and thus I have decided to align the output here with that of standalone diff as described above:

image

When the a/ and b/ markers are present the files could be recognized in diff_header_diff.rs, but as this is such a minor case, let us leave that for another day, and not delay this pull request any further 👍

dandavison commented 4 months ago

Excellent! Thanks very much for this work @imbrish.