twaugh / patchutils

Manipulate patch files
GNU General Public License v2.0
139 stars 22 forks source link

grepdiff returns wrong line numbers for `--output-matching=hunk` #55

Open jorhett opened 2 years ago

jorhett commented 2 years ago

Grepdiff is calculating the wrong lines when using the --as-numbered-lines output

Recreation scenario

$ grepdiff --version
grepdiff - patchutils version 0.4.2

I've minimized this to as small of a document as demonstrates the issue. Start with an empty repo and original text

commit 2b1c59cab3215fc7a8632d73522b5ab7399a48b4 (HEAD -> master)
Author: Jo Rhett
Date:   Wed Nov 2 15:06:20 2022 -0700

    Initial text

diff --git a/foobar.txt b/foobar.txt
new file mode 100644
index 0000000..8a19eed
--- /dev/null
+++ b/foobar.txt
@@ -0,0 +1,8 @@
+#
+#
+# a comment
+#
+#
+
+# description
+foo()

Now modify this file to insert additional lines (such that the line numbers would adjust) and adjust a target function you'd be searching for.

$ git diff
diff --git a/foobar.txt b/foobar.txt
index 8a19eed..ae745cf 100644
--- a/foobar.txt
+++ b/foobar.txt
@@ -1,3 +1,7 @@
+# pre-comment
+# spread over
+# three lines
+
 #
 #
 # a comment
@@ -5,4 +9,4 @@
 #

 # description
-foo()
+foo(bar)

$ git diff HEAD --no-color --no-ext-diff --unified=0 --exit-code --text
diff --git a/foobar.txt b/foobar.txt
index 8a19eed..ae745cf 100644
--- a/foobar.txt
+++ b/foobar.txt
@@ -0,0 +1,4 @@
+# pre-comment
+# spread over
+# three lines
+
@@ -8 +12 @@
-foo()
+foo(bar)

Observe that the line counts are correct in both outputs.

See grepdiff break the lines

The line count issue first appears with grepdiff --output-matching=hunk

$ git diff HEAD --no-color --no-ext-diff --unified=0 --exit-code --text | grepdiff foo --output-matching=hunk
diff --git a/foobar.txt b/foobar.txt
index 8a19eed..ae745cf 100644
--- a/foobar.txt
+++ b/foobar.txt
@@ -8 +8 @@
-foo()
+foo(bar)

It would appear from casual review to be duplicating the original line number and using it for the final line number too. It should be line 12 in the revised text. This pattern holds true with much larger files with many matches, fwiw.

And likewise, it shows the wrong output with --as-numbered-lines:

$ git diff HEAD --no-color --no-ext-diff --unified=0 --exit-code --text | grepdiff foo --output-matching=hunk --only-match=additions  --as-numbered-lines=before
diff --git a/foobar.txt b/foobar.txt
index 8a19eed..ae745cf 100644
--- a/foobar.txt
8   :foo()

$ git diff HEAD --no-color --no-ext-diff --unified=0 --exit-code --text | grepdiff foo --output-matching=hunk --only-match=additions  --as-numbered-lines=after
diff --git a/foobar.txt b/foobar.txt
index 8a19eed..ae745cf 100644
+++ b/foobar.txt
8   :foo(bar)

Use case

I'm trying to use grepdiff for CI change analysis, and it appears ideal for this scenario. In fact it works perfectly, but is placing the messages on the wrong lines:

# compare to see if the pattern is in the added lines
git diff origin/${{ github.base_ref }} --no-ext-diff --no-color --unified --text --no-prefix $file | \
grepdiff "${pattern}" --output-matching=hunk --only-match=additions --as-numbered-lines=after | \
grep -e "${pattern}" | \
while read -r get_line; do
    linenum=$(echo $get_line| cut -d: -f1)
    echo "::error file=${file},line=${linenum}::${obsolete_patterns[$pattern]}"
done
sergiomb2 commented 2 years ago

you mean ? from

@@ -8 +12 @@
-foo()
+foo(bar)

with grepdiff foo --output-matching=hunk show

@@ -8 +8 @@
-foo()
+foo(bar)