Observed when searching the 100,000,000-byte enwik8 benchmark file for the word "the", outputting inverted matches (-v) with one line of "after context" (-A1):
ugrep -vA1 -n the enwik8 | wc
1114216 12665271 100352570
The correct output should be:
ugrep -vA1 -n the enwik8 | wc
1114310 12671469 100396462
The problem may occur with very large files that have a high match count for the specified patterns, such as the word "the" in the large enwik8 Wikipedia file. An internal buffer shift adds 1 to a line number counter in function begin_before(), which is called by the InvertContextGrepHandler() functor that is triggered at the buffer shift. When InvertContextGrepHandler() is also used to output context at the same time, lineno is incremented one time too many, which causes a missed line in the output.
Note: Fixed in the latest commit of v4.3.3-1. The output is now exactly the same byte-for-byte as GNU grep 3.11.
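To check the expected behavior on a small input, the following is a minimal Python sketch of what -v -n -A1 should print. It is not ugrep's code: the function name invert_after_context and its parameters are hypothetical, it uses plain substring matching instead of a regex, and it assumes at least one line of after context. The line number comes directly from enumerate(), so it advances exactly once per input line, which is the invariant the extra increment at the buffer shift violated.

```python
def invert_after_context(lines, needle, after=1):
    """Model `grep -v -n -A<after> needle` on a list of lines.

    Hypothetical helper for illustration only; assumes after >= 1 and
    plain substring matching.
    """
    out = []
    last_printed = 0  # line number of the last line written, 0 if none yet
    remaining = 0     # context lines still owed after the last selected line
    for lineno, line in enumerate(lines, start=1):
        selected = needle not in line  # -v selects lines that do NOT match
        if selected or remaining > 0:
            # A gap between printed groups is marked with "--".
            if last_printed and lineno > last_printed + 1:
                out.append("--")
            # Selected lines use ":" after the number, context lines use "-".
            sep = ":" if selected else "-"
            out.append(f"{lineno}{sep}{line}")
            last_printed = lineno
            remaining = after if selected else remaining - 1
    return out


if __name__ == "__main__":
    sample = ["the a", "b", "the c", "the d", "e", "the f"]
    print("\n".join(invert_after_context(sample, "the")))
```

A model like this is only practical for small test files. For enwik8 itself, the wc counts shown above, or a byte-for-byte comparison against GNU grep output, remain the practical check.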