Open jeanas opened 4 months ago
I would like to work on this issue. Can anyone help me get started?
I haven't looked much at the code, but I think I get what's happening.
The help says:
WARNING: Because of how the underlying regex engine works, multiline
searches may be slower than normal line-oriented searches, and they may
also use more memory. In particular, when multiline mode is enabled,
ripgrep requires that each file it searches is laid out contiguously in
memory (either by reading it onto the heap or by memory-mapping it).
Things that cannot be memory-mapped (such as stdin) will be consumed
until EOF before searching can begin. In general, ripgrep will only do
these things when necessary. Specifically, if the -U/--multiline flag
is provided but the regex does not contain patterns that would match \n
characters, then ripgrep will automatically avoid reading each file
into memory before searching it. Nevertheless, if you only care about
matches spanning at most one line, then it is always better to disable
multiline mode.
And sure enough:
$ cat file.txt
start end
start end
$ rg --count-matches "start|end" file.txt
4
$ rg --count "start|end" file.txt
2
$ rg --count --multiline "start|end" file.txt
2
$ rg --count --multiline "^start|end$" file.txt
4
In words: when the regex doesn't contain `^` or `$`, ripgrep notices that multiline mode is useless and runs the normal, non-multiline mode, but then this changes the semantics of `--count`.
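To make the two counting semantics concrete, here is a minimal sketch in Python. It uses Python's `re` module, not ripgrep's regex engine, and the helper names `count_matching_lines` and `count_matches` are made up for illustration. It only reproduces the numbers from the transcript above for the same two-line input; it says nothing about how ripgrep decides which mode to use.

```python
import re

# Same two-line content as file.txt in the transcript above.
TEXT = "start end\nstart end\n"


def count_matching_lines(pattern: str, text: str) -> int:
    """Line-oriented count: a line counts once if it contains any match."""
    regex = re.compile(pattern)
    return sum(1 for line in text.splitlines() if regex.search(line))


def count_matches(pattern: str, text: str) -> int:
    """Match-oriented count: every non-overlapping match counts."""
    # re.MULTILINE makes ^ and $ anchor at each line boundary.
    regex = re.compile(pattern, re.MULTILINE)
    return sum(1 for _ in regex.finditer(text))


if __name__ == "__main__":
    print(count_matching_lines("start|end", TEXT))  # lines with a match
    print(count_matches("start|end", TEXT))         # individual matches
    print(count_matches("^start|end$", TEXT))       # anchored matches
```

The first helper corresponds to what `--count` reports in line-oriented mode, and the second to what `--count-matches` reports. The observation above is that `--count --multiline` ends up giving one or the other depending on the pattern.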
Could it be related to #2779?
What version of ripgrep are you using?
14.1.0
How did you install ripgrep?
Reproduces with the distro package and with the `cargo install`ed version.
What operating system are you using ripgrep on?
Fedora 40
Describe your bug.
From the `--help` output:
However, the behavior I'm seeing is that `--count` still behaves as "count the number of matching lines" and not as "count the number of matches", even under multiline mode.
What are the steps to reproduce the behavior?
What is the actual behavior?
What is the expected behavior?
The last command should print 8 (or the documentation should be changed).