beyondgrep / ack2

**ack 2 is no longer being maintained. ack 3 is the latest version.**
https://github.com/beyondgrep/ack3/

ack -c giving zero count #623

Closed Smylers closed 4 years ago

Smylers commented 7 years ago

ack | wc -l is giving me 110 lines, but ack -c is counting 0.

I don't think this is Issue #563, cos that's specific to overcounting matches, which clearly doesn't apply here.

$ ack  '^(?=.*(\w)(?!\1)(\w)\2\1)(?!.*\[[^]]*(\w)(?!\3)(\w)\4\3)' ack_zero_count_input.txt | wc -l
110

$ ack -c '^(?=.*(\w)(?!\1)(\w)\2\1)(?!.*\[[^]]*(\w)(?!\3)(\w)\4\3)' ack_zero_count_input.txt
0

Here's the input file: ack_zero_count_input.txt

I haven't yet tried to minimize this, to narrow down what about that pattern or input is causing this. -c is still working fine for me in other situations. This is ack 2.14.

petdance commented 7 years ago

Likely related to #563

Smylers commented 7 years ago

Much smaller example. It seems to be that -c gives 0 if the pattern has a ^ or $ anchor:

$ ack -c x /usr/share/dict/words
2124
$ ack -c ^x /usr/share/dict/words
0
$ ack ^x /usr/share/dict/words | wc -l
53
$ ack -c x$ /usr/share/dict/words
0
$ ack x$ /usr/share/dict/words | wc -l
195

n1vux commented 7 years ago

Smylers, can you confirm or deny that your difficulty with -c ^caret is on either MS Win or Mac OS X?

n1vux commented 7 years ago

From ack-users, OnlineCop onlinecop@gmail.com, 10:18 PM (15 hours ago):

TL;DR When --count or --files-with-matches are used in conjunction with the ^ caret, no matches are returned. This was tested on Windows 7 and OS X using ack version 2.14.

I've got several .h files that have lines starting with the word 'class':

class FSomeDerived : public FSomeBase {...};
class FAnotherDerived : public FSomeBase {...};

I also have multiple lines that contain the word 'class' somewhere within it, but that are not at the beginning of the line. On Windows, these .h files have CRLF (\r\n) line endings, and on OS X, the same .h files have only LF (\n) line endings.

ack --hh '^class' correctly displays the .h files and full lines where 'class' is found at column 0 of the line.
ack --hh --count '^class' shows all .h files that had been found and were searched, but all showed counts of :0 even if they contain 'class' at column 0.
ack --hh --files-with-matches '^class' shows no results at all.

ack --hh '^' shows every line in every .h file, as it matches the beginning of ALL lines.
ack --hh --count '^' shows a count of :1 for every .h file.

Environments tested:

Windows 7, running a MinGW32 terminal, ack version 2.14: ~/bin/ack --hh '^class'
Windows 7, running command prompt "cmd", ack version 2.14: perl %homepath%\bin\ack --hh "^class"
OS X 10.11.6 (El Capitan), ack version 2.14: ~/bin/ack --hh '^class'

From my tests, I can use the ^ caret without --count or --files-with-matches but I cannot combine them.

Do I need to change something in how I'm testing for a pattern that begins at column 0 of these .h files?

Andy replied

No, it's clearly something wrong with how we're handling line-endings. Would you like to submit this as an issue or should I?

to which I (bill.n1vux) added

ack --hh --count '^' shows a count of :1 for every .h file.

Ooh, that is very helpful. Thank you for including that one.

I'm assuming normal perls for Win7 and Mac OSX; if you were using "Cygwin Perl" on WIN7 you'd have mentioned it.

Additional diagnostic questions for your Win7 and Mac OS X setups that may help us find the issue in our line-endings interaction with --count (and might provide you a workaround until we fix it):

Does the behavior of --count '^class' (or -l) change with any of

'\Aclass' '(?m)^class' '^\s*class' '\A\s*class'

and if so, which give correct results, or worse?

n1vux commented 7 years ago

Combined evidence of

ack --hh --count '^' shows a count of :1 for every .h file.

and my inability to reproduce the problem on Linux seems to confirm Andy's intuition that this is a line-ending thing (and to contradict the first assumption that it is related to #563).

n1vux commented 7 years ago

should this bug have Windows and Mac labels ? (Mac doesn't exist yet but should)

Smylers commented 7 years ago

Smylers, can you confirm or deny that your difficulty with -c ^caret is on either MS Win or Mac OS X?

It was on Linux.

I initially filed this separately from Issue #563, because that was overcounting (counting multiple matches per line) and this was not counting at all. However, after Andy said he thought they were related, I tried a fix for #563 that somebody had suggested (presumably the one in #574), and that did actually work for this issue too.

Looking at the #574 patch again now, I suspect that the problem I encountered is that in the counting line the match doesn't have the /m modifier, so the '^' would only ever match at the very start of the file:

$nmatches =()= ($content =~ /$opt_regex/og);

However, since the #574 patch fixes #563 by matching each line separately, that obviates the need for /m, so that approach happens to fix this issue too.
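
The difference between the two counting strategies can be sketched outside ack. This is a minimal illustration in Python rather than ack's Perl, on the assumption that Python's `re` treats `^` and `$` without `re.MULTILINE` the way Perl does without /m for these simple patterns. The sample text and the function names are made up for the example and are not taken from ack's source.

```python
import re

# Hypothetical sample standing in for the slurped contents of a file.
content = "apple\nxylophone\nbox\nxerox\npear\n"

def count_whole_buffer(pattern, text, flags=0):
    """Count every match in one global pass over the whole text."""
    return len(re.findall(pattern, text, flags))

def count_matching_lines(pattern, text):
    """Count lines with at least one match, testing each line on its own."""
    return sum(1 for line in text.splitlines() if re.search(pattern, line))

# Unanchored: the whole-buffer pass counts both x's in "xerox" (overcounting).
print(count_whole_buffer("x", content))     # 4
print(count_matching_lines("x", content))   # 3

# Anchored, no multiline flag: ^ and $ only see the ends of the whole text.
print(count_whole_buffer("^x", content))    # 0
print(count_whole_buffer("x$", content))    # 0
print(count_whole_buffer("^", content))     # 1

# Multiline flag, or per-line matching, restores the expected counts.
print(count_whole_buffer("^x", content, re.MULTILINE))  # 2
print(count_matching_lines("^x", content))  # 2
print(count_matching_lines("x$", content))  # 2
```

In this sketch the per-line version avoids both symptoms at once: it never counts a line twice, and the anchors apply to each line without needing a multiline flag.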

My apologies for not sharing this at the time I discovered it. I hope that hasn't wasted too much of anybody's time.

n1vux commented 7 years ago

ok, so we had two different bugs with same symptom (-c ^zero), one that could do this anywhere and one that is likely platform / line-ending specific. We may need to test the #574 patch on Win & Mac to see if it cured or caused the line ending problem.
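
For the line-ending half of that question, here is a minimal sketch of what a per-line count could look like on CRLF input. It is again in Python as a stand-in for Perl, and it assumes lines are split on "\n" only, so CRLF input leaves a trailing "\r" on each line. The sample data and the helper name are hypothetical, and this does not claim to describe what ack itself does with CRLF files.

```python
import re

# Hypothetical sample file contents with LF and with CRLF line endings.
lf_text = "class A {};\nint box\nclass B {};\n"
crlf_text = lf_text.replace("\n", "\r\n")

def count_lines(pattern, text, strip_cr=False):
    """Count lines containing a match after splitting on LF only."""
    count = 0
    for line in text.split("\n"):
        if strip_cr:
            line = line.rstrip("\r")  # drop the CR left over from CRLF
        if re.search(pattern, line):
            count += 1
    return count

# A leading anchor is unaffected by the trailing CR.
print(count_lines("^class", lf_text))    # 2
print(count_lines("^class", crlf_text))  # 2

# A trailing anchor is affected: "box\r" does not end in "x".
print(count_lines("x$", lf_text))                   # 1
print(count_lines("x$", crlf_text))                 # 0
print(count_lines("x$", crlf_text, strip_cr=True))  # 1
```

If this model is anywhere near right, a Win/Mac test of the patch would want to cover both `^` and `$` patterns, since only the trailing anchor is sensitive to a leftover "\r".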

petdance commented 4 years ago

This seems to be working OK on ack 3.

Smylers commented 4 years ago

For the record, it was also fixed in ack 2 by 17504aa3.