Closed CapnKernel closed 5 years ago
The man page specifies that nested directories (e.g. foo/bar) are not supported by --ignore-dir. I think the failure here is that ack didn't complain when you told it to use one, though!
I was thinking about implementation methods (sorry I don't do Perl). One way is to take advantage of the fact that for every filesystem entity, the tuple of (majordev, minordev, inode) is guaranteed to be unique. And looking up those three things is very cheap, stat gives them to you. So code could take the arguments to ignore-dir (whether they be relative or absolute), convert them into tuples, then store them in a hash. Then during traversal, for each entity, take the tuple and see if it's in the hash. Quick, foolproof.
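To make the idea above concrete, here is a minimal Python sketch (not Perl, and not how ack is actually implemented). The names `ignore_keys` and `walk_pruned` are hypothetical. It assumes a POSIX-like filesystem where `os.stat` reports a meaningful `st_dev` (which already combines the major and minor device numbers) and `st_ino`.

```python
import os


def ignore_keys(paths):
    """Map each existing ignore path (relative or absolute) to (st_dev, st_ino)."""
    keys = set()
    for p in paths:
        try:
            st = os.stat(p)
        except OSError:
            continue  # assumption: ignore paths that do not exist are skipped
        keys.add((st.st_dev, st.st_ino))
    return keys


def walk_pruned(root, ignored):
    """Yield file paths under root, never descending into an ignored directory."""
    for dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for d in dirnames:
            try:
                st = os.stat(os.path.join(dirpath, d))
            except OSError:
                continue  # entry vanished or is unreadable; do not descend
            if (st.st_dev, st.st_ino) not in ignored:
                kept.append(d)
        dirnames[:] = kept  # editing the list in place prunes os.walk's descent
        for f in filenames:
            yield os.path.join(dirpath, f)
```

Because the lookup key is the directory's identity rather than its spelling, `ignore_keys(["big"])` and `ignore_keys(["/abs/path/to/big"])` produce the same set when both name the same directory.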
Really really would like support for absolute paths, as I have a customer who is begging for ack but isn't savvy enough to keep constructing relative paths.
Seems I misunderstood what --ignore-dir is for. Seems it's for ignoring classes of dirs such as CVS, rather than a particular location.
Is there a way to tell ack to ignore a particular location?
For example, I have a subdir containing 1m+ files, I don't want ack looking in there. Another scenario: NFS automount, and/or SSH fuse filesystems off to machines on another continent. We don't want ack to crawl through the filesystems of every machine on the network or to go intercontinental...
No, ack does not recognize absolute pathnames.
Related: #330
I'm sure this has been rehashed, probably to death, but I just wanted to add my two cents...
(FTR, I'll say right up front that ack is one of my absolute favorite examples of "Look at Perl go!" I ALWAYS feel a nice little hit of adrenaline when my brain realizes, "OOH! This looks like a job for ack!" It feels so "5.20" even though it still runs on "5.8.8", WOW!)
I do, however, agree with @CapnKernel++ that the current --ignore-dir implementation violates the Principle of Least Astonishment. Speaking for myself, at least, I had expected it to behave like rsync's --exclude options. To wit:
prompt% rsync -a / remote:/backup/root/ --exclude=/dev
will skip [root-level] device files, but still transfer my $HOME/dev development projects directory. Similarly, I would have expected:
prompt% ack someText --ignore-dir=/man
to skip the manpages directly under my current directory, but still find every other "man", "woman", and "child" subdirectory elsewhere in the hierarchy.
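The expectation described above can be illustrated with a small Python sketch. This is a deliberately simplified model of rsync-style anchoring (no wildcards, no trailing-slash rules), not ack's or rsync's actual matching code, and `is_excluded` is a hypothetical name.

```python
def is_excluded(rel_dir, pattern):
    """Return True if rel_dir (relative to the search root) matches pattern.

    A leading "/" anchors the pattern at the search root; without it, the
    pattern is compared against the last path component only.
    """
    parts = rel_dir.strip("/").split("/")
    if pattern.startswith("/"):
        # Anchored: the whole relative path must equal the pattern.
        return parts == pattern.strip("/").split("/")
    # Unanchored: any directory with this name matches, at any depth.
    return parts[-1] == pattern
```

Under this model, `/man` matches only `man` directly under the starting directory, while a bare `man` would match `usr/share/man` as well, and neither matches `woman`.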
What did catch me off guard, however (and this is the actual impetus for me bothering to bring up the topic again) was that even though I have the following:
prompt% cat ./.ackrc
--ignore-directory=is:dist
this still fails:
prompt% ack ERROR_666 dist/staging/errorMessages/
But cd dist; ack ERROR_666 staging/errorMessages/ works fine!
My (unenlightened) assumption was that a directory explicitly listed on the command line would trump any --ignore-dir settings in my .ackrc file(s), just like it does for files:
However, ack always searches the files given on the command line, no matter what type. If you tell ack to search in a coredump, it will search in a coredump.
I was especially astonished that adding --noignore-dir=dist/staging/errorMessages didn't help... Yes, after spending some time poring over the documentation and searching Google (which brought me here) I totally understand why, but I know not everybody has the patience to do the same. «sigh»
Although the code below is brittle, and most definitely a hack (it doesn't handle trailing arguments or multiple directories, and it strips off the directory prefix, etc.) the latter case made me whip up the following shell function:
ack() {
    perl -e '$ARGV[-1] = "." if -d $ARGV[-1] && chdir $ARGV[-1];
             exec "ack", @ARGV or die "ack: $!\n"' -- "$@"
}
And I can always command ack {...} if necessary.
PPS: Lastly, one other incorrect assumption I made was thinking that ack would look for an .ackrc file in EVERY subdirectory, merging in those settings accordingly, similar to .gitignore or rsync -F filter files.
Again, I admit that was unfounded, but nevertheless, it's the kind of feature I would expect to see in an all-singing, all-dancing recursive über search tool — and I can't imagine I'm the only one that thinks that way... I just haven't found the right ticket yet. (#273, perhaps?)
Anyway, like I said at the top, these are all very, very tiny nits compared to the 99.998% awesome-sauce that is ack... I'm just trying to help it go from an A to an A+!
Keep up the excellent work @petdance++ and @hoelzro++!! :-D
Yes please. Some way to exclude files based on paths, not just the last path element, would be a fantastic addition.
(Like e.g. the explicitly mentioned unsupported foo/bar from the man page)
I'm trying out ack 2.04. I have some subdirs that contain 1m+ files that I don't want ack to search. --ignore-dir works if I give it relative dirs, but not with absolute pathnames. For example, this finishes in 6 seconds:
ack --ignore-dir={userimage,skins,catalog,detailed_image,var,var220,images} -w db_num_rows
But this is still going after 20 minutes:
ack --ignore-dir=$PWD/{userimage,skins,catalog,detailed_image,var,var220,images} -w db_num_rows
It would seem that ack isn't smart enough to match up an absolute pathname to a relative pathname.
The reason I want to use absolute pathnames is that I may be somewhere "nearby", filesystem-wise, but I still want ack to ignore these dirs if a search would go into there. And I want to put these ignorable dirs into the .ackrc file.
(It could also be me, in which case accept my apologies, but if so, this does seem to violate the Principle of Least Surprise)
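For illustration only, matching an absolute ignore path against the relative paths seen during a traversal could work by resolving both sides to a canonical absolute form before comparing. The Python sketch below is an assumption about how such a feature might behave, not a description of ack; `make_ignore_check` and its `cwd` parameter are hypothetical.

```python
import os


def make_ignore_check(ignore_paths, cwd=None):
    """Build a predicate that treats relative and absolute spellings alike."""
    base = cwd or os.getcwd()
    # os.path.join leaves absolute paths untouched and anchors relative ones
    # at base; realpath then resolves symlinks, "." and ".." components.
    resolved = {os.path.realpath(os.path.join(base, p)) for p in ignore_paths}

    def is_ignored(candidate):
        return os.path.realpath(os.path.join(base, candidate)) in resolved

    return is_ignored
```

With a check built from `$PWD/var`, both `var` and `./var/` seen during traversal would resolve to the same canonical path and be skipped, which is the behavior the `$PWD/{...}` invocation above was hoping for.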