ggreer / the_silver_searcher

A code-searching tool similar to ack, but faster.
http://geoff.greer.fm/ag/
Apache License 2.0
26.21k stars 1.43k forks source link

Doesn't respect "match-beginning" gitignore patterns in subdirs #285

Open purcell opened 11 years ago

purcell commented 11 years ago

In .gitignore files, ignore patterns can be prefixed with a "/". man gitignore says:

A leading slash matches the beginning of the pathname. For example, "/*.c" matches "cat-file.c"
but not "mozilla-sha1/sha1.c".

ag currently doesn't apply such patterns when they match files in subdirectories. Steps to reproduce:

% mkdir ag-test                                                                                            % cd ag-test                                                                                               % mkdir ignored
% echo 'hello' > ignored/a
% echo 'hello' > b
% ag hello
b
1:hello

ignored/a
1:hello
% echo '/ignored/*' > .gitignore  ## should ignore "a", but doesn't
% ag hello
b
1:hello

ignored/a
1:hello
% echo 'ignored/*' > .gitignore
% ag hello
b
1:hello

However, the non-subdir case currently works correctly:

% echo '/b' > .gitignore
% ag hello
ignored/a
1:hello

Other than this minor hiccup, it's awesomeness all the way. :-)

mcphail commented 11 years ago

What filesystem are you using to test this? Is this affected by a similar problem to issue #275?

purcell commented 11 years ago

This is on OS X, so HFS+.

While initially attempting to create "steps to reproduce" as above, I certainly noticed something strange, whereby the presence of one file seemed to affect the "searchability" of another, but I unfortunately neglected to note my actions or the exact details. That behaviour does sound related to #275.

mcphail commented 11 years ago

Can you try pull request #284 to see if it helps at all?

purcell commented 11 years ago

Yep, I can hopefully give that a try later today.

purcell commented 11 years ago

Bah, having a non-trivial time getting this to compile here on my Mac. Do you happen to know the magic configure incantation to include the homebrew libs & headers?

mcphail commented 11 years ago

I don't have OSX so I'm not sure I can help. I'd have thought pkg-config would have taken the strain. Have you installed the dependencies from the build section of the README.md and run ./build.sh? Beyond that, ./configure can accept variables like ./configure LDFLAGS='-Lwhatever_lib_dir' LIBS='-lwhatever_lib' but really pkg-config should be doing that for you automatically.

purcell commented 11 years ago

Thanks. No, pkg-config isn't making everything work magically, even though it's the one from homebrew, and all the deps are installed there. I'll have a further tinker with it in the morning: I'd like to be able to give some helpful feedback.

purcell commented 11 years ago

Okay, got it built -- sorry for the noise. So PR #284 doesn't fix this specific problem: the ignored/a file doesn't get ignored. :-(

mcphail commented 11 years ago

Hmm. That's odd. #284 works on Linux on both hfsplus and ntfs filesystems here.

I'm sure this patch is getting close to the nub of the problem. I can't quite get my head around the recursion and appropriate manipulation of scandir_baton->level and ig->parent to do the right thing. I'll keep playing with files until it breaks again so I can debug it further.

mcphail commented 11 years ago

Can you try the current version of #284 and see if it fixes your problem?

purcell commented 11 years ago

I updated to the latest version of the PR (7f4f4d168) and nope, it still doesn't ignore ignored/a in the scenario above.

mcphail commented 11 years ago

Back to the drawing board! To be honest, I'm struggling here. It is working fine on Linux with various filesystems and I don't think I'm going to be able to take it further without running the debugger on a Mac. I wonder if the problem is even more esoteric. Is there a difference in pcre? Perhaps some difference in the Linux and Mac implementations in HFS+? (Or, more likely, my code just stinks...!)

purcell commented 11 years ago

I'll dig into the code myself when I get a chance and see if I can figure things out. Would be nice to have a simple shell-scripted test suite for the various exclusion patterns etc.

mcphail commented 11 years ago

Does the ignore still fail if you remove the asterisks from the .gitignore patterns?

purcell commented 11 years ago

You know, scratch that -- your fix works. I was picking up the wrong ag in my environment.

purcell commented 11 years ago

Sorry.

purcell commented 11 years ago

Now all of the following patterns correctly ignore ignored/a:

/ignored/*
/ignored/
ignored/
ignored/*
/ignored
ignored
a
/a
mcphail commented 11 years ago

Ha! Great.

nebirhos commented 11 years ago

Hi folks! This fix has been published somewhere?

mcphail commented 11 years ago

You can use the code from my pull request #284 but it hasn't been merged into the master branch by @ggreer yet.

nebirhos commented 11 years ago

great, thanks!

raine commented 10 years ago

I assume this hasn't been fixed yet in any released version?

Got slightly confused when ag started acting weird after making a change to gitignore.

More specifically, replaced dir/ with /dir which makes git ignore the path from root only, but ag doesn't see it that way.

deepfire commented 10 years ago

Ok, I just submitted a duplicate bug for this..

purcell commented 10 years ago

Hey @ggreer -- would you mind reviewing #284 at some point please, which is a fix for this issue and the duplicate issue @deepfire just filed & closed? If @mcphail would need to make any further changes in order for the PR to be acceptable, I'm sure he'd be happy to do so. :-)

ctrueden commented 9 years ago

I also filed a duplicate of this issue a while back (#595). FWIW, I wrote a patch to the cram tests that illustrates the problem, and could be handy in verifying a future fix: https://github.com/ctrueden/the_silver_searcher/commit/4f1399723ecb6db069af455b2f37ee38b7e3ddab

cartolari commented 9 years ago

I can also confirm the issue. And another problem that I think it's still related to the .gitignore file is a case where you ignore a entire folder but some files as exemplified by the following hypothetical .gitigore:

ignored/*
!ignored/not_ignored

If I search for some content the file ignored/not_ignored will not be included in the files for the search.

bdesham commented 9 years ago

This fix has still not been merged in, correct?

ctrueden commented 9 years ago

@bdesham I don't think anyone has actually submitted a patch that fixes this... have they?

bdesham commented 9 years ago

@ctrueden After reading through this thread again, I guess you’re right. PR #284 fixed this but @mcphail closed it because it went so long without being merged :neutral_face:

mcphail commented 9 years ago

I'm afraid there have been layers of breakage added to the ignore code since I made that patch, so there is no way I could keep it rebased against upstream. You can use my fork if you want to try the code. It isn't perfect, but I'm sure some enterprising soul could fix the edge cases where it breaks.

mbaer3000 commented 8 years ago

Can confirm the issue. It's a really painful having ag go through dist assets -- which are commonly exempted from version control via "/dist" in the .gitignore file.