BurntSushi / ripgrep

ripgrep recursively searches directories for a regex pattern while respecting your gitignore
The Unlicense

Searching filenames (`-g` option in `ag`) #91

Closed Valloric closed 7 years ago

Valloric commented 7 years ago

Ag supports using -g to recursively search filenames matching the provided pattern. Ripgrep doesn't have this feature (to the best of my knowledge; I've read the --help output three times looking for it); it's the only ag feature holding me back from fully switching to ripgrep. :)

Since ripgrep already has something unrelated mapped to -g, the option name should be something else. The incompatibility with ag will be unfortunate, but survivable.

BurntSushi commented 7 years ago

I'm pretty sure this has been reported at least several times now. :-) rg has --files for listing files and -g for glob matching.

Valloric commented 7 years ago

I'm pretty sure this has been reported at least several times now. :-)

Sorry about that! It might be indicative of a documentation "bug"; it seems people are having trouble locating this feature.

Valloric commented 7 years ago

Ah, I get it now. It's a combination of two flags: --files lists all the files that would have been searched and -g can be used to only search files matching a pattern.

This is very non-obvious. It requires users to fully understand two separate, unrelated options and to realize that when combined, they can be used to solve their problem.

Since ripgrep will have lots of users coming from ag who will expect a simple single-flag solution, it might be a good idea to have a single flag (much like ag) for this use case, if for no other reason than to save you the headache of closing all these issue reports. :)

BurntSushi commented 7 years ago

I'd like to solve this with better documentation. It's clear this is an important feature to many, but it's decidedly subservient to ripgrep's primary focus, so I'd like to avoid adding extra flags for it.

gabrielmagno commented 7 years ago

That would be a great feature. But it is important to notice the differences.

Let's say I have this directory tree:

.
└── path
    └── to
        ├── file
        │   ├── pattern (file)
        │   └── pattern.txt (file)
        ├── pattern
        │   └── fileA.txt (file)
        └── patterns
            └── fileB.txt (file)

$ ag -g "pattern"
path/to/pattern/fileA.txt
path/to/patterns/fileB.txt
path/to/file/pattern
path/to/file/pattern.txt

$ rg --files -g "pattern"
path/to/file/pattern

So, ideally, if such a feature is implemented in ripgrep to reproduce what The Silver Searcher does, it should check the whole path (not only the name) and also match any part of the string (not an exact match, not a glob).

BurntSushi commented 7 years ago

Remember that -g gives you the ability to use a glob, so if you want to match any component of the path, then **/pattern/** will work. (Well, to match pattern.txt and path/to/patterns, you will actually need **/pattern*/**.)
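
A minimal sketch of the glob approach described above, using the directory tree from the earlier comment. The helper name rg_path_prefix is hypothetical (it is not part of ripgrep), and the second glob is an assumption added so that files named pattern or pattern.txt are listed too:

```shell
# Hypothetical helper (not part of ripgrep): list files that either sit inside a
# directory whose name starts with NAME, or whose own name starts with NAME.
rg_path_prefix() {
    name=$1
    shift
    # First glob: anything below a directory named NAME*.
    # Second glob: any file named NAME*, such as "pattern" or "pattern.txt".
    rg --files -g "**/${name}*/**" -g "${name}*" "$@"
}

# Usage: rg_path_prefix pattern .
```

Note that this is still glob matching, so it only approximates the substring match that ag -g performs.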

nateozem commented 7 years ago

In case somebody wouldn't know the syntax for doing a search with the condition of excluding files with a certain name using the command line, here is an example to do so:

$ rg "from_str" -g "\!tags" -g "repos/*"
                    ^^^

As shown, you may need to escape ! with \ if you wish to exclude files from the search.

BurntSushi commented 7 years ago

Or just use single quotes.
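
For illustration, a small sketch of the single-quote form. The helper name rg_skip_tags is made up for this example; the -g '!tags' glob is the one from the comment above:

```shell
# Hypothetical helper: run a search while skipping every file named "tags".
# The single quotes stop the shell from treating "!" specially, so no
# backslash is needed.
rg_skip_tags() {
    rg -g '!tags' "$@"
}

# Usage: rg_skip_tags from_str .
```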


h5rdly commented 7 years ago

Are there search parameters for rg that would list any files that match pattern, but any folder that matches pattern will only appear by itself, aka not relisted with everything inside it?

Say, for 'python' I'd like to get - c:/python27/ c:/python27/python.exe c:/python27/pythonw.exe c:/pdfs/pythonconf.pdf .[more such stuff] .

Is that possible directly, without parsing the output?

BurntSushi commented 7 years ago

No. ripgrep will never print directories, only file paths. ripgrep isn't a find replacement.

Note that there's enough of libripgrep done that you could write your own CLI tool in Rust fairly easily to do this. I would be happy to mentor that project.

blueyed commented 7 years ago

Note that --files also includes binary files (is this a bug?) The help says:

Print each file that would be searched without actually performing the search.

So to replicate ag -g . I am using rg --files-with-matches . now. A difference here is that rg does not include empty files like ag does, even when using '' or '.?' for the pattern.

BurntSushi commented 7 years ago

Note that --files also includes binary files (is this a bug?)

No. Whether a file is "binary" isn't a fact one can know without actually searching the file contents.

So to replicate ag -g . I am using rg --files-with-matches .

Those aren't the same. ag -g . is like rg --files. rg --files-with-matches is like ag --files-with-matches.

A difference here is that rg does not include empty files like ag does, even when using '' or '.?' for the pattern.

Whether a file is empty or not has no impact on the output of --files. In an empty directory:

$ touch hi
$ rg --files
hi

blueyed commented 7 years ago

@BurntSushi Thanks for your reply. I got confused because I had a global .agignore file that excluded *.png, and thought that ag -g . would exclude binary files therefore already.

Whether a file is empty or not has no impact on the output of --files

I was referring to --files-with-matches here, which I think is better to exclude binary files (I am using rg here to list files for ctags). I think it's good to not include empty files here, but still wonder if you could have a pattern that is always true, also for empty files (e.g. ^ or .?)?

BurntSushi commented 7 years ago

I was referring to --files-with-matches here, which I think is better to exclude binary files (I am using rg here to list files for ctags).

It does:

$ echo -e 'z\x00' > test
$ rg --files-with-matches z
$ rg --files-with-matches -a z
test

Please note that "binary" detection cannot possibly be perfect. It uses a simple heuristic: if there's a NUL byte, then the file is considered binary. This is what GNU grep does. (ripgrep does have a bug here, see #52.)

I think it's good to not include empty files here, but still wonder if you could have a pattern that is always true, also for empty files (e.g. ^ or .?)?

ripgrep is a line oriented searcher. If a file doesn't contain any lines, then there's nothing to match. What is your use case?
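
A rough sketch of the difference between the two listing modes discussed here. The helper name count_listed is hypothetical, and the '^' pattern is just one choice of pattern that matches any line:

```shell
# Hypothetical helper: report how many files each listing mode prints for DIR.
count_listed() {
    dir=$1
    # --files lists every file that would be searched, empty or not.
    all=$(rg --files "$dir" | wc -l | tr -d ' ')
    # --files-with-matches needs at least one line to match, so empty files
    # drop out even with a pattern like '^'.
    nonempty=$(rg --files-with-matches '^' "$dir" | wc -l | tr -d ' ')
    echo "all=$all nonempty=$nonempty"
}

# Usage: count_listed .
```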

In the future, could you please open new issues for new bug reports/feature requests?

blueyed commented 7 years ago

I was referring to --files-with-matches here, which I think is better to exclude binary files (I am using rg here to list files for ctags).

It does:

And that is good.

ripgrep is a line oriented searcher. If a file doesn't contain any lines, then there's nothing to match.

I see, but somehow thought that ^ should be always true.

But grep -l '^' * behaves the same, it makes sense, and it is a good way to skip empty files (using ., which would work even when ^ would stand for "anything").

In the future, could you please open new issues for new bug reports/feature requests?

I get you, but it is still about mimicking ag -g (and ag -g . in particular), isn't it?

What is your use case?

(I am using rg here to list files for ctags).

I was using ag -g . | ctags --links=no -L- to generate tags for my dotfiles (a globally included tags file).

I've replaced it now with rg --files-with-matches --no-ignore-vcs . | ctags --links=no -L-.

BurntSushi commented 7 years ago

I get you, but it is still about mimicking ag -g (and ag -g . in particular), isn't it?

Definitely not. I think there is a lot of value in having a similar interface as other tools, but I've never added something to ripgrep just for the sake of mimicking another tool. :-)

In the future, please open new issues to discuss new features or bugs. This particular issue is closed and done with. If there's a problem, we should start a new discussion.

I've replaced it now with rg --files-with-matches --no-ignore-vcs . | ctags --links=no -L-.

Why does this require printing empty files? Surely you won't be able to jump to any symbol defined in an empty file. ;-)

blueyed commented 7 years ago

@BurntSushi I do not want the empty files - it's a nice side effect of the mimicking / workaround though.

about mimicking ag -g (and ag -g . in particular), isn't it?

Definitely not. I think there is a lot of value in having a similar interface as other tools, but I've never added something to ripgrep just for the sake of mimicking another tool. :-)

Sure, but then this issue is still the place to come to when you want to simulate it - that's what I've meant.

All is fine from my side.. :) Thanks!

BurntSushi commented 7 years ago

@blueyed I see, OK. I had a really hard time understanding what you were trying to say! Glad all is well.

blueyed commented 7 years ago

@BurntSushi Thanks for a faster and bug-free (with regard to the .gitignore handling) replacement for ag!

Valloric commented 7 years ago

Seems like I misunderstood what -g did originally; it's for a glob match, not a regex match.

Given that, I can't say that rg --files -g 'pattern' is adequate. What I'd like to see as a user is a regex match on the filenames.

mcandre commented 6 years ago

Is there no combination of flags for ripgrep allowing users to constrain both the filename and the content simultaneously? I'd like to search for Makefiles with "\$\$" in them, for example. Currently working around this with find . -name Makefile -exec grep "\\$\\$" {} \;

BurntSushi commented 6 years ago

@mcandre What have you tried? If rg -F '$$' searches too much, then restrict it using exactly the techniques outlined above: rg -F '$$' -g Makefile.

mcandre commented 6 years ago

@BurntSushi Thanks for the tip, works like a charm!

I wonder why the ripgrep CLI parsing doesn't treat

rg -g Makefile '$$'

as

rg -g Makefile -F '$$'

Oh well, at least the latter works on my machine^TM.

okdana commented 6 years ago

Because $ is a regex meta-character used to match the end of the line; -F escapes all meta-characters so that they're treated literally, which is what you wanted in this case.

Might be worth reviewing something like http://www.regular-expressions.info/quickstart.html

mcandre commented 6 years ago

@okdana Lol I totally forgot about that! Yeah, -F '$$' works on my machine, and I suppose '\$\$' would work as well.
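
To make the combination concrete, here is a small sketch that constrains both the filename and the content. The helper name makefiles_with_literal is hypothetical; the flags are the ones mentioned in this exchange, plus -e to pass the pattern explicitly:

```shell
# Hypothetical helper: list Makefiles that contain the given text literally.
makefiles_with_literal() {
    literal=$1
    shift
    # -F treats the text as a fixed string, so "$" is not an end-of-line anchor.
    # -g Makefile restricts the search to files named Makefile.
    rg --files-with-matches -F -g Makefile -e "$literal" "$@"
}

# Usage: makefiles_with_literal '$$' .
# The regex form without -F would be: rg -l '\$\$' -g Makefile .
```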

rosshadden commented 6 years ago

@BurntSushi I apologize if you find these kinds of issues annoying, but clearly based on your comments in some of them you understand that a lot of us want this behavior. Your focus up to this point has been on showing that ripgrep already has the behavior, but I think you should heavily consider focusing on making it as easy as it is in ag, pt, and ack.

Using -g in all of those tools is not just a gimmick, it's incredibly useful. In fact I use it just about as much as I do the main functionality. A lot of projects like fzf, unite.vim, and denite.vim even officially recommend using this feature for getting lists of files, as -g '' is a really convenient and fast git-agnostic way to do git ls-files. In fact doing a search for " -g" on fzf's readme shows five occurrences of using ag -g in different ways.

To be clear I'm not proposing you make it -g. But please consider adding a flag that does the equivalent. It will shut all of us up and make us feel much better about jumping into using and loving ripgrep. Thanks!

BurntSushi commented 6 years ago

@rosshadden What are you actually asking for? The -g flag's behavior is already set and it isn't changing. Are you asking for a new flag that combines -g and --files? If so, just define an alias?

alias rgg="rg --files"

rosshadden commented 6 years ago

Yes I was asking for a flag that combines them. I do have an alias for this (called rgg even, haha), but feel like it should be an official flag shortcut. It just felt like an elephant in the room sort of thing so I brought it up. A lot of us end up here or in the other duplicate issues for this exact reason, and while yes having it documented helps, I still think you should consider making it easier and more comfortable for people.

Anyway that's all I'll say about it, and I'll leave you alone now. Thanks again for the project. I never thought I'd move off ack until I found ag. Never thought I'd move off that until I found pt, and I never thought I'd move off that until I found rg. Heh.

tupton commented 6 years ago

Well, to match pattern.txt and path/to/patterns, you will actually need **/pattern*/**.

I had to add an extra -g "pattern.*" to match file names that end in pattern.txt or pattern.js or whatever – that is, "pattern" + an extension. **/pattern*/** matches file names like "patternfoo.txt" but not "pattern.txt".

I too would like to see an option to match a regex on filenames (including path.)

sadid commented 6 years ago

Based on my experience ag -g 'pattern' isn't equivalent to rg --files -g 'pattern' when I call ag to find files with specific pattern in the filename/path. As I tinker, find . -type f | rg 'pattern' is somehow equivalent.

okdana commented 6 years ago

They should be basically equivalent, but you might see different output depending on the exact pattern (rg's glob syntax is more robust, and more similar to git's, than ag's) as well as any ignore patterns you might have. You can try adding -u or --no-ignore-vcs, for example, if ag shows files that rg doesn't.

sadid commented 6 years ago

@okdana, Here is my test case: a directory structure (without any git) called test:

test
  \_a
      \_aa
          \_target.md
      \_test.md
  \_b
      \_target.md
  \_c

Then these two commands are not equivalent: ag -g 'target' and rg --files -g 'target', with or without -u or --no-ignore-vcs. ag finds target.md and prints it, but rg doesn't.

At the moment I still use rg for other use cases, but for this use case I can use fd, a new find replacement.

okdana commented 6 years ago

Ah, that's because ag actually uses PCRE to match file paths, and it matches them against the entire path, too. For some reason I'd thought it was using its git-ish glob matching for that.

The rg equivalent to ag -g target, then, is more or less rg --files -g '*target*'.
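
A minimal sketch of that equivalent, wrapped in a hypothetical helper named rg_name_contains. It matches on the file name only, so a directory whose name contains the text does not pull in its contents:

```shell
# Hypothetical helper: list files whose own name contains NAME.
rg_name_contains() {
    name=$1
    shift
    # The surrounding "*" turn the exact-name glob into a "contains" match.
    rg --files -g "*${name}*" "$@"
}

# Usage: rg_name_contains target test
```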

mahiki commented 5 years ago

I also was attempting to move from ag to rg.

The first thing I tried was a PCRE regex match of file paths, which I have now learned is insufficient. Yes, find is a great way to do that, but ag and rg respect ignore files, binaries, etc.

I truly enjoy finding files by name using regex patterns and not globs; in ag it's simple:

ag -g 'pattern1|foo\d{4}' ~/Documents

Is there no way such a feature could be implemented?

BurntSushi commented 5 years ago

@mahiki It already exists. rg --files ~/Documents | rg 'pattern1|foo\d{4}'. Otherwise, no, the glob flags will remain glob flags. If you want a tool that specializes in listing files and can match them with regexes, then you might consider fd. Also, your regex is not specific to PCRE.
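
The pipeline above can be wrapped in a small function if it is used often. This is only a sketch, and the name rg_files_regex is hypothetical:

```shell
# Hypothetical helper: list files (respecting ignore rules) whose path
# matches a regex, by piping the file list through a second rg.
rg_files_regex() {
    regex=$1
    shift
    # The first rg prints candidate paths; the second filters them by regex.
    rg --files "$@" | rg -e "$regex"
}

# Usage: rg_files_regex 'pattern1|foo\d{4}' ~/Documents
```

Because the second rg sees the whole path, the regex can match directory names as well as file names.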

borekb commented 4 years ago

Thanks for the mention of fd! I cloned a repository somewhere on my disk and wanted to abuse ripgrep to find it; fd is the right tool for that as I need to list only directories.

WhyNotHugo commented 4 years ago

I'm trying to achieve what's been discussed on this thread, but it's not working for me, I'm not sure if I've misunderstood something, or if something's broken. Say I want to find files named vendor:

$ rg --files . -g vendor
$ rg --files vendor
vendor: No such file or directory (os error 2)
$ find . -iname vendor
./idf/core/static/www/js/vendor
./idf/core/static/www/css/vendor
./enviatufoto/static/www/js/vendor

What am I doing wrong? With ag, this is basically ag -g vendor, but I can't figure out how it works with rg.

BurntSushi commented 4 years ago

@WhyNotHugo The glob is applied to every path ripgrep traverses. Is vendor a directory? If so, vendor on its own won't match anything inside of vendor. Use rg --files -g '**/vendor/**' instead.
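
As a sketch of that suggestion, with a hypothetical helper name (rg_inside_dir) that is not part of ripgrep:

```shell
# Hypothetical helper: list every file that lives somewhere below a
# directory named DIR_NAME, at any depth.
rg_inside_dir() {
    dir_name=$1
    shift
    # "**/" matches any leading directories; "/**" matches everything below.
    rg --files -g "**/${dir_name}/**" "$@"
}

# Usage: rg_inside_dir vendor .
```

The directories themselves are still not printed, only the files inside them.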

lamyergeier commented 4 years ago

@BurntSushi --files -g doesn't respect the contents of .gitignore

rg --files --type md -g "*Python*" "${PWD}"

BurntSushi commented 4 years ago

Please file a new and complete bug report. Your comment isn't actionable.

lamyergeier commented 4 years ago

I solved this as follows:

To search Python in filenames:

Search=Python
rg --files "${PWD}" | rg --regexp "${Search}[^/]*$" | sort | nl

jjjchens235 commented 3 years ago

Piggybacking off of @anishmittal2020: this function matches on both file names and directory names, though directories themselves cannot be returned, because the --files flag lists files only and excludes directory paths.

function f() {
    #find any files or directories that match arg
    rg --files "${PWD}" | rg --regexp ".*/.*$1.*" #| sort | nl
}

Put in .bashrc, and then call it on the command line. If searching for filename or directory names that contain the word 'vendor': f vendor

Taking @WhyNotHugo example:

./idf/core/static/www/js/vendor/bar.py
./idf/vendor.py
./idf/core/static/www/js/vendor/

The first two paths will be returned: the first because its path contains a directory called vendor, and the second because its filename contains vendor. The last path is a directory, so it will not be returned.

Edit: I hate to admit it, but I should have just used fd from the get-go. It allows for regex arguments, and the user can specify whether to search for files, directories, or both.

function f() {
    #list of flags: https://github.com/sharkdp/fd#command-line-options
    #optional flags should be passed in first, then PATTERN and PATH
    fd -H -a -p  "$@"
}

AtomicNess123 commented 3 years ago

I have read the whole thread. I have not successfully managed to just find all filenames containing "helm". When I do:

rga ~/.emacs.d --files -g "*helm*"

It just finds filenames with a detached "helm" (i.e., "xxx-helm-xxx.el"), but not "xxxhelmxxx.el". How do I specify this? Thanks!

BurntSushi commented 3 years ago

@AtomicNess123 Works just fine for me:

$ touch xxx-helm-xxx.el xxxhelmxxx.el
$ l
total 0
-rw-rw-r-- 1 andrew users 0 Feb  7 11:40 xxx-helm-xxx.el
-rw-rw-r-- 1 andrew users 0 Feb  7 11:40 xxxhelmxxx.el
$ rg --files -g '*helm*'
xxx-helm-xxx.el
xxxhelmxxx.el

AtomicNess123 commented 3 years ago

Thanks. Actually, it does work in your example. But not when I do:

$ rg --files -g '*helm*' ~/.emacs.d

In this case, when I specify the folder to search within, it only finds the "xxx-helm-xxx.el" instances, and not the "xxxhelmxxx.el" ones.

BurntSushi commented 3 years ago

Also works just fine:

$ mkdir .emacs.d
$ touch .emacs.d/xxx-helm-xxx.el .emacs.d/xxxhelmxxx.el
$ rg --files -g '*helm*' .emacs.d/
.emacs.d/xxx-helm-xxx.el
.emacs.d/xxxhelmxxx.el

I can't think of a reason why it isn't working for you. Sorry. To be honest, it doesn't make sense to me why *helm* would match xxx-helm-xxx but not xxxhelmxxx. I suspect there is something else going awry. Please consider making a smaller reproduction.

AtomicNess123 commented 3 years ago

It won't work like this on my system.

rg --version  
ripgrep 12.1.1

What is your version?

BurntSushi commented 3 years ago

Same, but there aren't any recent changes that would impact this AFAIK. Please consider trying my exact set of commands in a fresh directory. Try running with the --debug flag. Paste its output and all the exact commands you're running.

AtomicNess123 commented 3 years ago

Other question: how many levels (subdirectories) will the command search?

BurntSushi commented 3 years ago

All...
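
If unlimited depth is not wanted, ripgrep's --max-depth flag can cap the traversal. A small sketch, with a hypothetical helper name (rg_files_shallow):

```shell
# Hypothetical helper: list files no more than DEPTH levels below the
# given paths. A depth of 1 means direct children only.
rg_files_shallow() {
    depth=$1
    shift
    rg --files --max-depth "$depth" "$@"
}

# Usage: rg_files_shallow 1 -g '*helm*' ~/.emacs.d
```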