Open alok opened 6 years ago
Is there really a strong use case for this?
We already have the --exclude <pattern> option, which lets you specify a glob pattern that should be excluded. While not exactly the same, I'm not sure there is a need for an option that would invert the pattern.
or is there some simple way to specify it in the pattern regex?
Not really. Negative lookaheads can be abused for that but (a) that's not really practical and (b) I don't think they are supported by the regex crate.
A workaround would be fd | grep -v pattern.
Personally, I needed it to remove all files in a directory that did not fit a given criterion. For instance, deleting all files not starting with the prefix "keep":
fd '^(?!keep)' --exec rm {}
Even if fancy regexes aren't supported, an --invert option would probably help in most use cases:
fd 'keep' --invert --exec rm {}
@SicariusNoctis --exclude would work for your case as well:
fd -E 'keep*' -X rm
@sharkdp Why not add an --exclude-regex?
I too would like an --exclude-regex. I am extremely familiar with regex, but not at all with glob patterns. And since fd does not support negative lookahead, it's impossible to do that kind of thing in a regex pattern.
Ok, reopening for now.
I just ran into the issue of a lack of exclude patterns myself. I'm working on a Blender addon and I need to exclude just one __init__.py file from the root folder. I have to use something like: fdfind --type file | rg -v '^\./__init__.py' because the glob pattern excludes all __init__.py files from all folders.
@razcore-rad I'm pretty sure you could use --exclude=/__init__.py for that. (Note that --exclude actually uses the same syntax as .gitignore.)
I see. That's handy. I looked at the man page but it didn't mention the specific syntax. I did try --exclude './__init__.py', but that didn't help. /__init__.py works for my use case, thanks.
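To make the difference between the two exclude globs concrete, here is a small sketch. It assumes fd is on PATH (on Debian/Ubuntu the binary may be called fdfind); the function name demo_exclude_anchor and the temporary directory layout are made up for illustration. Depending on the fd version, the printed paths may carry a leading ./ prefix.

```shell
# Hypothetical demo: compare an unanchored and an anchored --exclude glob.
# Assumes fd is installed; the directory layout below is invented for the demo.
demo_exclude_anchor() {
    tmp=$(mktemp -d) || return 1
    mkdir -p "$tmp/pkg"
    touch "$tmp/__init__.py" "$tmp/pkg/__init__.py" "$tmp/pkg/mod.py"
    (
        cd "$tmp" || exit 1
        # Unanchored: excludes __init__.py in every directory.
        echo "unanchored:"
        fd --type file --exclude '__init__.py' | sort
        # Anchored with a leading slash: excludes only the one in the search root.
        echo "anchored:"
        fd --type file --exclude '/__init__.py' | sort
    )
    rm -rf "$tmp"
}
```

Calling demo_exclude_anchor should list only pkg/mod.py under "unanchored:", and both pkg/__init__.py and pkg/mod.py under "anchored:", matching what was reported above.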
It uses the same syntax as .gitignore
ripgrep supports this as well within --glob:
Precede a glob with a ! to exclude it.
@M1cha fd already has an --exclude flag that uses globs. This issue is for excluding using a regex pattern rather than a glob pattern.
Using globbing instead of regex for --exclude is enough for my use case, so I am perfectly OK with this as is. However, I do find it a bit confusing, and am not the only one (see #1264). Would it make sense to change the behaviour of --exclude to use regex by default, and glob patterns with the --glob option? That would make --exclude consistent with the normal behaviour of fd.
This would be breaking behaviour though, so perhaps that's not acceptable.
I don't think we could do that. Not only is it a breaking change, but it would break things in a subtle way where some existing usages would work, some wouldn't work at all, and others would work sometimes.
Also, fwiw, the current --exclude option is designed to be consistent with entries in an ignore file.
I'm writing a script to calculate the total playtime of all videos in a directory recursively using fd. The files are named with serial numbers preceding them, for example:
./some_dir/1) filename.mp4
./some_dir/2) filename.mp4
When I want to list the files with serial numbers from 1 to 4, I use fd '^[1-4]\)', but fd | grep '^[1-4])' does not work in this scenario because the full path is passed from fd to grep, and therefore ^[1-4]) matches against the beginning of the full path and not the filename.
However, when I need an inverted match, fd currently doesn't support this, and grep fails because it matches against the full path. I can't use basename as in fd --exec basename {} \; | grep -v '^[1-4])' because I need the full path for another command, ffprobe.
@aqdasak For your use case you could use
$ fd | grep -E '(^|/)[1-4]\)'
$ fd | grep -Ev '(^|/)[1-4]\)'
@tavianator The command fd | grep -E '(^|/)[1-4]\)' also matches "1) some_dir/" and all its contents, which is not the intended outcome. The goal is to match only the filename and not its parent directories.
Currently, I'm using the following method (in fish shell):
for i in (fd)
    basename $i | grep -i '^[1-4])' >/dev/null && echo $i
end
and
for i in (fd)
    basename $i | grep -iv '^[1-4])' >/dev/null && echo $i
end
Oh right, then something like fd | grep -E '(^|/)[1-4]\)[^/]*$'
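The same idea can be wrapped in two small helpers so the pattern only has to describe the start of the file name. This is a sketch, not an fd feature: the function names match_basename and invert_basename are made up here, and it assumes one path per line (no newlines inside file names) and a pattern that contains no "/".

```shell
# Keep only paths whose final component starts with the given extended regex.
# $1: regex fragment for the start of the file name, e.g. '[1-4]\)'
match_basename() {
    grep -E "(^|/)${1}[^/]*\$"
}

# Inverse: drop paths whose final component starts with the given regex.
invert_basename() {
    grep -Ev "(^|/)${1}[^/]*\$"
}
```

Usage would look like fd --type file | invert_basename '[1-4]\)'. The trailing [^/]*$ forces the match to sit in the last path component, so a parent directory named "1) some_dir" no longer drags its contents into the result. Something like fd --type file | invert_basename 'keep' covers the "everything not starting with keep" case without lookaheads, though the piping drawbacks still apply.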
Using a second program will negate advantages such as coloring.
For me, this is a problem I run into frequently enough to have found this thread: I have a handful of nested subdirectories, all of which are relatively small except for one that is extremely large. I try to exclude the large directory by following the same rules as the default search syntax: smart-case regex. After it doesn't work I have to Ctrl-C to cancel the vomit of unwanted matches, remember that it's a case-sensitive glob instead, read the name of the directory I want to exclude more carefully, match the case, and wrap my pattern in asterisks. All of which takes multiple attempts because I forget either the case sensitivity or the asterisks.
Piping to another program has two disadvantages: as NightMachinery said, you lose the coloring, and it also dramatically increases the search time because fd still has to find and print everything you want to exclude before the second program can filter it out.
I actually came here hoping there was an option to change the behavior of --exclude from a case-sensitive glob to a smart-case regex (it would be so nice for consistency). Now to avoid the breaking change of changing its default behavior across the board, what about an environment variable or flag that the user can manually choose to set in their alias (or if the config file ever becomes a thing) that would change the default --exclude behavior from case-sensitive glob to smart-case regex.
Now to avoid the breaking change of changing its default behavior across the board, what about an environment variable or flag that the user can manually choose to set in their alias (or if the config file ever becomes a thing) that would change the default --exclude behavior from case-sensitive glob to smart-case regex.
Unfortunately, that can break programmatic usages of fd. See discussion on https://github.com/sharkdp/fd/issues/362
Ya, I see it now for env variables but an alias flag should at least be fine :/.
Some projects have different commands for the same tool that provide extra or differing functionality. Broot has 'br' and zoxide has 'z', but the original commands can still be used. They also change behavior and aren't just a shorter invocation of the same thing. So an idea could be to somehow provide an entirely different command that would let users deviate more drastically from the default behavior of fd without touching the original command. It could give a more obvious indication that fd is being called interactively rather than programmatically. That command could be more volatile and follow users' individual preferences, while the original functionality of the fd command stays consistent and intact. Because this concern about not messing with the original programmatic functionality seems to pop up across different requests, maybe it could help address this in various areas.
At the very least, I do hope a standalone new flag can be added. If it's a dumb idea just ignore me. I just wanted to leave the idea out there in the unlikely case some ideas can be gained from it (maybe with enough ideas something will be appealing).
Those two examples use shell integrations, which sounds very much against this project's ideal of a simple, consistent result across OSes, and the project probably doesn't want to add a bunch of unique shell integrations for various platforms. So the exact implementation doesn't have to be the same; I'm speaking more to the concept as a whole.
Edit: ehh, this basically sounds exactly like your (@tmccombs) suggestion already from the end of https://github.com/sharkdp/fd/issues/362#issuecomment-2081310265 and it got downvoted :(.
I noticed grep has a -L flag to find filenames that don't contain the search pattern. What about the related operation of finding the complement of a pattern? Would a flag for that be useful, or is there some simple way to specify it in the pattern regex?