Closed mcint closed 9 months ago
I've rewritten this chained query in an extendable way (though it suffers in keystroke cost):
<<< ~/.local/lib/python3.10/ xargs fd brotab -td | xargs fd api -e py
Would a word-match flag PR be welcome? Something like grep's -w/--word-match. I understand that ^[pattern]$ can match a full component, but I would like a syntax that can be added on to a query. For my small example (which doesn't strongly justify this request):
-w equivalent of start & end regex anchors:
$ <<< ~/.local/lib/python3.10/ xargs fd ^brotab$ -td -1
/home/mcint/.local/lib/python3.10/site-packages/brotab/
Where other searches, with early termination (-1), don't always yield what I want:
$ <<< ~/.local/lib/python3.10/ xargs fd brotab -td -1
/home/mcint/.local/lib/python3.10/site-packages/brotab-1.4.2.dist-info/
from this small set:
$ <<< ~/.local/lib/python3.10/ xargs fd brotab -td
/home/mcint/.local/lib/python3.10/site-packages/brotab/
/home/mcint/.local/lib/python3.10/site-packages/brotab-1.4.2.dist-info/
For example, for quickly viewing Python packages, today I find myself searching for
fd brotab -td ~/.local/lib/python3.10/ -X fd api -e py
since api is a common file and package name-component, and I'm just looking for the one today.
fd ... -X fd ... is not something that should be recommended. The main problem is it can drastically explode the result set:
$ fd foo
foo
foo/foo
foo/foo/foo
$ fd foo -X echo fd bar # To see what would be executed
fd bar ./foo ./foo/foo ./foo/foo/foo
$ fd foo -X fd bar # What would actually happen
./foo/bar
./foo/foo/bar
./foo/foo/foo/bar
./foo/foo/bar
./foo/foo/foo/bar
./foo/foo/foo/bar
You could pass --prune to the first fd to avoid this, but still I don't think we should recommend this pattern at all.
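To make the blow-up concrete without needing fd installed, here is a small sketch that uses POSIX find as a stand-in. Treating find's -exec as a rough analogue of fd's -X, and find's -prune as a rough analogue of fd's --prune, is an assumption made for illustration; the two tools differ in batching and output format. The sketch rebuilds the foo/foo/foo tree from above with a bar file at each level and counts the results of each approach:

```shell
# Stand-in demo using POSIX find instead of fd.
# Assumption: -exec roughly mirrors fd -X, and -prune roughly mirrors fd --prune.
tmp=$(mktemp -d)
mkdir -p "$tmp/foo/foo/foo"
touch "$tmp/foo/bar" "$tmp/foo/foo/bar" "$tmp/foo/foo/foo/bar"

# One search over the whole tree.
single=$(cd "$tmp" && find . -type f -name bar | wc -l | tr -d ' ')

# Chained: every directory named foo becomes the root of a second search,
# so files under nested foo directories are reported more than once.
chained=$(cd "$tmp" && find . -type d -name foo \
  -exec find {} -type f -name bar \; | wc -l | tr -d ' ')

# Chained with pruning: stop descending once a foo directory has matched,
# so only the outermost foo is handed to the second search.
pruned=$(cd "$tmp" && find . -type d -name foo -prune \
  -exec find {} -type f -name bar \; | wc -l | tr -d ' ')

echo "single=$single chained=$chained pruned=$pruned"
rm -rf "$tmp"
```

On this tree the single search and the pruned chain each report 3 results, while the unpruned chain reports 6, matching the six lines of output shown above.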
For your case, it's probably best to do all the filtering in the same fd command:
$ fd -tf --full-path 'brotab.*api.*\.py$' ~/.local/lib/python3.10/
Would a word-match flag PR be welcome? Something like grep's -w/--word-match. I understand that ^[pattern]$ can match a full component, but I would like a syntax that can be added on to a query.
You can write word boundaries in the regex like this:
$ fd '\bpattern\b'
I don't think we'd add a flag to do this for you, fd already has too many flags :)
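One caveat for the brotab example: under the usual regex definition, a word boundary sits between a word character (letter, digit, underscore) and anything else, so the hyphen in brotab-1.4.2.dist-info counts as a boundary too. A quick sketch, with grep standing in for fd so that it runs without fd installed (an assumption for illustration; fd uses a different regex engine), compares word matching against whole-name matching on the two directory names from the example:

```shell
# grep is used as a stand-in here; this does not exercise fd itself.
names='brotab
brotab-1.4.2.dist-info'

# -w: match brotab as a whole word anywhere in the line.
word_hits=$(printf '%s\n' "$names" | grep -c -w 'brotab')

# -x: the whole line must be exactly brotab, like ^brotab$.
exact_hits=$(printf '%s\n' "$names" | grep -c -x 'brotab')

echo "word=$word_hits exact=$exact_hits"
```

Word matching accepts both names (word=2), while anchoring to the whole name accepts only brotab (exact=1), so for this particular case the anchored form is the one that narrows the result.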
Hm, thank you, interesting suggestions.
I will consider --prune in my workflows, might try -P for that locally, and PR. Thank you!
It looks like, in practice, I can use -g/--glob, #692 (in place of my -w suggestion, https://github.com/sharkdp/fd/issues/1450#issuecomment-1852782497).
Sounds like no objections to submitting other use examples for the readme or docs, might PR later.
I've chewed on variations where I can keep appending [pattern] or [depth] [pattern] for a while.
To build the motivation a bit more, I query things like this:
fd -d3 -td [pkg] / |
  xargs fd -d3 -td [lib] |
  xargs fd -d3 -tf . -e ini
Compressed to: fd-chain -d3 [pkg] / -- -d3 [lib] -- -d3 -e ini .
Here are some real snippets of recent history, or for tasks I perform commonly:
fd -d4 ^php / -td | grep -ve -
fd -d4 ^php / -td | grep -ve - | xargs fd ini
fd -d4 ^php / -td | grep -ve - | xargs fd fpm
sudo apt install fzf
fd completion / -d4
fd completion / -d4 -X fd fzf -d4
fd completion / -d4 -td -X fd fzf -d4
fd fzf / -d4 -td -X fd completion -d4
. /usr/share/doc/fzf/examples/completion.bash
less /usr/share/doc/fzf/examples/completion.bash
. /usr/share/doc/fzf/examples/key-bindings.bash
Although, these examples each only use 2 steps.
fd ... -X fd ... is not something that should be recommended. The main problem is it can drastically explode the result set:
Thank you for a considered response, and I agree that blindly performing nested queries might blow up traversals & time required and results size. However, I must insist, full-path matching seems ill-advised, file systems have a really high branching factor, and searching them quickly and effortlessly (few keystrokes, forgiving argument order, concatenative/append-only use supported) is what makes fd such a delight to use. Full path matching makes this searching much more expensive. For argument's sake, model number of files as exponential in depth, 10^[D] files are present in D levels of fs tree. I've used fd on systems where -d4 returns in acceptable time, and -d5 takes a full minute or more. Chaining queries is quite useful, to limit the haystack size.
From painful experience, I can report that searching chained from partial matches helps a lot on low-resource systems.
Nested matching names are not entirely contrived, but requerying with a more limited depth, or now glob matching, is what I'll try.
Fiddling with the shell cursor to modify queries is also frustrating in practice.
Thank you for your work maintaining -- answering random usage questions, and considering design space around the tool!
Nit about full-path matching
fd ... -X fd ... is not something that should be recommended. The main problem is it can drastically explode the result set:
Thank you for a considered response, and I agree that blindly performing nested queries might blow up traversals & time required and results size. However, I must insist, full-path matching seems ill-advised, file systems have a really high branching factor, and searching them quickly and effortlessly (few keystrokes, forgiving argument order, concatenative/append-only use supported) is what makes fd such a delight to use.
One thing that may help concatenative use is --search-path and --and, e.g.
$ fd --full-path --search-path ~/.local/lib/python3.10/ /brotab/ --and api -e py
Full path matching makes this searching much more expensive.
Does it? I see how it could, but I expect I/O and syscall overhead to dominate pattern matching. Let's check:
tavianator@tachyon $ hyperfine -w2 "fd -u brotab ~" "fd -u --full-path brotab ~"
Benchmark 1: fd -u brotab ~
Time (mean ± σ): 1.151 s ± 0.014 s [User: 18.505 s, System: 33.398 s]
Range (min … max): 1.134 s … 1.180 s 10 runs
Benchmark 2: fd -u --full-path brotab ~
Time (mean ± σ): 1.151 s ± 0.008 s [User: 20.426 s, System: 32.466 s]
Range (min … max): 1.142 s … 1.164 s 10 runs
Summary
fd -u --full-path brotab ~ ran
1.00 ± 0.01 times faster than fd -u brotab ~
And here's a more representative benchmark for your use case. I changed it up because I don't have any copies of brotab lying around.
tavianator@tachyon $ hyperfine "fd -u --search-path ~ --full-path /requests/ --and api -e py" "fd -u -td --prune --search-path ~ requests -X fd -u api -e py"
Benchmark 1: fd -u --search-path ~ --full-path /requests/ --and api -e py
Time (mean ± σ): 1.126 s ± 0.014 s [User: 14.427 s, System: 37.160 s]
Range (min … max): 1.110 s … 1.149 s 10 runs
Benchmark 2: fd -u -td --prune --search-path ~ requests -X fd -u api -e py
Time (mean ± σ): 1.156 s ± 0.012 s [User: 16.962 s, System: 35.575 s]
Range (min … max): 1.139 s … 1.181 s 10 runs
Summary
fd -u --search-path ~ --full-path /requests/ --and api -e py ran
1.03 ± 0.02 times faster than fd -u -td --prune --search-path ~ requests -X fd -u api -e py
Both queries return the same set of 110 files.
For argument's sake, model number of files as exponential in depth, 10^[D] files are present in D levels of fs tree. I've used fd on systems where -d4 returns in acceptable time, and -d5 takes a full minute or more. Chaining queries is quite useful, to limit the haystack size.
First off, you may be interested in #28 and possibly https://github.com/tavianator/bfs :)
Secondly, the total work is roughly the same for both approaches anyway. With one fd command, it has to explore the whole tree. With --prune ... -X fd ..., the parent fd explores the whole tree except under the brotab directories, and the child fd(s) explore just the brotab subtrees. In both cases, each path is examined by exactly one fd process. You just have more total processes with -X fd.
(Without --prune, -X fd does a lot more total work, because the parent fd is also searching the brotab trees along with the children.)
From painful experience, I can report that searching chained from partial matches helps a lot on low-resource systems.
I'm kind of surprised that -X fd chaining would ever be beneficial without --prune. I believe you, I'm just struggling to think of why that would happen.
Fiddling with the shell cursor to modify queries is also frustrating in practice.
True. One handy thing is most shells support Emacs-style keybindings for line editing, e.g. C-a (Ctrl+A) for beginning-of-line, C-e for end-of-line, M-b (Alt+B) to jump back a word, M-f to jump forward a word, etc. Often Ctrl+←/→ will work too. You can also use vi-style keybindings instead with set -o vi.
Thank you for your work maintaining -- answering random usage questions, and considering design space around the tool!
You're welcome! :)
Using fd version: fd 8.7.0
I find myself using fd in a chained manner.
-X-chained invocation style.
Chained-style
For example, for quickly viewing Python packages, today I find myself searching for
fd brotab -td ~/.local/lib/python3.10/ -X fd api -e py
since api is a common file and package name-component, and I'm just looking for the one today. I find myself wanting to run commands on the result, or chain a third (or more) times. I'm looking to document that pattern for other users of fd.