Genivia / ugrep

NEW ugrep 6.5: a more powerful, ultra fast, user-friendly, compatible grep. Includes a TUI, Google-like Boolean search with AND/OR/NOT, fuzzy search, hexdumps, searches (nested) archives (zip, 7z, tar, pax, cpio), compressed files (gz, Z, bz2, lzma, xz, lz4, zstd, brotli), pdfs, docs, and more
https://ugrep.com
BSD 3-Clause "New" or "Revised" License

Faster recursive searching #297

Closed genivia-inc closed 12 months ago

genivia-inc commented 12 months ago

dtruss shows a slowdown of recursive searching on MacOS Monterey Intel compared to, e.g., Catalina, on which some of the older ugrep benchmarks are based, so it wasn't noticed before. The slowdown may also be present on other operating systems. The cause is a slow(er) fcntl(..., O_NONBLOCK) call that is executed for each file in recursive searches to prevent hanging on special files such as those in /proc and /sys. These don't exist on Windows or MacOS, so there is no reason to execute this logic there in the first place. The logic can also be optimized to avoid most of this overhead on Linux systems.

With this change, the recursive search speedup is about 10% to 20%.
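
To make the idea concrete, here is a minimal sketch of skipping the per-file fcntl work whenever it is not needed. It is not ugrep's actual code (ugrep is written in C++); it only illustrates the system-call pattern described above. The names open_for_search, is_regular and NEEDS_GUARD are hypothetical, and the sketch assumes the file type is already known from directory traversal, so no extra stat call is spent on the check. It is POSIX-only because it uses Python's fcntl module.

```python
import fcntl
import os
import sys

# Assumption: only Linux exposes /proc and /sys, so only Linux needs the guard.
NEEDS_GUARD = sys.platform.startswith("linux")


def open_for_search(path, is_regular):
    """Open `path` read-only and return (fd, number_of_fcntl_calls).

    `is_regular` is the file type already learned during directory traversal
    (hypothetical parameter, e.g. taken from os.scandir's DirEntry.is_file()).
    """
    if not NEEDS_GUARD or is_regular:
        # Fast path: regular files do not need the non-blocking guard,
        # so no fcntl call is made at all.
        return os.open(path, os.O_RDONLY), 0

    # Slow path for special files: open without blocking, then clear
    # O_NONBLOCK again so that later reads behave normally.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~os.O_NONBLOCK)
    return fd, 2
```

In a source tree nearly every entry is a regular file, so under these assumptions almost all opens take the fast path and the fcntl cost is paid only for the rare special file.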

This issue is directly related to #193

genivia-inc commented 12 months ago

On MacOS M1 machines the speedup is sometimes as much as 2x when combined with a more balanced choice of worker pool size.

It handsomely beats the performance of other grep tools on MacOS M1 (10 core), as the updated (preview) benchmark results show:

results for OpenSSL source code repo directory search

grepping FIXME|TODO elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.02    0.02      0.02        0.02          0.03        0.02
rg      0.06    0.05      0.05        0.06          0.04        0.05
ag      0.05    0.05      0.04        0.04          0.04        0.04
ggrep   0.10    0.12      0.17        0.17          0.15        0.15

grepping char|int|long|size_t|void elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.04    0.03      0.04        0.04          0.04        0.03
rg      0.05    0.04      0.05        0.10          0.05        0.04
ag      0.32    0.23      0.23        0.18          0.05        0.05
ggrep   0.16    0.21      0.31        0.43          0.28        0.13

grepping ssl-?3(\.[0-9]+)? elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.02    0.02      0.02        0.02          0.03        0.02
rg      0.06    0.06      0.06        0.05          0.05        0.05
ag      0.04    0.04      0.04        0.04          0.04        0.04
ggrep   0.09    0.10      0.11        0.11          0.09        0.09

results for Swift source code repo directory search

grepping _(RUN|LIB|NAM)[A-Z_]+ elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.12    0.12      0.12        0.12          0.16        0.11
rg      0.24    0.28      0.24        0.31          0.28        0.23
ag      0.25    0.22      0.23        0.26          0.24        0.22
ggrep   0.42    0.51      0.56        0.56          0.50        0.48

grepping String|Int|Double|Array|Dictionary elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.16    0.14      0.18        0.18          0.18        0.13
rg      0.24    0.26      0.27        0.28          0.23        0.26
ag      0.85    0.49      0.69        0.57          0.27        0.24
ggrep   0.54    0.66      1.54        1.78          1.46        0.58

grepping (class|struct)\sS[a-z]+T elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.12    0.12      0.12        0.12          0.16        0.11
rg      0.24    0.25      0.22        0.26          0.27        0.24
ag      0.23    0.22      0.23        0.23          0.22        0.23
ggrep   0.56    0.63      0.81        0.81          0.75        0.71

grepping for\s[a-z]+\sin elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.13    0.12      0.12        0.12          0.16        0.11
rg      0.28    0.31      0.31        0.28          0.26        0.26
ag      0.31    0.27      0.28        0.26          0.26        0.22
ggrep   0.52    0.52      0.70        0.70          0.64        0.60

genivia-inc commented 12 months ago

On a MacOS Monterey Intel x64 8-core machine, the speedup is less dramatic, but the results are still better for the Swift repo below:

results for OpenSSL source code repo directory search

grepping FIXME|TODO elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.04    0.04      0.04        0.04          0.05        0.04
rg      0.03    0.04      0.04        0.05          0.04        0.04
ag      0.05    0.05      0.05        0.06          0.05        0.06
ggrep   0.16    0.17      0.24        0.23          0.23        0.22

grepping char|int|long|size_t|void elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.06    0.05      0.05        0.05          0.05        0.04
rg      0.05    0.06      0.07        0.14          0.07        0.06
ag      0.51    0.35      0.35        0.24          0.08        0.08
ggrep   0.26    0.33      0.50        0.70          0.46        0.20

grepping ssl-?3(\.[0-9]+)? elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.04    0.04      0.04        0.04          0.05        0.04
rg      0.04    0.04      0.07        0.08          0.07        0.07
ag      0.06    0.05      0.05        0.05          0.06        0.05
ggrep   0.13    0.14      0.16        0.17          0.15        0.15

results for Swift source code repo directory search

grepping _(RUN|LIB|NAM)[A-Z_]+ elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.17    0.17      0.17        0.17          0.20        0.16
rg      0.17    0.18      0.19        0.19          0.19        0.19
ag      0.35    0.37      0.33        0.35          0.35        0.36
ggrep   0.67    0.80      0.88        0.87          0.86        0.80

grepping String|Int|Double|Array|Dictionary elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.22    0.20      0.23        0.23          0.23        0.17
rg      0.20    0.21      0.27        0.35          0.26        0.21
ag      1.37    0.76      1.01        0.79          0.41        0.44
ggrep   0.89    1.06      2.58        2.91          2.40        1.00

grepping (class|struct)\sS[a-z]+T elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.18    0.18      0.18        0.19          0.22        0.17
rg      0.18    0.27      0.36        0.37          0.36        0.36
ag      0.35    0.34      0.36        0.37          0.34        0.37
ggrep   0.86    0.93      1.27        1.25          1.20        1.11

grepping for\s[a-z]+\sin elapsed real time (s)

search  -n -nr  -wn -wnr  -win -winr  -wino -winor  -wic -wicr  -wil -wilr
ugrep   0.20    0.18      0.18        0.18          0.22        0.17
rg      0.20    0.18      0.29        0.28          0.29        0.27
ag      0.45    0.40      0.37        0.40          0.34        0.35
ggrep   0.78    0.81      1.07        1.09          1.06        0.94

genivia-inc commented 12 months ago

Ugrep v4.2 is released.