Closed genivia-inc closed 12 months ago
On MacOS M1 machines this speedup is sometimes as much as 2x when combined with a more balanced choice of worker pool size.
It handsomely beats the performance of other grep tools on MacOS M1 (10 core), as the updated (preview) benchmark results show:
grepping FIXME|TODO
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.02 | 0.02 | 0.02 | 0.02 | 0.03 | 0.02 | ||||||
rg | 0.06 | 0.05 | 0.05 | 0.06 | 0.04 | 0.05 | ||||||
ag | 0.05 | 0.05 | 0.04 | 0.04 | 0.04 | 0.04 | ||||||
ggrep | 0.10 | 0.12 | 0.17 | 0.17 | 0.15 | 0.15 |
grepping char|int|long|size_t|void
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.04 | 0.03 | 0.04 | 0.04 | 0.04 | 0.03 | ||||||
rg | 0.05 | 0.04 | 0.05 | 0.10 | 0.05 | 0.04 | ||||||
ag | 0.32 | 0.23 | 0.23 | 0.18 | 0.05 | 0.05 | ||||||
ggrep | 0.16 | 0.21 | 0.31 | 0.43 | 0.28 | 0.13 |
grepping ssl-?3(\.[0-9]+)?
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.02 | 0.02 | 0.02 | 0.02 | 0.03 | 0.02 | ||||||
rg | 0.06 | 0.06 | 0.06 | 0.05 | 0.05 | 0.05 | ||||||
ag | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | ||||||
ggrep | 0.09 | 0.10 | 0.11 | 0.11 | 0.09 | 0.09 |
grepping _(RUN|LIB|NAM)[A-Z_]+
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.12 | 0.12 | 0.12 | 0.12 | 0.16 | 0.11 | ||||||
rg | 0.24 | 0.28 | 0.24 | 0.31 | 0.28 | 0.23 | ||||||
ag | 0.25 | 0.22 | 0.23 | 0.26 | 0.24 | 0.22 | ||||||
ggrep | 0.42 | 0.51 | 0.56 | 0.56 | 0.50 | 0.48 |
grepping String|Int|Double|Array|Dictionary
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.16 | 0.14 | 0.18 | 0.18 | 0.18 | 0.13 | ||||||
rg | 0.24 | 0.26 | 0.27 | 0.28 | 0.23 | 0.26 | ||||||
ag | 0.85 | 0.49 | 0.69 | 0.57 | 0.27 | 0.24 | ||||||
ggrep | 0.54 | 0.66 | 1.54 | 1.78 | 1.46 | 0.58 |
grepping (class|struct)\sS[a-z]+T
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.12 | 0.12 | 0.12 | 0.12 | 0.16 | 0.11 | ||||||
rg | 0.24 | 0.25 | 0.22 | 0.26 | 0.27 | 0.24 | ||||||
ag | 0.23 | 0.22 | 0.23 | 0.23 | 0.22 | 0.23 | ||||||
ggrep | 0.56 | 0.63 | 0.81 | 0.81 | 0.75 | 0.71 |
grepping for\s[a-z]+\sin
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.13 | 0.12 | 0.12 | 0.12 | 0.16 | 0.11 | ||||||
rg | 0.28 | 0.31 | 0.31 | 0.28 | 0.26 | 0.26 | ||||||
ag | 0.31 | 0.27 | 0.28 | 0.26 | 0.26 | 0.22 | ||||||
ggrep | 0.52 | 0.52 | 0.70 | 0.70 | 0.64 | 0.60 |
On a MacOS Monterey Intel x64 8-core machine the speedup is less dramatic, but the results are still better for the Swift repo below:
grepping FIXME|TODO
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.04 | 0.04 | 0.04 | 0.04 | 0.05 | 0.04 | ||||||
rg | 0.03 | 0.04 | 0.04 | 0.05 | 0.04 | 0.04 | ||||||
ag | 0.05 | 0.05 | 0.05 | 0.06 | 0.05 | 0.06 | ||||||
ggrep | 0.16 | 0.17 | 0.24 | 0.23 | 0.23 | 0.22 |
grepping char|int|long|size_t|void
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.06 | 0.05 | 0.05 | 0.05 | 0.05 | 0.04 | ||||||
rg | 0.05 | 0.06 | 0.07 | 0.14 | 0.07 | 0.06 | ||||||
ag | 0.51 | 0.35 | 0.35 | 0.24 | 0.08 | 0.08 | ||||||
ggrep | 0.26 | 0.33 | 0.50 | 0.70 | 0.46 | 0.20 |
grepping ssl-?3(\.[0-9]+)?
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.04 | 0.04 | 0.04 | 0.04 | 0.05 | 0.04 | ||||||
rg | 0.04 | 0.04 | 0.07 | 0.08 | 0.07 | 0.07 | ||||||
ag | 0.06 | 0.05 | 0.05 | 0.05 | 0.06 | 0.05 | ||||||
ggrep | 0.13 | 0.14 | 0.16 | 0.17 | 0.15 | 0.15 |
grepping _(RUN|LIB|NAM)[A-Z_]+
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.17 | 0.17 | 0.17 | 0.17 | 0.20 | 0.16 | ||||||
rg | 0.17 | 0.18 | 0.19 | 0.19 | 0.19 | 0.19 | ||||||
ag | 0.35 | 0.37 | 0.33 | 0.35 | 0.35 | 0.36 | ||||||
ggrep | 0.67 | 0.80 | 0.88 | 0.87 | 0.86 | 0.80 |
grepping String|Int|Double|Array|Dictionary
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.22 | 0.20 | 0.23 | 0.23 | 0.23 | 0.17 | ||||||
rg | 0.20 | 0.21 | 0.27 | 0.35 | 0.26 | 0.21 | ||||||
ag | 1.37 | 0.76 | 1.01 | 0.79 | 0.41 | 0.44 | ||||||
ggrep | 0.89 | 1.06 | 2.58 | 2.91 | 2.40 | 1.00 |
grepping (class|struct)\sS[a-z]+T
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.18 | 0.18 | 0.18 | 0.19 | 0.22 | 0.17 | ||||||
rg | 0.18 | 0.27 | 0.36 | 0.37 | 0.36 | 0.36 | ||||||
ag | 0.35 | 0.34 | 0.36 | 0.37 | 0.34 | 0.37 | ||||||
ggrep | 0.86 | 0.93 | 1.27 | 1.25 | 1.20 | 1.11 |
grepping for\s[a-z]+\sin
elapsed real time (s)
search | -n | -nr | -wn | -wnr | -win | -winr | -wino | -winor | -wic | -wicr | -wil | -wilr |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ugrep | 0.20 | 0.18 | 0.18 | 0.18 | 0.22 | 0.17 | ||||||
rg | 0.20 | 0.18 | 0.29 | 0.28 | 0.29 | 0.27 | ||||||
ag | 0.45 | 0.40 | 0.37 | 0.40 | 0.34 | 0.35 | ||||||
ggrep | 0.78 | 0.81 | 1.07 | 1.09 | 1.06 | 0.94 |
Ugrep v4.2 is released.
`dtruss` shows some slowdown of recursive searching on MacOS Monterey Intel compared to e.g. Catalina, on which some of the older ugrep benchmarks are based, so it wasn't noticed before. The slowdown may also be present on some other OSes. The reason is a slow(er) `fcntl(..., O_NONBLOCK)` that is executed for each file in recursive searches to prevent hanging on special files like `/proc` and `/sys`. These don't exist on Windows or MacOS, so there is no reason to execute this logic there in the first place. The logic can also be optimized to avoid most of this overhead on Linux systems.

With this change, the recursive search speedup is about 10% to 20%.
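To illustrate the idea of skipping the per-file `fcntl` call where it cannot matter, here is a minimal sketch. It is not ugrep's actual code: the names `needs_nonblocking_guard` and `open_for_search` are hypothetical, and the path-prefix test is an assumed stand-in for whatever check a real implementation would use (it ignores symlinks, relative paths, and unusual mount points).

```cpp
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical predicate (an assumption for illustration): decide whether a
// path needs the extra non-blocking guard. Outside Linux the answer is always
// no, since /proc and /sys are not expected to exist there.
bool needs_nonblocking_guard(const std::string& path)
{
#if defined(__linux__)
  // Naive prefix test: only absolute paths under /proc/ or /sys/ qualify.
  return path.compare(0, 6, "/proc/") == 0 || path.compare(0, 5, "/sys/") == 0;
#else
  (void)path; // unused on platforms without /proc and /sys
  return false;
#endif
}

// Hypothetical helper: open a file for reading and issue the extra
// fcntl(F_SETFL, ... | O_NONBLOCK) call only when the predicate asks for it.
// Returns the file descriptor, or -1 if the file cannot be opened.
int open_for_search(const std::string& path)
{
  int fd = open(path.c_str(), O_RDONLY);
  if (fd < 0)
    return -1;

  if (needs_nonblocking_guard(path))
  {
    // Only special files pay for the two additional system calls.
    int flags = fcntl(fd, F_GETFL);
    if (flags >= 0)
      fcntl(fd, F_SETFL, flags | O_NONBLOCK);
  }

  return fd;
}
```

A caller would use `open_for_search(pathname)` in place of a bare `open` and `close` the descriptor when done. The point of the sketch is only that ordinary source files take the fast path with a single `open` call, while the guard is reserved for the few paths that might need it.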
This issue is directly related to #193