alexevanczuk / packs

A pure Rust implementation of packwerk, a gradual modularization tool for Ruby
MIT License

`pks --no-cache check` can be faster than `pks check` #217

Open oleg-vinted opened 2 weeks ago

oleg-vinted commented 2 weeks ago

At least on my machine, disabling the cache makes `pks check` run faster (wall-clock time):

$ hyperfine --warmup 5 'pks check' 'pks --no-cache check'
Benchmark 1: pks check
  Time (mean ± σ):      1.542 s ±  0.141 s    [User: 0.936 s, System: 10.009 s]
  Range (min … max):    1.331 s …  1.704 s    10 runs

Benchmark 2: pks --no-cache check
  Time (mean ± σ):      1.058 s ±  0.036 s    [User: 3.781 s, System: 3.744 s]
  Range (min … max):    1.018 s …  1.134 s    10 runs

Summary
  pks --no-cache check ran
    1.46 ± 0.14 times faster than pks check
$ find tmp/cache/packwerk | wc -l
   17403
$ du -sh tmp/cache/packwerk
 72M    tmp/cache/packwerk

macOS 14.7 on Apple M1 Pro 10C

alexevanczuk commented 2 weeks ago

Huh, that's very interesting. For me:

$ hyperfine --warmup 5 'pks check' 'pks --no-cache check'

Benchmark 1: pks check
  Time (mean ± σ):     264.0 ms ±  20.4 ms    [User: 756.0 ms, System: 790.2 ms]
  Range (min … max):   240.0 ms … 297.5 ms    10 runs

Benchmark 2: pks --no-cache check
  Time (mean ± σ):     510.0 ms ±  20.7 ms    [User: 3715.4 ms, System: 481.7 ms]
  Range (min … max):   483.8 ms … 555.9 ms    10 runs

Summary
  pks check ran
    1.93 ± 0.17 times faster than pks --no-cache check

macOS 15.0 on Apple M2 Max

I wonder why this is 🤔

oleg-vinted commented 1 week ago

We are not running pks against the same codebase, but what definitely stands out is the System time: on my machine it is 12.7x higher with the cache (10.009 s vs. 790.2 ms) and 7.8x higher without it (3.744 s vs. 481.7 ms). User times are comparable.
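For reference, the ratios can be recomputed from the two hyperfine outputs above. This is only a quick arithmetic sketch: the numbers are copied from the benchmarks in this thread, and the dictionary layout and the `ratios` helper are made up for illustration.

```python
# User/System CPU times in seconds, copied from the hyperfine runs above.
oleg = {  # macOS 14.7, Apple M1 Pro
    "cache": {"user": 0.936, "system": 10.009},
    "no_cache": {"user": 3.781, "system": 3.744},
}
alex = {  # macOS 15.0, Apple M2 Max
    "cache": {"user": 0.7560, "system": 0.7902},
    "no_cache": {"user": 3.7154, "system": 0.4817},
}


def ratios(a, b):
    """Return a/b for every mode and metric, rounded to one decimal place."""
    return {
        mode: {metric: round(a[mode][metric] / b[mode][metric], 1) for metric in a[mode]}
        for mode in a
    }


result = ratios(oleg, alex)
for mode, r in result.items():
    print(f"{mode}: user x{r['user']}, system x{r['system']}")
```

The System ratios come out at 12.7 and 7.8, while the User ratios stay close to 1, which is what points at kernel-side overhead rather than at pks itself.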

So it turns out that file access uses a lot more kernel CPU time on my machine, probably because of some security software we run. 🤷

oleg-vinted commented 1 week ago

In an Ubuntu VM running on the same laptop:

$ hyperfine --warmup 5 'pks check' 'pks --no-cache check'
Benchmark 1: pks check
  Time (mean ± σ):     315.5 ms ±   5.7 ms    [User: 816.8 ms, System: 373.1 ms]
  Range (min … max):   306.8 ms … 324.4 ms    10 runs

Benchmark 2: pks --no-cache check
  Time (mean ± σ):     592.6 ms ±   6.3 ms    [User: 3170.7 ms, System: 285.6 ms]
  Range (min … max):   587.1 ms … 606.1 ms    10 runs

Summary
  pks check ran
    1.88 ± 0.04 times faster than pks --no-cache check

So yeah, it must be something about the macOS host that makes it slow.

alexevanczuk commented 1 week ago

The security software explanation makes sense to me, although I'm not sure what it would be doing differently between the two cases. With no cache, we'd be opening repo files to parse them. With cache, we're opening files in tmp/cache. Is it possible that file opens on codebase files are allowlisted in your software, but opens under tmp are not?
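One rough way to check this hypothesis would be to time plain file opens under each directory, outside of pks. The script below is a minimal sketch and not part of pks: `time_opens` is a hypothetical helper, the 5000-file limit is arbitrary, and it only approximates pks's access pattern (it opens and reads files sequentially from one thread).

```python
import os
import sys
import time


def time_opens(root, limit=5000):
    """Open and fully read up to `limit` files under `root`.

    Returns (files_read, wall_seconds, system_cpu_seconds) for this process.
    """
    count = 0
    cpu_before = os.times()
    wall_before = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if count >= limit:
                break
            try:
                with open(os.path.join(dirpath, name), "rb") as f:
                    f.read()
            except OSError:
                continue  # skip unreadable files and broken symlinks
            count += 1
        if count >= limit:
            break
    wall = time.perf_counter() - wall_before
    system = os.times().system - cpu_before.system
    return count, wall, system


if __name__ == "__main__":
    # Pass the directories to compare as arguments.
    for root in sys.argv[1:]:
        n, wall, system = time_opens(root)
        print(f"{root}: {n} files, wall {wall:.3f}s, system {system:.3f}s")
```

Running it with `tmp/cache/packwerk` and a source directory of the repo as arguments should show whether the per-file System time differs between the two. The two trees hold different numbers and sizes of files, so compare per-file figures, and run it a couple of times so the OS file cache is warm in both cases.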