Closed jsirois closed 10 months ago
Small:
$ hyperfine \
-w2 \
-p 'rm -rf ~/.pex' \
-n 'execute_parallel buildtime wheel chroot install' \
-n 'imap_parallel buildtime wheel chroot install' \
'pex --python python3.11 -D src -m main --no-pypi -f find-links cowsay==5.0 ansicolors==1.1.8 -o cowsay.ep.pex' \
'python3.11 -m pex -D src -m main --no-pypi -f find-links cowsay==5.0 ansicolors==1.1.8 -o cowsay.mp.pex'
Benchmark 1: execute_parallel buildtime wheel chroot install
Time (mean ± σ): 3.217 s ± 0.018 s [User: 3.087 s, System: 0.453 s]
Range (min … max): 3.183 s … 3.238 s 10 runs
Benchmark 2: imap_parallel buildtime wheel chroot install
Time (mean ± σ): 1.866 s ± 0.010 s [User: 1.653 s, System: 0.221 s]
Range (min … max): 1.848 s … 1.877 s 10 runs
Summary
imap_parallel buildtime wheel chroot install ran
1.72 ± 0.01 times faster than execute_parallel buildtime wheel chroot install
Medium (compare imap_parallel / execute_parallel pairs):
$ hyperfine \
-w2 \
-p 'rm -rf ~/.pex' \
-n 'raw .whl build' \
-n 'execute_parallel buildtime wheel chroot install' \
-n 'imap_parallel buildtime wheel chroot install' \
-n 'execute_parallel buildtime wheel chroot install --no-compress' \
-n 'imap_parallel buildtime wheel chroot install --no-compress' \
'python3.9 -m pex -c pants --no-pypi -f find-links --no-pre-install-wheels pantsbuild.pants==2.17.1 -o pants.whls.mp.pex' \
'pex --python python3.9 -c pants --no-pypi -f find-links pantsbuild.pants==2.17.1 -o pants.ep.pex' \
'python3.9 -m pex -c pants --no-pypi -f find-links pantsbuild.pants==2.17.1 -o pants.mp.pex' \
'pex --python python3.9 -c pants --no-pypi -f find-links pantsbuild.pants==2.17.1 --no-compress -o pants.ep.nc.pex' \
'python3.9 -m pex -c pants --no-pypi -f find-links pantsbuild.pants==2.17.1 --no-compress -o pants.mp.nc.pex'
Benchmark 1: raw .whl build
Time (mean ± σ): 1.680 s ± 0.011 s [User: 1.408 s, System: 0.214 s]
Range (min … max): 1.662 s … 1.693 s 10 runs
Benchmark 2: execute_parallel buildtime wheel chroot install
Time (mean ± σ): 9.265 s ± 0.048 s [User: 12.653 s, System: 0.948 s]
Range (min … max): 9.168 s … 9.339 s 10 runs
Benchmark 3: imap_parallel buildtime wheel chroot install
Time (mean ± σ): 7.117 s ± 0.032 s [User: 6.967 s, System: 0.551 s]
Range (min … max): 7.077 s … 7.183 s 10 runs
Benchmark 4: execute_parallel buildtime wheel chroot install --no-compress
Time (mean ± σ): 5.135 s ± 0.064 s [User: 8.411 s, System: 1.015 s]
Range (min … max): 5.071 s … 5.305 s 10 runs
Benchmark 5: imap_parallel buildtime wheel chroot install --no-compress
Time (mean ± σ): 3.067 s ± 0.017 s [User: 2.816 s, System: 0.629 s]
Range (min … max): 3.042 s … 3.097 s 10 runs
Summary
raw .whl build ran
1.82 ± 0.02 times faster than imap_parallel buildtime wheel chroot install --no-compress
3.06 ± 0.04 times faster than execute_parallel buildtime wheel chroot install --no-compress
4.24 ± 0.03 times faster than imap_parallel buildtime wheel chroot install
5.51 ± 0.05 times faster than execute_parallel buildtime wheel chroot install
$ du -sh pants*.pex | sort -n
52M pants.whls.mp.pex
53M pants.ep.pex
53M pants.mp.pex
239M pants.ep.nc.pex
239M pants.mp.nc.pex
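The hyperfine summary above ranks every run against the raw .whl build, so the imap_parallel / execute_parallel pair ratios are not printed directly. A small sketch that derives them from the reported means (the dictionary keys and the `speedup` helper are illustrative names, not part of Pex or hyperfine):

```python
# Mean wall-clock times in seconds, copied from the hyperfine output above.
means = {
    "execute_parallel": 9.265,
    "imap_parallel": 7.117,
    "execute_parallel --no-compress": 5.135,
    "imap_parallel --no-compress": 3.067,
}


def speedup(slow: float, fast: float) -> float:
    """Return how many times faster the `fast` run is than the `slow` run."""
    return slow / fast


compressed = speedup(means["execute_parallel"], means["imap_parallel"])
uncompressed = speedup(
    means["execute_parallel --no-compress"],
    means["imap_parallel --no-compress"],
)

print(f"compressed:    {compressed:.2f}x")    # 1.30x
print(f"--no-compress: {uncompressed:.2f}x")  # 1.67x
```

In absolute terms the saving is close to 2.1 s in both pairs (2.148 s compressed, 2.068 s uncompressed), which is at least consistent with a fixed per-item overhead being removed rather than a cost that scales with compression work.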
Small (imap_parallel is better, but parallelization is still a small loss):
$ pex \
--python python3.11 \
-D src -m main \
--no-pypi -f find-links cowsay==5.0 ansicolors==1.1.8 \
--no-pre-install-wheels -o cowsay.whls.ep.pex
$ python3.11 -m pex \
-D src -m main \
--no-pypi -f find-links cowsay==5.0 ansicolors==1.1.8 \
--no-pre-install-wheels -o cowsay.whls.mp.pex
$ hyperfine \
-w2 \
-p 'rm -rf ~/.pex' \
-n 'serial wheel chroot install' \
-n 'execute_parallel runtime wheel chroot install' \
-n 'imap_parallel runtime wheel chroot install' \
'./cowsay.whls.ep.pex' \
'PEX_MAX_INSTALL_JOBS=0 ./cowsay.whls.ep.pex' \
'PEX_MAX_INSTALL_JOBS=0 ./cowsay.whls.mp.pex'
Benchmark 1: serial wheel chroot install
Time (mean ± σ): 493.0 ms ± 3.4 ms [User: 449.3 ms, System: 43.5 ms]
Range (min … max): 488.7 ms … 498.1 ms 10 runs
Benchmark 2: execute_parallel runtime wheel chroot install
Time (mean ± σ): 574.4 ms ± 9.0 ms [User: 589.3 ms, System: 69.7 ms]
Range (min … max): 567.6 ms … 597.8 ms 10 runs
Benchmark 3: imap_parallel runtime wheel chroot install
Time (mean ± σ): 512.5 ms ± 3.0 ms [User: 538.3 ms, System: 57.9 ms]
Range (min … max): 508.9 ms … 518.0 ms 10 runs
Summary
serial wheel chroot install ran
1.04 ± 0.01 times faster than imap_parallel runtime wheel chroot install
1.17 ± 0.02 times faster than execute_parallel runtime wheel chroot install
Medium:
$ pex \
--python python3.9 \
-c pants \
--no-pypi -f find-links pantsbuild.pants==2.17.1 \
-o pants.ep.pex
$ python3.9 -m pex \
-c pants \
--no-pypi -f find-links pantsbuild.pants==2.17.1 \
-o pants.mp.pex
$ pex \
--python python3.9 \
-c pants \
--no-pypi -f find-links pantsbuild.pants==2.17.1 \
--no-pre-install-wheels -o pants.whls.ep.pex
$ python3.9 -m pex \
-c pants \
--no-pypi -f find-links pantsbuild.pants==2.17.1 \
--no-pre-install-wheels -o pants.whls.mp.pex
$ hyperfine \
-w2 \
-p 'rm -rf ~/.pex' \
-n 'serial wheel chroot install' \
-n 'serial .whl file install' \
-n 'execute_parallel runtime wheel chroot install' \
-n 'imap_parallel runtime wheel chroot install' \
-n 'execute_parallel runtime .whl install' \
-n 'imap_parallel runtime .whl install' \
'./pants.ep.pex -V' \
'./pants.whls.mp.pex -V' \
'PEX_MAX_INSTALL_JOBS=0 ./pants.ep.pex -V' \
'PEX_MAX_INSTALL_JOBS=0 ./pants.mp.pex -V' \
'PEX_MAX_INSTALL_JOBS=0 ./pants.whls.ep.pex -V' \
'PEX_MAX_INSTALL_JOBS=0 ./pants.whls.mp.pex -V'
Benchmark 1: serial wheel chroot install
Time (mean ± σ): 2.589 s ± 0.026 s [User: 2.217 s, System: 0.241 s]
Range (min … max): 2.551 s … 2.639 s 10 runs
Benchmark 2: serial .whl file install
Time (mean ± σ): 2.861 s ± 0.047 s [User: 2.451 s, System: 0.274 s]
Range (min … max): 2.804 s … 2.943 s 10 runs
Benchmark 3: execute_parallel runtime wheel chroot install
Time (mean ± σ): 2.814 s ± 0.029 s [User: 5.343 s, System: 0.454 s]
Range (min … max): 2.782 s … 2.888 s 10 runs
Benchmark 4: imap_parallel runtime wheel chroot install
Time (mean ± σ): 2.449 s ± 0.030 s [User: 2.550 s, System: 0.274 s]
Range (min … max): 2.408 s … 2.515 s 10 runs
Benchmark 5: execute_parallel runtime .whl install
Time (mean ± σ): 2.904 s ± 0.039 s [User: 6.545 s, System: 0.618 s]
Range (min … max): 2.860 s … 2.978 s 10 runs
Benchmark 6: imap_parallel runtime .whl install
Time (mean ± σ): 2.587 s ± 0.026 s [User: 2.864 s, System: 0.344 s]
Range (min … max): 2.555 s … 2.638 s 10 runs
Summary
imap_parallel runtime wheel chroot install ran
1.06 ± 0.02 times faster than imap_parallel runtime .whl install
1.06 ± 0.02 times faster than serial wheel chroot install
1.15 ± 0.02 times faster than execute_parallel runtime wheel chroot install
1.17 ± 0.02 times faster than serial .whl file install
1.19 ± 0.02 times faster than execute_parallel runtime .whl install
Looking now
Now both the build time resolve code and the run time layout code use the same parallelization logic to install wheels using `pex.pep_427` via a new pair of `pex.jobs.{imap,map}_parallel` functions.

Previously, both used `pex.jobs.execute_parallel`, which incurs a fork/exec per processed item along with the ensuing overhead of re-importing all the Pex code needed to do a `pex.pep_427` wheel install. Although this makes sense for calling Pip, which shares no code with Pex, it is wasted effort to call pure Pex code. Although early experiments with parallelizing `pex.pep_427` wheel installs with a thread pool showed `pex.jobs.execute_parallel` to perform consistently better, I never experimented with multiprocessing process-based pools. These perform better than both; and, in hindsight, for two obvious reasons: [...] `pex.jobs.execute_parallel`. As a result, the import price is paid at most once per slot instead of once per job input.