pex-tool / pex

A tool for generating .pex (Python EXecutable) files, lock files and venvs.
https://docs.pex-tool.org/
Apache License 2.0

Lower noop wheel install overhead. #2315

Closed jsirois closed 9 months ago

jsirois commented 9 months ago

Previously, installing a wheel that was already installed incurred the cost of hashing the installed wheel chroot every time. On a warm cache this wasted work was egregious for large distributions like PyTorch, where hashing gigabytes of files took seconds.

Work towards #2312.
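
To illustrate the kind of overhead described above, here is a minimal sketch of one way to avoid re-hashing an installed directory tree on a warm cache. It is not the change made in this PR: the sidecar file name `.fingerprint` and the functions `hash_tree` and `fingerprint` are hypothetical names chosen for the example, and the assumption is simply that a stored fingerprint can be trusted once written.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Tuple

# Hypothetical sidecar file name used only in this sketch; not a Pex file.
FINGERPRINT_FILE = ".fingerprint"


def hash_tree(root: Path) -> str:
    """Hash every file under root. This is the expensive step for large wheels."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        if path.name == FINGERPRINT_FILE:
            continue  # Never fold the stored fingerprint into the hash itself.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        with path.open("rb") as fp:
            for chunk in iter(lambda: fp.read(1 << 20), b""):
                digest.update(chunk)
    return digest.hexdigest()


def fingerprint(root: Path) -> Tuple[str, bool]:
    """Return (fingerprint, hashed), where hashed is True only on a cold cache."""
    marker = root / FINGERPRINT_FILE
    if marker.exists():
        # Warm cache: read the stored value instead of re-hashing the tree.
        return marker.read_text(), False
    value = hash_tree(root)
    marker.write_text(value)
    return value, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as td:
        root = Path(td)
        (root / "module.py").write_text("print('hello')\n")
        for attempt in ("cold", "warm"):
            value, hashed = fingerprint(root)
            print(attempt, value[:12], "hashed" if hashed else "reused")
```

The second call does a single small file read regardless of how large the tree is, which is the general shape of saving a no-op install should have.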

jsirois commented 9 months ago

This drops one of the two ~3s timings in #2312 ("Installing 22 distributions ...") to ~13ms for a warm cache:

$ time python3.11 -mpex.cli venv create -v --force -d ./tenv  --no-build --lock torch.lock 'torch'
pex: Using cached artifact at /home/jsirois/.pex/downloads/a6ebbe517097ef289cc7952783588c72de071d4b15ce0f8b285093f0916b1162 for FileArtifact(url='https://files.pythonhosted.org/packages/da/6a/7fb9d82db4568834ff6d4df2fe3b143de4ed65a3f8f93e7daed703626cb6/torch-2.1.2-cp311-cp311-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='a6ebbe517097ef289cc7952783588c72de071d4b15ce0f8b285093f0916b1162'), verified=False, filename='torch-2.1.2-cp311-cp311-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/d800d87f72189a745fa3d6b033b9dc4a34ad069f60ca60b943a63599f5501960 for FileArtifact(url='https://files.pythonhosted.org/packages/70/25/fab23259a52ece5670dcb8452e1af34b89e6135ecc17cd4b54b4b479eac6/fsspec-2023.12.2-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='d800d87f72189a745fa3d6b033b9dc4a34ad069f60ca60b943a63599f5501960'), verified=False, filename='fsspec-2023.12.2-py3-none-any.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/6088930bfe239f0e6710546ab9c19c9ef35e29792895fed6e6e31a023a182a61 for FileArtifact(url='https://files.pythonhosted.org/packages/bc/c3/f068337a370801f372f2f8f6bad74a5c140f6fda3d9de154052708dd3c65/Jinja2-3.1.2-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='6088930bfe239f0e6710546ab9c19c9ef35e29792895fed6e6e31a023a182a61'), verified=False, filename='Jinja2-3.1.2-py3-none-any.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/ee53ccca76a6fc08fb9701aa95b6ceb242cdaab118c3bb152af4e579af792728 for FileArtifact(url='https://files.pythonhosted.org/packages/37/6d/121efd7382d5b0284239f4ab1fc1590d86d34ed4a4a2fdb13b30ca8e5740/nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='ee53ccca76a6fc08fb9701aa95b6ceb242cdaab118c3bb152af4e579af792728'), verified=False, filename='nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/e54fde3983165c624cb79254ae9818a456eb6e87a7fd4d56a2352c24ee542d7e for FileArtifact(url='https://files.pythonhosted.org/packages/7e/00/6b218edd739ecfc60524e585ba8e6b00554dd908de2c9c66c1af3e44e18d/nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='e54fde3983165c624cb79254ae9818a456eb6e87a7fd4d56a2352c24ee542d7e'), verified=False, filename='nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/57dbda9b35157b05fb3e58ee91448612eb674172fab98ee235ccb0b5bee19a1c for FileArtifact(url='https://files.pythonhosted.org/packages/81/54/84d42a0bee35edba99dee7b59a8d4970eccdd44b99fe728ed912106fc781/filelock-3.13.1-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='57dbda9b35157b05fb3e58ee91448612eb674172fab98ee235ccb0b5bee19a1c'), verified=False, filename='filelock-3.13.1-py3-none-any.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/f18c69adc97877c42332c170849c96cefa91881c99a7cb3e95b7c659ebdc1ec2 for FileArtifact(url='https://files.pythonhosted.org/packages/d5/f0/8fbc882ca80cf077f1b246c0e3c3465f7f415439bdea6b899f6b19f61f70/networkx-3.2.1-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='f18c69adc97877c42332c170849c96cefa91881c99a7cb3e95b7c659ebdc1ec2'), verified=False, filename='networkx-3.2.1-py3-none-any.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/339b385f50c309763ca65456ec75e17bbefcbbf2893f462cb8b90584cd27a1c2 for FileArtifact(url='https://files.pythonhosted.org/packages/b6/9f/c64c03f49d6fbc56196664d05dba14e3a561038a81a638eeb47f4d4cfd48/nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='339b385f50c309763ca65456ec75e17bbefcbbf2893f462cb8b90584cd27a1c2'), verified=False, filename='nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/6e258468ddf5796e25f1dc591a31029fa317d97a0a94ed93468fc86301d61e40 for FileArtifact(url='https://files.pythonhosted.org/packages/eb/d5/c68b1d2cdfcc59e72e8a5949a37ddb22ae6cade80cd4a57a84d4c8b55472/nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='6e258468ddf5796e25f1dc591a31029fa317d97a0a94ed93468fc86301d61e40'), verified=False, filename='nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/5ccb288774fdfb07a7e7025ffec286971c06d8d7b4fb162525334616d7629ff9 for FileArtifact(url='https://files.pythonhosted.org/packages/ff/74/a2e2be7fb83aaedec84f391f082cf765dfb635e7caa9b49065f73e4835d8/nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='5ccb288774fdfb07a7e7025ffec286971c06d8d7b4fb162525334616d7629ff9'), verified=False, filename='nvidia_cudnn_cu12-8.9.2.26-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/794e3948a1aa71fd817c3775866943936774d1c14e7628c74f6f7417224cdf56 for FileArtifact(url='https://files.pythonhosted.org/packages/86/94/eb540db023ce1d162e7bea9f8f5aa781d57c65aed513c33ee9a5123ead4d/nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='794e3948a1aa71fd817c3775866943936774d1c14e7628c74f6f7417224cdf56'), verified=False, filename='nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/8a7ec542f0412294b15072fa7dab71d31334014a69f953004ea7a118206fe0dd for FileArtifact(url='https://files.pythonhosted.org/packages/bc/1d/8de1e5c67099015c834315e333911273a8c6aaba78923dd1d1e25fc5f217/nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='8a7ec542f0412294b15072fa7dab71d31334014a69f953004ea7a118206fe0dd'), verified=False, filename='nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/9d264c5036dde4e64f1de8c50ae753237c12e0b1348738169cd0f8a536c0e1e0 for FileArtifact(url='https://files.pythonhosted.org/packages/44/31/4890b1c9abc496303412947fc7dcea3d14861720642b49e8ceed89636705/nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='9d264c5036dde4e64f1de8c50ae753237c12e0b1348738169cd0f8a536c0e1e0'), verified=False, filename='nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/1a6c4acefcbebfa6de320f412bf7866de856e786e0462326ba1bac40de0b5e71 for FileArtifact(url='https://files.pythonhosted.org/packages/a4/05/23f8f38eec3d28e4915725b233c24d8f1a33cb6540a882f7b54be1befa02/nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='1a6c4acefcbebfa6de320f412bf7866de856e786e0462326ba1bac40de0b5e71'), verified=False, filename='nvidia_nccl_cu12-2.18.1-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/f3b50f42cf363f86ab21f720998517a659a48131e8d538dc02f8768237bd884c for FileArtifact(url='https://files.pythonhosted.org/packages/65/5b/cfaeebf25cd9fdec14338ccb16f6b2c4c7fa9163aefcf057d86b9cc248bb/nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='f3b50f42cf363f86ab21f720998517a659a48131e8d538dc02f8768237bd884c'), verified=False, filename='nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/dc21cf308ca5691e7c04d962e213f8a4aa9bbfa23d95412f452254c2caeb09e5 for FileArtifact(url='https://files.pythonhosted.org/packages/da/d3/8057f0587683ed2fcd4dbfbdfdfa807b9160b809976099d36b8f60d08f03/nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='dc21cf308ca5691e7c04d962e213f8a4aa9bbfa23d95412f452254c2caeb09e5'), verified=False, filename='nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/919b06453f0033ea52c13eaf7833de0e57db3178d23d4e04f9fc71c4f2c32bf8 for FileArtifact(url='https://files.pythonhosted.org/packages/5c/c1/54fffb2eb13d293d9a429fead3646752ea190de0229bcf3d591ba2481263/triton-2.1.0-0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='919b06453f0033ea52c13eaf7833de0e57db3178d23d4e04f9fc71c4f2c32bf8'), verified=False, filename='triton-2.1.0-0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/c3588cd4295d0c0f603d0f2ae780587e64e2efeedb3521e46b9bb1d08d184fa5 for FileArtifact(url='https://files.pythonhosted.org/packages/d2/05/e6600db80270777c4a64238a98d442f0fd07cc8915be2a1c16da7f2b9e74/sympy-1.12-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='c3588cd4295d0c0f603d0f2ae780587e64e2efeedb3521e46b9bb1d08d184fa5'), verified=False, filename='sympy-1.12-py3-none-any.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/af72aea155e91adfc61c3ae9e0e342dbc0cba726d6cba4b6c72c1f34e47291cd for FileArtifact(url='https://files.pythonhosted.org/packages/b7/f4/6a90020cd2d93349b442bfcb657d0dc91eee65491600b2cb1d388bc98e6b/typing_extensions-4.9.0-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='af72aea155e91adfc61c3ae9e0e342dbc0cba726d6cba4b6c72c1f34e47291cd'), verified=False, filename='typing_extensions-4.9.0-py3-none-any.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/bfce63a9e7834b12b87c64d6b155fdd9b3b96191b6bd334bf37db7ff1fe457f2 for FileArtifact(url='https://files.pythonhosted.org/packages/fe/21/2eff1de472ca6c99ec3993eab11308787b9879af9ca8bbceb4868cf4f2ca/MarkupSafe-2.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='bfce63a9e7834b12b87c64d6b155fdd9b3b96191b6bd334bf37db7ff1fe457f2'), verified=False, filename='MarkupSafe-2.1.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/64335a8088e2b9d196ae8665430bc6a2b7e6ef2eb877a9c735c804bd4ff6467c for FileArtifact(url='https://files.pythonhosted.org/packages/1e/07/bf730d44c2fe1b676ad9cc2be5f5f861eb5d153fb6951987a2d6a96379a9/nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl', fingerprint=Fingerprint(algorithm='sha256', hash='64335a8088e2b9d196ae8665430bc6a2b7e6ef2eb877a9c735c804bd4ff6467c'), verified=False, filename='nvidia_nvjitlink_cu12-12.3.101-py3-none-manylinux1_x86_64.whl')
pex: Using cached artifact at /home/jsirois/.pex/downloads/a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c for FileArtifact(url='https://files.pythonhosted.org/packages/43/e3/7d92a15f894aa0c9c4b49b8ee9ac9850d6e63b03c9c32c0367a13ae62209/mpmath-1.3.0-py3-none-any.whl', fingerprint=Fingerprint(algorithm='sha256', hash='a0b2b9fe80bbcd81a6647ff13108738cfb482d481d826cc0e02f5b35e5c88d2c'), verified=False, filename='mpmath-1.3.0-py3-none-any.whl')
pex: Resolving distributions :: Resolving requirements from lock file torch.lock :: Building 0 artifacts and installing 22 :: Calculating project names for direct requirements:
  PyPIRequirement(line=LogicalLine(raw_text='torch', processed_text='torch', source='<string>', start_line=1, end_line=1), requirement=Requirement(name='torch', url=None, extras=frozenset(), specifier=<Specifie
pex: Resolving distributions :: Resolving requirements from lock file torch.lock :: Building 0 artifacts and installing 22 :: Installing 22 distributions

pex: Resolving distributions: 138.2ms
pex:   Parsing lock torch.lock: 19.6ms
pex:   Resolving requirements from lock file torch.lock: 118.4ms
pex:     Parsing requirements: 0.4ms
pex:     Resolving urls to fetch for 1 requirements from lock torch.lock: 9.2ms
pex:     Hashing pex: 14.7ms
pex:     Isolating pex: 0.0ms
pex:     Downloading 22 distributions to satisfy 1 requirements: 15.6ms
pex:     Categorizing 22 downloaded artifacts: 0.1ms
pex:     Building 0 artifacts and installing 22: 76.7ms
pex:       Calculating project names for direct requirements:
  PyPIRequirement(line=LogicalLine(raw_text='torch', processed_text='torch', source='<string>', start_line=1, end_line=1), requirement=Requirement(name='torch', url=None, extras=frozenset(), specifier=<SpecifierSet('')>, marker=None), editable=False): 0.1ms
pex:       Installing 22 distributions: 12.4ms
pex: Re-writing /home/jsirois/dev/pantsbuild/jsirois-pex/tenv/bin/convert-caffe2-to-onnx
pex: Re-writing /home/jsirois/dev/pantsbuild/jsirois-pex/tenv/bin/convert-onnx-to-caffe2
pex: Re-writing /home/jsirois/dev/pantsbuild/jsirois-pex/tenv/bin/isympy
pex: Re-writing /home/jsirois/dev/pantsbuild/jsirois-pex/tenv/bin/torchrun
pex: Installing 22 wheels in venv at ./tenv: 3460.6ms

real    0m3.855s
user    0m3.105s
sys     0m0.751s

jsirois commented 9 months ago

I note that there are often other hashes related to wheels flying around, e.g. in a lockfile, or as values of the distributions map in PEX-INFO. Is there any scope for using those directly?

Yes. That's the thrust of the last sentence of my comment to @zmanji here: https://github.com/pantsbuild/pex/pull/2315#discussion_r1442476990
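
As a rough illustration of the idea in this exchange (not a description of how Pex implements it), an already-known artifact hash, such as the sha256 recorded in a lock file, could serve directly as the key for an installed-wheel directory, so that checking for a prior install needs no hashing at all. The directory layout and the helper names `installed_wheel_dir` and `needs_install` below are assumptions made for the sketch; the hash and filename are taken from the log output above.

```python
import tempfile
from pathlib import Path


def installed_wheel_dir(cache_root: Path, sha256: str, wheel_filename: str) -> Path:
    """Map a known artifact hash to an install location (layout is an assumption)."""
    return cache_root / "installed_wheels" / sha256 / wheel_filename


def needs_install(cache_root: Path, sha256: str, wheel_filename: str) -> bool:
    """A no-op install is detected with one directory check and no hashing."""
    return not installed_wheel_dir(cache_root, sha256, wheel_filename).is_dir()


if __name__ == "__main__":
    # Hash and filename as they appear for fsspec in the log above.
    sha256 = "d800d87f72189a745fa3d6b033b9dc4a34ad069f60ca60b943a63599f5501960"
    wheel = "fsspec-2023.12.2-py3-none-any.whl"
    with tempfile.TemporaryDirectory() as td:
        cache_root = Path(td)
        print("before install:", needs_install(cache_root, sha256, wheel))
        installed_wheel_dir(cache_root, sha256, wheel).mkdir(parents=True)
        print("after install:", needs_install(cache_root, sha256, wheel))
```

The trade-off in a scheme like this is that the key identifies the downloaded artifact rather than the installed files, so it only detects a prior install and says nothing about whether the installed tree was later modified.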