prefix-dev / rip

Solve and install Python packages quickly with rip (pip in Rust)
https://prefix.dev
BSD 3-Clause "New" or "Revised" License

Rip collects wheels unrelated to the platform it is running on #39

Closed notatallshaw closed 1 year ago

notatallshaw commented 1 year ago

This is possibly a misunderstanding on my side but on Linux on Python 3.8 when I run:

$ python3.8 -m venv .venv
$ source .venv/bin/activate
$ cargo r -- PyWavelets

I get a lot of output showing it collecting wheels for macOS, Windows, and Python versions that I am not using:

    Finished dev [unoptimized + debuginfo] target(s) in 0.07s
     Running `target/debug/rip PyWavelets`
2023-10-02T23:41:07.956549Z  INFO rip: cache directory: /home/damian/.cache/rattler/pypi
2023-10-02T23:41:07.996443Z  INFO rip: extracted the following environment markers from the system python interpreter:
Pep508EnvMakers {
    os_name: "posix",
    sys_platform: "linux",
    platform_machine: "x86_64",
    platform_python_implementation: "CPython",
    platform_release: "5.15.90.1-microsoft-standard-WSL2",
    platform_system: "Linux",
    platform_version: "#1 SMP Fri Jan 27 02:56:13 UTC 2023",
    python_version: "3.8",
    python_full_version: "3.8.18",
    implementation_name: "cpython",
    implementation_version: "3.8.18",
}
2023-10-02T23:41:07.998434Z  INFO rattler_installs_packages::resolve: collecting pywavelets
2023-10-02T23:41:07.998542Z  INFO rattler_installs_packages::http: executing request url=https://pypi.org/simple/pywavelets/ cache_mode=Default
2023-10-02T23:41:08.132606Z  WARN rattler_installs_packages::resolve: Not considering pywavelets 0.2.2, 0.2.0, 0.1.6, 0.1.4, 0.1.2 because there are no wheel artifacts available
2023-10-02T23:41:08.132798Z  INFO rattler_installs_packages::resolve: obtaining dependency information from pywavelets=1.4.1
2023-10-02T23:41:08.133125Z  INFO rattler_installs_packages::resolve: collecting numpy
2023-10-02T23:41:08.133187Z  INFO rattler_installs_packages::http: executing request url=https://pypi.org/simple/numpy/ cache_mode=Default
2023-10-02T23:41:08.310962Z  WARN rattler_installs_packages::resolve: Not considering numpy 1.26.0rc1, 1.26.0b1, 1.25.0rc1, 1.24.0rc2, 1.24.0rc1, 1.23.0rc3, 1.23.0rc2, 1.23.0rc1, 1.10.0.post2, 1.5.0, 1.4.1, 1.3.0 because there are no wheel artifacts available
2023-10-02T23:41:08.311069Z  WARN rattler_installs_packages::resolve: Not considering numpy 1.26.0, 1.25.2, 1.25.1, 1.25.0 because none of the artifacts are compatible with Python 3.8.18
2023-10-02T23:41:08.313444Z  INFO rattler_installs_packages::resolve: obtaining dependency information from numpy=1.24.4
2023-10-02T23:41:08.313789Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/6b/80/6cdfb3e275d95155a34659163b83c09e3a3ff9f1456880bec6cc63d71083/numpy-1.24.4-cp310-cp310-macosx_10_9_x86_64.whl#sha256=c0bfb52d2169d58c1cdb8cc1f16989101639b34c7d3ce60ed70b19c63eba0b64 cache_mode=OnlyIfCached
2023-10-02T23:41:08.314588Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/64/5f/3f01d753e2175cfade1013eea08db99ba1ee4bdb147ebcf3623b75d12aa7/numpy-1.24.4-cp310-cp310-macosx_11_0_arm64.whl#sha256=ed094d4f0c177b1b8e7aa9cba7d6ceed51c0e569a5318ac0ca9a090680a6a1b1 cache_mode=OnlyIfCached
2023-10-02T23:41:08.314751Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/5a/b3/2f9c21d799fa07053ffa151faccdceeb69beec5a010576b8991f614021f7/numpy-1.24.4-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl#sha256=79fc682a374c4a8ed08b331bef9c5f582585d1048fa6d80bc6c35bc384eee9b4 cache_mode=OnlyIfCached
2023-10-02T23:41:08.315047Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/10/be/ae5bf4737cb79ba437879915791f6f26d92583c738d7d960ad94e5c36adf/numpy-1.24.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=7ffe43c74893dbf38c2b0a1f5428760a1a9c98285553c89e12d70a96a7f3a4d6 cache_mode=OnlyIfCached
2023-10-02T23:41:08.315178Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/c0/64/908c1087be6285f40e4b3e79454552a701664a079321cff519d8c7051d06/numpy-1.24.4-cp310-cp310-win32.whl#sha256=4c21decb6ea94057331e111a5bed9a79d335658c27ce2adb580fb4d54f2ad9bc cache_mode=OnlyIfCached
2023-10-02T23:41:08.315304Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/22/55/3d5a7c1142e0d9329ad27cece17933b0e2ab4e54ddc5c1861fbfeb3f7693/numpy-1.24.4-cp310-cp310-win_amd64.whl#sha256=b4bea75e47d9586d31e892a7401f76e909712a0fd510f58f5337bea9572c571e cache_mode=OnlyIfCached
2023-10-02T23:41:08.315636Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/a9/cc/5ed2280a27e5dab12994c884f1f4d8c3bd4d885d02ae9e52a9d213a6a5e2/numpy-1.24.4-cp311-cp311-macosx_10_9_x86_64.whl#sha256=f136bab9c2cfd8da131132c2cf6cc27331dd6fae65f95f69dcd4ae3c3639c810 cache_mode=OnlyIfCached
2023-10-02T23:41:08.315787Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/c0/bc/77635c657a3668cf652806210b8662e1aff84b818a55ba88257abf6637a8/numpy-1.24.4-cp311-cp311-macosx_11_0_arm64.whl#sha256=e2926dac25b313635e4d6cf4dc4e51c8c0ebfed60b801c799ffc4c32bf3d1254 cache_mode=OnlyIfCached
2023-10-02T23:41:08.315932Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/a7/4c/96cdaa34f54c05e97c1c50f39f98d608f96f0677a6589e64e53104e22904/numpy-1.24.4-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl#sha256=222e40d0e2548690405b0b3c7b21d1169117391c2e82c378467ef9ab4c8f0da7 cache_mode=OnlyIfCached
2023-10-02T23:41:08.316210Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/22/97/dfb1a31bb46686f09e68ea6ac5c63fdee0d22d7b23b8f3f7ea07712869ef/numpy-1.24.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=7215847ce88a85ce39baf9e89070cb860c98fdddacbaa6c0da3ffb31b3350bd5 cache_mode=OnlyIfCached
2023-10-02T23:41:08.316511Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/35/e2/76a11e54139654a324d107da1d98f99e7aa2a7ef97cfd7c631fba7dbde71/numpy-1.24.4-cp311-cp311-win32.whl#sha256=4979217d7de511a8d57f4b4b5b2b965f707768440c17cb70fbf254c4b225238d cache_mode=OnlyIfCached
2023-10-02T23:41:08.316642Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/d8/ec/ebef2f7d7c28503f958f0f8b992e7ce606fb74f9e891199329d5f5f87404/numpy-1.24.4-cp311-cp311-win_amd64.whl#sha256=b7b1fc9864d7d39e28f41d089bfd6353cb5f27ecd9905348c24187a768c79694 cache_mode=OnlyIfCached
2023-10-02T23:41:08.316786Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/11/10/943cfb579f1a02909ff96464c69893b1d25be3731b5d3652c2e0cf1281ea/numpy-1.24.4-cp38-cp38-macosx_10_9_x86_64.whl#sha256=1452241c290f3e2a312c137a9999cdbf63f78864d63c79039bda65ee86943f61 cache_mode=OnlyIfCached
2023-10-02T23:41:08.316953Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/a7/ae/f53b7b265fdc701e663fbb322a8e9d4b14d9cb7b2385f45ddfabfc4327e4/numpy-1.24.4-cp38-cp38-macosx_11_0_arm64.whl#sha256=04640dab83f7c6c85abf9cd729c5b65f1ebd0ccf9de90b270cd61935eef0197f cache_mode=OnlyIfCached
2023-10-02T23:41:08.317098Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/25/6f/2586a50ad72e8dbb1d8381f837008a0321a3516dfd7cb57fc8cf7e4bb06b/numpy-1.24.4-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl#sha256=a5425b114831d1e77e4b5d812b69d11d962e104095a5b9c3b641a218abcc050e cache_mode=OnlyIfCached
2023-10-02T23:41:08.317238Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/98/5d/5738903efe0ecb73e51eb44feafba32bdba2081263d40c5043568ff60faf/numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=dd80e219fd4c71fc3699fc1dadac5dcf4fd882bfc6f7ec53d30fa197b8ee22dc cache_mode=OnlyIfCached
2023-10-02T23:41:08.317383Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/d1/57/8d328f0b91c733aa9aa7ee540dbc49b58796c862b4fbcb1146c701e888da/numpy-1.24.4-cp38-cp38-win32.whl#sha256=4602244f345453db537be5314d3983dbf5834a9701b7723ec28923e2889e0bb2 cache_mode=OnlyIfCached
2023-10-02T23:41:08.317498Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/69/65/0d47953afa0ad569d12de5f65d964321c208492064c38fe3b0b9744f8d44/numpy-1.24.4-cp38-cp38-win_amd64.whl#sha256=692f2e0f55794943c5bfff12b3f56f99af76f902fc47487bdfe97856de51a706 cache_mode=OnlyIfCached
2023-10-02T23:41:08.317650Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/9a/cd/d5b0402b801c8a8b56b04c1e85c6165efab298d2f0ab741c2406516ede3a/numpy-1.24.4-cp39-cp39-macosx_10_9_x86_64.whl#sha256=2541312fbf09977f3b3ad449c4e5f4bb55d0dbf79226d7724211acc905049400 cache_mode=OnlyIfCached
2023-10-02T23:41:08.317919Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/14/27/638aaa446f39113a3ed38b37a66243e21b38110d021bfcb940c383e120f2/numpy-1.24.4-cp39-cp39-macosx_11_0_arm64.whl#sha256=9667575fb6d13c95f1b36aca12c5ee3356bf001b714fc354eb5465ce1609e62f cache_mode=OnlyIfCached
2023-10-02T23:41:08.318066Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/8f/27/91894916e50627476cff1a4e4363ab6179d01077d71b9afed41d9e1f18bf/numpy-1.24.4-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl#sha256=f3a86ed21e4f87050382c7bc96571755193c4c1392490744ac73d660e8f564a9 cache_mode=OnlyIfCached
2023-10-02T23:41:08.318189Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/7a/7c/d7b2a0417af6428440c0ad7cb9799073e507b1a465f827d058b826236964/numpy-1.24.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=d11efb4dbecbdf22508d55e48d9c8384db795e1b7b51ea735289ff96613ff74d cache_mode=OnlyIfCached
2023-10-02T23:41:08.318481Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/18/9d/e02ace5d7dfccee796c37b995c63322674daf88ae2f4a4724c5dd0afcc91/numpy-1.24.4-cp39-cp39-win32.whl#sha256=6620c0acd41dbcb368610bb2f4d83145674040025e5536954782467100aa8835 cache_mode=OnlyIfCached
2023-10-02T23:41:08.318760Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/63/38/6cc19d6b8bfa1d1a459daf2b3fe325453153ca7019976274b6f33d8b5663/numpy-1.24.4-cp39-cp39-win_amd64.whl#sha256=befe2bf740fd8373cf56149a5c23a0f601e82869598d41f8e188a0e9869926f8 cache_mode=OnlyIfCached
2023-10-02T23:41:08.318910Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/a4/fd/8dff40e25e937c94257455c237b9b6bf5a30d42dd1cc11555533be099492/numpy-1.24.4-pp38-pypy38_pp73-macosx_10_9_x86_64.whl#sha256=31f13e25b4e304632a4619d0e0777662c2ffea99fcae2029556b17d8ff958aef cache_mode=OnlyIfCached
2023-10-02T23:41:08.319053Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/42/e7/4bf953c6e05df90c6d351af69966384fed8e988d0e8c54dad7103b59f3ba/numpy-1.24.4-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl#sha256=95f7ac6540e95bc440ad77f56e520da5bf877f87dca58bd095288dce8940532a cache_mode=OnlyIfCached
2023-10-02T23:41:08.319171Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/fc/dd/9106005eb477d022b60b3817ed5937a43dad8fd1f20b0610ea8a32fcb407/numpy-1.24.4-pp38-pypy38_pp73-win_amd64.whl#sha256=e98f220aa76ca2a977fe435f5b04d7b3470c0a2e6312907b37ba6068f26787f2 cache_mode=OnlyIfCached
2023-10-02T23:41:08.319473Z  INFO rattler_installs_packages::http: executing request url=https://files.pythonhosted.org/packages/6b/80/6cdfb3e275d95155a34659163b83c09e3a3ff9f1456880bec6cc63d71083/numpy-1.24.4-cp310-cp310-macosx_10_9_x86_64.whl.metadata#sha256=c0bfb52d2169d58c1cdb8cc1f16989101639b34c7d3ce60ed70b19c63eba0b64 cache_mode=NoStore
Resolved environment:
- PyWavelets

Name        Version
numpy       1.24.4
pywavelets  1.4.1

e.g.

executing request url=https://files.pythonhosted.org/packages/6b/80/6cdfb3e275d95155a34659163b83c09e3a3ff9f1456880bec6cc63d71083/numpy-1.24.4-cp310-cp310-macosx_10_9_x86_64.whl.metadata#sha256=c0bfb52d2169d58c1cdb8cc1f16989101639b34c7d3ce60ed70b19c63eba0b64 

Is this expected? It appears to slow rip down when the wheels aren't cached and it has to do a lot of backtracking. I noticed it particularly when trying this spec on Python 3.8 (Windows or Linux):

"numpy==1.21.6" "cython==0.29.28" "scipy>=1.4.0" "torch>=1.7" "torchaudio" "soundfile" "librosa==0.10.0.*" "numba==0.55.1" "inflect==5.6.0" "tqdm" "anyascii" "pyyaml" "fsspec>=2021.04.0" "aiohttp" "packaging" "flask" "pysbd" "pandas" "matplotlib" "trainer==0.0.20" "coqpit>=0.0.16" "pypinyin" "mecab-python3==1.0.5" "jamo" "bangla==0.0.2" "k_diffusion" "einops" "transformers"

baszalmstra commented 1 year ago

It is expected, although we can probably filter out a bunch of artifacts (related to #3). The idea is that when we try to gather metadata for a particular version of a package, we first check whether any artifact is already cached (note the cache_mode=OnlyIfCached). This only queries the disk cache for existence and should be fairly quick. The metadata for all artifacts "should" be the same, so this seems like a nice optimization.

Looking at your logging statement it appears it takes 5ms (08.319171 - 08.313789) to check the cache which seems reasonable. We can probably also do concurrent requests but I'm not sure that will be much faster.
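For readers who want to check the arithmetic, here is a minimal sketch that computes the elapsed time between the two log timestamps quoted above. The helper name `parse_ts` is made up for this illustration; the timestamps are copied from the log in the first comment.

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    # Log timestamps look like 2023-10-02T23:41:08.313789Z (UTC, microseconds).
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


# First OnlyIfCached request for numpy 1.24.4 and the last one in the log.
start = parse_ts("2023-10-02T23:41:08.313789Z")
end = parse_ts("2023-10-02T23:41:08.319171Z")

elapsed_ms = (end - start).total_seconds() * 1000
print(f"{elapsed_ms:.3f} ms")
```

The result is a little over 5 ms, which matches the estimate in the comment.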

baszalmstra commented 1 year ago

I also noticed you are running the debug build. For measuring performance, it might be worth running the binary in release mode:

cargo r --release -- PyWavelets

baszalmstra commented 1 year ago

With #42 we now only consider wheels that are relevant to the installed Python interpreter.
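As a rough illustration of what "relevant to the installed Python interpreter" can mean, the sketch below filters wheel filenames by their compatibility tags. It is not the code from #42 and not rip's API: the function names and the hard-coded tag sets (chosen for CPython 3.8 on x86_64 Linux, as in the report) are assumptions, and real tools derive an ordered tag list from the interpreter instead.

```python
def parse_wheel_tags(filename):
    # Wheel filenames end with {python tag}-{abi tag}-{platform tag}.whl;
    # each tag may be a "."-separated compressed set such as
    # manylinux_2_17_x86_64.manylinux2014_x86_64.
    stem = filename[: -len(".whl")]
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag.split("."), abi_tag.split("."), platform_tag.split(".")


# Hypothetical, simplified tag sets for CPython 3.8 on x86_64 Linux.
PYTHON_TAGS = {"cp38", "py38", "py3"}
ABI_TAGS = {"cp38", "abi3", "none"}
PLATFORM_TAGS = {"manylinux_2_17_x86_64", "manylinux2014_x86_64", "linux_x86_64", "any"}


def is_relevant(filename):
    pythons, abis, platforms = parse_wheel_tags(filename)
    # A wheel is kept only if every tag position has at least one match.
    return (
        any(tag in PYTHON_TAGS for tag in pythons)
        and any(tag in ABI_TAGS for tag in abis)
        and any(tag in PLATFORM_TAGS for tag in platforms)
    )


# A few of the numpy 1.24.4 wheels that appear in the log above.
wheels = [
    "numpy-1.24.4-cp310-cp310-macosx_10_9_x86_64.whl",
    "numpy-1.24.4-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
    "numpy-1.24.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    "numpy-1.24.4-cp38-cp38-win_amd64.whl",
    "numpy-1.24.4-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
]
relevant = [w for w in wheels if is_relevant(w)]
print(relevant)
```

With these assumed tag sets, only the cp38 manylinux x86_64 wheel survives, so the cache would be probed once for this numpy version instead of once per published wheel.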

notatallshaw commented 1 year ago

Oh wow, that was fast. I was going to write a quick script to see how much real-world performance impact it had.

While 5ms isn't much on its own, you are potentially doing it many thousands of times for a complex backtrack.

baszalmstra commented 1 year ago

> While 5ms isn't much on its own, you are potentially doing it many thousands of times for a complex backtrack.

That's true! I imagine that in the future we could simply parallelize these things. Closing for now since #42 has been merged.
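One way such parallelization might look is sketched below with a thread pool that probes a disk cache for several artifacts at once. This is only an illustration under assumptions: the cache layout (one file per key in a directory) and the names `find_cached` and `is_cached` are hypothetical and do not describe how rip stores or queries its cache.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def find_cached(cache_dir: Path, keys: List[str], max_workers: int = 8) -> List[str]:
    # Hypothetical layout: an artifact is "cached" if a file named after
    # its key exists in cache_dir.
    def is_cached(key: str) -> bool:
        return (cache_dir / key).exists()

    # pool.map preserves input order, so hits line up with keys.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        hits = list(pool.map(is_cached, keys))
    return [key for key, hit in zip(keys, hits) if hit]


# Small self-contained demo using a temporary directory as the cache.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    (cache / "artifact-b").write_text("metadata")
    found = find_cached(cache, ["artifact-a", "artifact-b", "artifact-c"])

print(found)
```

Whether this helps in practice depends on how expensive each existence check is; as noted earlier in the thread, the checks are already quick, so the gain may be small.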