karlicoss opened 1 year ago
happens a lot lately, definitely worth trying with retries
This has been flaky really often, so I took another look -- essentially it's flaky because pip isn't really meant to run in parallel: e.g. one pip process can observe intermediate metadata state from another package's install and crash.
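As a stopgap, retries can be bolted on around the install command. A minimal sketch -- the `run_with_retries` helper is hypothetical, not part of HPI, and it only papers over transient races, not deterministic failures:

```python
import subprocess
import time

def run_with_retries(cmd, attempts=3, delay=5.0):
    """Run cmd, retrying on nonzero exit status.

    Hypothetical helper: retries up to `attempts` times with `delay`
    seconds between tries, returning the last completed process.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return result
        if attempt < attempts:
            time.sleep(delay)
    return result

# e.g. run_with_retries([sys.executable, "-m", "pip", "install", "-e", "."])
```

At least during CI this kind of blanket retry should be acceptable, even if it hides the underlying race.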
I thought we could benefit a lot from at least downloading in parallel (https://github.com/pypa/pip/issues/825). There is a `--use-feature=fast-deps` flag which I think should provide parallel downloads -- but it didn't seem to have any impact, not sure why.
I took a look at what's taking so much time, using the mypy-misc pipeline in tox as an example -- it installs many modules, and I ran it without parallel mode so I could track the time taken at each step. The bulk of it is:
```
[2023-11-19 17:29:46] Collecting git+https://github.com/karlicoss/ghexport
[2023-11-19 17:29:46] Cloning https://github.com/karlicoss/ghexport to /tmp/pip-req-build-01m1xmtv
[2023-11-19 17:29:46] Running command git clone --filter=blob:none --quiet https://github.com/karlicoss/ghexport /tmp/pip-req-build-01m1xmtv
[2023-11-19 17:29:47] Resolved https://github.com/karlicoss/ghexport to commit 03207b63da4a0f570700f373867ff67deb4f43d1
[2023-11-19 17:29:47] Running command git submodule update --init --recursive -q
[2023-11-19 17:29:48] Installing build dependencies: started
[2023-11-19 17:29:49] Installing build dependencies: finished with status 'done'
[2023-11-19 17:29:49] Getting requirements to build wheel: started
[2023-11-19 17:29:50] Getting requirements to build wheel: finished with status 'done'
[2023-11-19 17:29:50] Installing backend dependencies: started
[2023-11-19 17:29:51] Installing backend dependencies: finished with status 'done'
[2023-11-19 17:29:51] Preparing metadata (pyproject.toml): started
[2023-11-19 17:29:51] Preparing metadata (pyproject.toml): finished with status 'done'
[2023-11-19 17:29:51] Collecting git+https://github.com/karlicoss/goodrexport
[2023-11-19 17:29:51] Cloning https://github.com/karlicoss/goodrexport to /tm
```
The above sums up to about 2.5 minutes, which is like 80% of the time the whole tox pipeline runs, so it's worth digging into how to optimize this. In the meantime, going to disable parallel install on CI, since it fails so often that it basically defeats the purpose (which was to merge things faster).

Also, the `h3` dependency from timezonefinder is very slow? Building its wheel is taking like 30 seconds?
For `h3`, it seems that it doesn't have 3.12 wheels, so pip is building them from scratch, and as a result `pip3 install --user --force-reinstall --no-cache h3 -vvv` can take like 60 seconds on OSX.
https://pypi.org/project/h3/#files
possibly relevant https://github.com/uber/h3-py/issues/326
`pip debug --verbose` is useful to find out compatible wheel tags.
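For a quick sanity check without invoking pip, the most specific tag can be roughly reconstructed from the interpreter itself. A sketch -- `pip debug --verbose` remains the authoritative list, and the ABI tag below is only right for standard CPython builds (no debug/free-threaded variants):

```python
import sys
import sysconfig

# Rough "best" wheel tag for the running interpreter,
# e.g. cp312-cp312-linux_x86_64 on CPython 3.12 / Linux.
impl = "cp" if sys.implementation.name == "cpython" else sys.implementation.name
pyver = f"{sys.version_info.major}{sys.version_info.minor}"
# wheel tags normalize dashes and dots in the platform string to underscores
platform = sysconfig.get_platform().replace("-", "_").replace(".", "_")
print(f"{impl}{pyver}-{impl}{pyver}-{platform}")
```

Comparing this against the filenames under https://pypi.org/project/h3/#files shows whether a prebuilt wheel exists for your interpreter, or whether pip will fall back to building from the sdist.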
For git repositories it's basically the same issue -- there are no wheels, so it takes a while to build them. Kinda odd that there is no builtin support for parallelization during wheel building. There are a couple of options as I see it; one is: during `hpi module install`, clone and build wheels manually in parallel -- after that, use a normal serial `pip install` against these wheels. It won't handle any transitive dependencies though.

I think with uv perhaps this won't be an issue anymore.
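The clone-and-build-in-parallel option could be sketched like this -- the repo URLs are illustrative, and `wheel_cmd`/`build_all` are hypothetical helpers, not part of HPI. `pip wheel --no-deps` builds without installing, and each invocation works in its own temp build dir, so running several at once should be safe-ish:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Illustrative list -- in practice this would come from the enabled modules.
REPOS = [
    "git+https://github.com/karlicoss/ghexport",
    "git+https://github.com/karlicoss/goodrexport",
]

def wheel_cmd(url, out_dir="wheels"):
    # Build a wheel for one VCS requirement into out_dir, without installing.
    return [sys.executable, "-m", "pip", "wheel", "--no-deps", "-w", out_dir, url]

def build_all(urls, out_dir="wheels", jobs=4):
    # Build all wheels concurrently; any build failure raises via check=True.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        list(pool.map(lambda u: subprocess.run(wheel_cmd(u, out_dir), check=True), urls))

# After build_all(REPOS), a single serial install picks the wheels up, e.g.:
#   pip install --find-links wheels ghexport goodrexport
```

The final install stays serial, so the metadata races from parallel pip go away -- but as noted above, transitive dependencies not covered by the wheel dir would still be resolved by that last pip invocation.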
E.g. see here https://github.com/karlicoss/HPI/pull/304 -- and we had some other issues with it before. I guess it could use some stress testing to figure out what actually results in race conditions, or perhaps make it a bit more defensive (add some retries) -- at least during CI this should be acceptable.