populationgenomics / production-pipelines

Genomics workflows for CPG using Hail Batch
MIT License

Pin this random package to avoid pip dependency resolution backtracking #749

Closed jmarshall closed 1 month ago

jmarshall commented 1 month ago

Initialising dataproc jobs is currently timing out, and it appears to be due to pip taking forever to install our suite of packages.

I've reproduced pip taking forever to install cpg-workflows (alone) on a local Linux VM. After reading this pip info, I was able to resolve it by adding a pin that allows only recent versions of the first package for which I saw pip downloading dozens of versions in its attempts to satisfy vague constraints.

Adding this reduced the time for pip install cpg-workflows to < 1 minute, but there may be additional problems as well when other packages are added to the mix…
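As a rough illustration of why a lower bound helps, here is a toy sketch of how a pin shrinks the set of versions a resolver may have to try. This is not pip's actual resolver, and the release history, the `1.35.0` floor, and the helper names `parse_version` and `candidates` are all made up for the example; none of them come from this PR.

```python
from __future__ import annotations

# Hypothetical illustration only: the versions and helper names below are
# invented and do not describe the package pinned in this PR.


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a simple dotted release string like '1.24.3' into (1, 24, 3)."""
    return tuple(int(part) for part in text.split("."))


def candidates(available: list[str], minimum: str | None = None) -> list[str]:
    """Return the versions a resolver could try, newest first.

    With no minimum, every published version is a candidate; with a
    lower bound, only versions at or above it remain.
    """
    floor = parse_version(minimum) if minimum is not None else None
    kept = [v for v in available if floor is None or parse_version(v) >= floor]
    return sorted(kept, key=parse_version, reverse=True)


# Made-up release history for a made-up transitive dependency.
published = [f"1.{minor}.0" for minor in range(40)]

unconstrained = candidates(published)
pinned = candidates(published, minimum="1.35.0")

print(len(unconstrained), "candidates without a lower bound")
print(len(pinned), "candidates with a lower bound:", pinned)
```

In a real requirements list the equivalent would be a single specifier along the lines of `some-package>=1.35.0` (placeholder name and version), which leaves pip far fewer versions to download and inspect while backtracking.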

Background: https://centrepopgen.slack.com/archives/C030X7WGFCL/p1715819263267179

This seems to have some similarity to #510, in which we had to add some packages we had never heard of to ensure prerequisites of prerequisites were installed and the tests would run.