Pick the first wheel matching criteria when using multiple index URLs

smonani commented 5 years ago

What's the problem this feature will solve?

We have an implementation of a multi-level custom PyPI with 4 or more levels. Here is how our typical package install flow works:

1. pip install --extra-index-url --extra-index-url --extra-index-url --extra-index-url
2. Look for packages matching the package spec in all the extra index URLs and the default index URL.
3. Install the package version from wherever it was found.

However, when doing this, we have noticed that pip takes exponentially longer as you keep on adding more extra index URLs to the call.

Describe the solution you'd like

While I would not like to change pip's default behavior, I would like to suggest adding an optional parameter that acts as a switch. When provided, instead of searching all the extra index URLs as well as the index URL and then installing the matching package at the end, pip would stop the search once the first matching entry is found and install that package.

I was thinking of calling it --install-first-result, but we can choose any name that works.
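
The difference between the current behavior and the proposed switch can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not pip's actual code: `find_all`, `find_first`, and the `lookup` callable are hypothetical names, and versions are reduced to tuples of integers.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Version = Tuple[int, ...]
# Hypothetical callable: given an index URL, return the versions that index
# offers for the requested package (an empty list if it has none).
Lookup = Callable[[str], List[Version]]


def find_all(indexes: Iterable[str], lookup: Lookup) -> Optional[Tuple[str, Version]]:
    """Query every index, then pick the highest version found anywhere."""
    best: Optional[Tuple[str, Version]] = None
    for url in indexes:
        for version in lookup(url):
            if best is None or version > best[1]:
                best = (url, version)
    return best


def find_first(indexes: Iterable[str], lookup: Lookup) -> Optional[Tuple[str, Version]]:
    """Stop at the first index that has any match and use its highest version."""
    for url in indexes:
        versions = lookup(url)
        if versions:
            return (url, max(versions))
    return None
```

With `find_first`, indexes listed after the first hit are never contacted, which is where the time saving would come from. The trade-off is that a newer version on a later index would be ignored, so the order of the index URLs starts to matter.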

Alternative Solutions

We have tried making our PyPI sources faster, but in many cases they are geographically distributed, so they can only get so fast without becoming prohibitively expensive.

Additional context

If this feature isn't a high priority, I am open to submitting a PR. Any pointers to the code that performs this search would be very helpful in that case.

cjerdonek commented 5 years ago

Do you really mean "exponentially" longer, or just "linearly" longer?

smonani commented 5 years ago

It's not strictly exponential, but it does seem to be a lot more than linear. In our testing, for example, a single PyPI source takes approximately 30 seconds to install a package with all its dependencies. Adding 3 extra index URLs increases the time to 30 minutes, rather than the roughly 2 minutes that linear scaling would suggest.

cjerdonek commented 5 years ago

Can you try using the --log option to find out where the time is going / what is taking so long? (New versions of pip include timestamps at each line.) E.g. maybe some of the index URLs have a slower internet connection.
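
One way to read such a log is to look for the largest gaps between consecutive timestamps. The sketch below assumes each timestamped line starts with a stamp shaped like `2019-03-21T10:15:30,123`; that layout, the `slowest_gaps` name, and the constants are assumptions for illustration, so adjust `STAMP_FORMAT` and `STAMP_LENGTH` to whatever the log actually contains.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

# Assumed timestamp layout at the start of each log line,
# e.g. "2019-03-21T10:15:30,123" (23 characters).
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S,%f"
STAMP_LENGTH = 23


def slowest_gaps(lines: Iterable[str], top: int = 5) -> List[Tuple[float, str]]:
    """Return the `top` largest gaps in seconds, each paired with the
    message of the line that preceded the gap."""
    stamped = []
    for line in lines:
        try:
            when = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
        except ValueError:
            continue  # Skip lines that do not start with a timestamp.
        stamped.append((when, line[STAMP_LENGTH:].strip()))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

For example, after running pip with `--log pip.log`, calling `slowest_gaps(open("pip.log"))` would list the steps that were followed by the longest waits, which should show whether one particular index accounts for most of the time.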

pypa / pip

Pick the first wheel matching criteria when using multiple index URLs #6337