Open matthewfeickert opened 8 months ago
Relevant thing to point to (cf. https://github.com/pypa/pip/issues/8606#issuecomment-1776000697) is PEP 708 – Extending the Repository API to Mitigate Dependency Confusion Attacks (currently unimplemented).
Note that uv differs from pip here in that it does give higher priority to --extra-index-url (which is different than I would have assumed!).
From uv pip install --help:
...
-i, --index-url <INDEX_URL>
The URL of the Python package index (by default: <https://pypi.org/simple>).
The index given by this flag is given lower priority than all other indexes specified via the `--extra-index-url` flag.
Unlike `pip`, `uv` will stop looking for versions of a package as soon as it finds it in an index. That is, it isn't possible for `uv` to consider versions of the same package across multiple indexes.
[env: UV_INDEX_URL=]
--extra-index-url <EXTRA_INDEX_URL>
Extra URLs of package indexes to use, in addition to `--index-url`.
All indexes given via this flag take priority over the index in `--index-url` (which defaults to PyPI). And when multiple `--extra-index-url` flags are given, earlier values take priority.
Unlike `pip`, `uv` will stop looking for versions of a package as soon as it finds it in an index. That is, it isn't possible for `uv` to consider versions of the same package across multiple indexes.
[env: UV_EXTRA_INDEX_URL=]
...
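To make the difference in the help text above concrete, here is a minimal sketch of the two lookup strategies. The index names, their contents, and both function names are made up for illustration; this is not how either tool is actually implemented, only a model of the behaviour the help text describes.

```python
from typing import Dict, List

# Hypothetical index contents: index name -> {package: [versions]}
INDEXES: Dict[str, Dict[str, List[str]]] = {
    "extra": {"numpy": ["2.1.0.dev0"]},
    "pypi": {"numpy": ["1.26.4", "2.0.0"], "six": ["1.16.0"]},
}


def first_index_candidates(package: str, priority: List[str]) -> List[str]:
    """uv-style: stop at the first index that has the package at all."""
    for name in priority:
        versions = INDEXES[name].get(package)
        if versions:
            return list(versions)
    return []


def all_index_candidates(package: str, priority: List[str]) -> List[str]:
    """pip-style: pool versions from every index before choosing one."""
    pooled: List[str] = []
    for name in priority:
        pooled.extend(INDEXES[name].get(package, []))
    return pooled


# Per the help text, --extra-index-url entries rank ahead of --index-url.
priority = ["extra", "pypi"]
print(first_index_candidates("numpy", priority))  # ['2.1.0.dev0']
print(all_index_candidates("numpy", priority))    # ['2.1.0.dev0', '1.26.4', '2.0.0']
```

With the first strategy, a package present on the extra index hides every version of that package on PyPI; with the second, the resolver sees all of them and picks among the pooled versions.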
I'm curious if that's intended. I thought that their aim was to be a drop-in replacement.
I'm curious if that's intended.
I think it is, though we could easily check (and ask).
I thought that their aim was to be a drop-in replacement.
I believe the goal is for functionality to be achieved, not for design choices to be replicated. In the limitations section of the README they note
Limitations
While uv supports a large subset of the pip interface, it does not support the entire feature set. In some cases, those differences are intentional; in others, they're a result of uv's early stage of development. For details, see our pip compatibility guide.
Yeah, it is intentional given https://github.com/astral-sh/uv/pull/2083.
If there are issues with the order of indexes, one alternative would be for Scientific Python to have a "proxy index" that merges the JSON of multiple upstream indexes.
I'm wondering if a proxy index like that would even need a permanent server or could be hosted purely on an edge cloud as it is likely stateless.
@Carreau do you have examples of these proxy indexes? I haven't heard of this before, so it would be interesting to see how they work.
Hum, it's theoretical, but basically you fetch multiple <repos>/simple/<package> pages and merge them. I believe I might have talked about this with @ivanov last June in Seattle.
Think:
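import flask
import requests
from bs4 import BeautifulSoup

app = flask.Flask(__name__)
# Upstream indexes to merge (the second URL is a placeholder)
repos = ['https://pypi.org', 'https://nightly.com']
HEAD, FOOTER = '<!DOCTYPE html><html><body>', '</body></html>'

@app.route('/simple/<package>/')
def simple(package):
    # Fetch the same project page from every upstream index
    pages = [requests.get(r + '/simple/' + package + '/') for r in repos]
    # Keep only the contents of each <body>, i.e. the file links
    bodies = [BeautifulSoup(p.text, 'html.parser').body.decode_contents() for p in pages]
    return HEAD + ''.join(bodies) + FOOTER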
That's the legacy API – I think anaconda.org only has that one – but there is a new JSON API as well, so we might need some work to figure out the details.
We should only need to serve indexes, as the downloads come from somewhere else.
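For the JSON side, a merge could look roughly like the sketch below. It assumes each upstream returns a PEP 691-style project response (a dict with `name` and a `files` list whose entries carry `filename`, `url` and `hashes`); the function name `merge_project_pages` and the example URLs are hypothetical. Since the `url` fields keep pointing at the upstream hosts, the proxy still only serves the index and never the downloads.

```python
from typing import Any, Dict, List


def merge_project_pages(pages: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge PEP 691-style project responses, keeping the first entry per filename."""
    if not pages:
        raise ValueError("need at least one upstream page")
    seen = set()
    files: List[Dict[str, Any]] = []
    for page in pages:
        for entry in page.get("files", []):
            # Earlier upstreams win when two indexes list the same filename.
            if entry["filename"] not in seen:
                seen.add(entry["filename"])
                files.append(entry)
    return {"meta": {"api-version": "1.0"}, "name": pages[0]["name"], "files": files}


# Made-up responses from two upstream indexes
nightly = {
    "name": "numpy",
    "files": [{"filename": "numpy-2.1.0.dev0-cp312-cp312-linux_x86_64.whl",
               "url": "https://nightly.example/numpy-2.1.0.dev0.whl", "hashes": {}}],
}
stable = {
    "name": "numpy",
    "files": [{"filename": "numpy-2.0.0.tar.gz",
               "url": "https://stable.example/numpy-2.0.0.tar.gz", "hashes": {}}],
}
merged = merge_project_pages([nightly, stable])
print([f["filename"] for f in merged["files"]])
```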
Here is a PoC using Flask: https://github.com/Carreau/multi-index which does work locally.
It's sync, but it should not be hard to make it async with various caching.
See https://github.com/Carreau/cloudflare-pypi-multi-index deployed on cloudflare workers on https://nightly.carreau.workers.dev/nightly
$ pip install --index-url https://nightly.carreau.workers.dev/nightly --pre --upgrade ipython
This should now just work and "merge" PyPI and https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/. We should be able to add any other PyPI mirror, or have crazy things like /random return only a subset of the wheels, or /binary strip all the tgz if whl are present, for example.
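As an illustration of the /binary idea, the filtering step could be as small as the sketch below. The function name and the choice of suffixes are assumptions for the example, not what the deployed worker does.

```python
from typing import List

# Archive suffixes treated as source distributions in this sketch
SDIST_SUFFIXES = (".tar.gz", ".tgz")


def strip_sdists_if_wheels(filenames: List[str]) -> List[str]:
    """Drop source archives when at least one wheel is listed; otherwise keep all."""
    has_wheel = any(name.endswith(".whl") for name in filenames)
    if not has_wheel:
        # Nothing binary to prefer, so leave the listing untouched.
        return list(filenames)
    return [name for name in filenames if not name.endswith(SDIST_SUFFIXES)]


listing = ["six-1.16.0.tar.gz", "six-1.16.0-py2.py3-none-any.whl"]
print(strip_sdists_if_wheels(listing))  # ['six-1.16.0-py2.py3-none-any.whl']
```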
If it's of interest we could have that owned by the tools team, which would simplify the above (and we could have usage metrics...)
This is pretty great @Carreau! Thank you for building this.
$ docker run --rm -ti python:3.12 /bin/bash
root@8cf753ac7bdf:/# python -m venv venv && . venv/bin/activate
(venv) root@8cf753ac7bdf:/# python -m pip --quiet install --upgrade pip wheel
(venv) root@8cf753ac7bdf:/# python -m pip install --index-url https://nightly.carreau.workers.dev/nightly --pre --upgrade matplotlib
Looking in indexes: https://nightly.carreau.workers.dev/nightly
Collecting matplotlib
Downloading https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/matplotlib/3.10.0.dev252%2Bg7ccfd3813b/matplotlib-3.10.0.dev252%2Bg7ccfd3813b-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.3/8.3 MB 17.8 MB/s eta 0:00:00
Collecting contourpy>=1.0.1 (from matplotlib)
Downloading https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/contourpy/1.3.0.dev1/contourpy-1.3.0.dev1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (320 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 320.2/320.2 kB 15.1 MB/s eta 0:00:00
Collecting cycler>=0.10 (from matplotlib)
Downloading cycler-0.12.1-py3-none-any.whl.metadata (3.8 kB)
Collecting fonttools>=4.22.0 (from matplotlib)
Downloading fonttools-4.53.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (162 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 162.2/162.2 kB 1.7 MB/s eta 0:00:00
Collecting kiwisolver>=1.3.1 (from matplotlib)
Downloading kiwisolver-1.4.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.4 kB)
Collecting numpy>=1.23 (from matplotlib)
Downloading https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/numpy/2.1.0.dev0/numpy-2.1.0.dev0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.3/19.3 MB 25.2 MB/s eta 0:00:00
Collecting packaging>=20.0 (from matplotlib)
Downloading packaging-24.0-py3-none-any.whl.metadata (3.2 kB)
Collecting pillow>=8 (from matplotlib)
Downloading pillow-10.3.0-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (9.2 kB)
Collecting pyparsing>=2.3.1 (from matplotlib)
Downloading pyparsing-3.1.2-py3-none-any.whl.metadata (5.1 kB)
Collecting python-dateutil>=2.7 (from matplotlib)
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)
Collecting six>=1.5 (from python-dateutil>=2.7->matplotlib)
Downloading six-1.16.0-py2.py3-none-any.whl.metadata (1.8 kB)
Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Downloading fonttools-4.53.0-cp312-cp312-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.9/4.9 MB 28.5 MB/s eta 0:00:00
Downloading kiwisolver-1.4.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 63.5 MB/s eta 0:00:00
Downloading packaging-24.0-py3-none-any.whl (53 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 53.5/53.5 kB 6.0 MB/s eta 0:00:00
Downloading pillow-10.3.0-cp312-cp312-manylinux_2_28_x86_64.whl (4.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 40.0 MB/s eta 0:00:00
Downloading pyparsing-3.1.2-py3-none-any.whl (103 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.2/103.2 kB 16.0 MB/s eta 0:00:00
Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 kB 27.6 MB/s eta 0:00:00
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, pyparsing, pillow, packaging, numpy, kiwisolver, fonttools, cycler, python-dateutil, contourpy, matplotlib
Successfully installed contourpy-1.3.0.dev1 cycler-0.12.1 fonttools-4.53.0 kiwisolver-1.4.5 matplotlib-3.10.0.dev252+g7ccfd3813b numpy-2.1.0.dev0 packaging-24.0 pillow-10.3.0 pyparsing-3.1.2 python-dateutil-2.9.0.post0 six-1.16.0
(venv) root@8cf753ac7bdf:/# python -m pip list
Package Version
--------------- -------------------------
contourpy 1.3.0.dev1
cycler 0.12.1
fonttools 4.53.0
kiwisolver 1.4.5
matplotlib 3.10.0.dev252+g7ccfd3813b
numpy 2.1.0.dev0
packaging 24.0
pillow 10.3.0
pip 24.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
six 1.16.0
wheel 0.43.0
(venv) root@8cf753ac7bdf:/#
If it's of interest we could have that owned by the tools team, which would simplify the above (and we could have usage metrics...)
SGTM. Does @scientific-python/tools-team agree?
The only question I have: I assume that you're currently paying for any Cloudflare expenses for this, and while it seems like currently the first 100,000 requests each day are free for Cloudflare, which is good, can we make it so that your payment details don't ever get charged?
Yeah, careful with Cloudflare pricing. There have been some articles where people complained about sudden changes, and in the end they had to rewrite their whole integration because they could not pay...
For the index thing: I also had that issue/concern at work. In the end I ended up checking that the SHA in my lockfile was present on the index I wanted (and being strict about not following redirections). Something I hope I can ditch with uv once it's stable.
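A check along those lines might look like the following sketch. It assumes the index page has already been fetched as a PEP 691-style JSON dict and that the lockfile hashes are plain sha256 hex digests; `missing_hashes` and the sample data are hypothetical, not the code used at work.

```python
from typing import Any, Dict, Iterable, Set


def missing_hashes(locked: Iterable[str], index_page: Dict[str, Any]) -> Set[str]:
    """Return the locked sha256 digests that the index page does not list."""
    available = {
        entry["hashes"]["sha256"]
        for entry in index_page.get("files", [])
        if "sha256" in entry.get("hashes", {})
    }
    return set(locked) - available


# Made-up index page and lockfile digests
page = {
    "name": "six",
    "files": [
        {"filename": "six-1.16.0-py2.py3-none-any.whl", "hashes": {"sha256": "aaa111"}},
        {"filename": "six-1.16.0.tar.gz", "hashes": {"sha256": "bbb222"}},
    ],
}
print(missing_hashes(["aaa111"], page))            # set()
print(missing_hashes(["aaa111", "ccc333"], page))  # {'ccc333'}
```

An empty result means every locked artifact is served by the intended index. When fetching the page with `requests`, passing `allow_redirects=False` is one way to be strict about redirections.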
can we make it so that your payment details don't ever get charged
I don't think I have payment methods set up on Cloudflare.
I also think that scientific-python should have a Cloudflare account and the tools team should get delegated access to it anyway, and we should make sure delegation works well and that we know how to debug/deploy/track metrics.
I think we can / should play with this a bit before recommending it also. In particular my PoC does not support /json because the anaconda nightly upload channel does not.
What happens when the free requests run out? Automatic charges, or denied requests? If the latter, then I would say do the migration as soon as you can.
https://developers.cloudflare.com/workers/platform/pricing/ and https://developers.cloudflare.com/workers/platform/limits/ suggest that the limit is 100,000 requests per day, reset at midnight, and that requests stop working once the limit is passed.
I think there are also 2 things:
I think we should do 1 anyway as URLs do reflect the org, and I would prefer to use scientifc-python.workers.dev instead of carreau.workers.dev
We can discuss a paid plan later, as there is a flat rate of $5/month that bumps us from 100k to 10M regardless of whether we are under or above 100k requests.
I fully support the plan above.
Clarify the need(?) and use of --index-url and --extra-index-url. I might have overcomplicated some of the instructions given a misunderstanding of priority, but there seems to be no priority and no way to enforce priority. This I think led to some of the confusion in https://github.com/scientific-python/upload-nightly-action/issues/41.
References:
pip install docs "Finding Packages" section