brettcannon / mousebender

Create reproducible installations for a virtual environment from a lock file
https://pypi.org/project/mousebender/
BSD 3-Clause "New" or "Revised" License

Repo index page w/ trailing slash in CDATA not supported #19

Open brettcannon opened 4 years ago

brettcannon commented 4 years ago

https://tensorflow.pypi.thoth-station.ninja/index/manylinux2010/AVX2/simple/ has CDATA with trailing slashes. I have not tested whether this page is supported by pip.

brettcannon commented 4 years ago

BTW this page technically violates PEP 503 due to the CDATA not matching the project name.
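
For illustration, a minimal sketch of how such a mismatch could be detected. It assumes the anchor text is expected to normalize to the same value as the last path segment of the `href`. `AnchorCollector`, `mismatched_anchors`, and the `SAMPLE` markup are hypothetical names and data made up for this example; this is not mousebender's or pip's code, and the sample is not copied from the page linked above.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse


def normalize(name):
    """Normalize a project name using the rule given in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


class AnchorCollector(HTMLParser):
    """Collect an (href, text) pair for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open anchor.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.anchors.append((self._href, "".join(self._text)))
            self._href = None


def mismatched_anchors(html):
    """Return anchors whose text does not normalize to the href's last path segment."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    mismatches = []
    for href, text in collector.anchors:
        segment = urlparse(href).path.rstrip("/").rsplit("/", 1)[-1]
        if normalize(text.strip()) != normalize(segment):
            mismatches.append((href, text))
    return mismatches


# Hypothetical index page: the first anchor text carries a trailing slash.
SAMPLE = """
<a href="/simple/tensorflow/">tensorflow/</a>
<a href="/simple/numpy/">numpy</a>
"""
print(mismatched_anchors(SAMPLE))
```

Under these assumptions only the `tensorflow/` anchor is reported, since the trailing slash survives normalization.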

brettcannon commented 4 years ago

@pradyunsg where is the parsing code in pip for --index-url? I want to see how permissive it is of what's in the CDATA.

pradyunsg commented 4 years ago

> @pradyunsg where is the parsing code in pip for --index-url?

pip._internal.index.collector:parse_links (line 352, as of today on master).

> I want to see how permissive it is of what's in the CDATA.

pip doesn't look at what's in the CDATA today. I'm on board with making it start warning about such discrepancies and, after a 2-release deprecation cycle, rejecting such pages.

brettcannon commented 4 years ago

https://github.com/pypa/pip/blob/d53e880cfec7f30636db4183d9c7eb89e7aa2d3f/src/pip/_internal/index/collector.py#L351-L374

brettcannon commented 4 years ago

Interesting that pip doesn't look at the CDATA, as that seems to be the only way to differentiate between projects whose names clash after name normalization for the URL (although that is obviously a "don't do that" kind of thing if your names differ only by symbols that will be stripped out 😉). It also makes searching near impossible when people specify the full name rather than the normalized name.
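
To make the clash concrete, here is a small sketch using the normalization rule from PEP 503 (runs of `-`, `_`, and `.` collapse to a single `-`, and the result is lowercased). `find_clashes` and the example names are hypothetical and only serve to illustrate the point.

```python
import re
from collections import defaultdict


def normalize(name):
    """Normalize a project name using the rule given in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_clashes(names):
    """Group names by normalized form and keep the groups with more than one member."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical project names that differ only in case and separators.
clashes = find_clashes(["Foo.Bar", "foo_bar", "foo-bar", "baz"])
print(clashes)
```

All three `foo`/`bar` spellings map to the same URL segment, `foo-bar`, so only the anchor text could tell them apart.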

Really shows how little people use the index part of PEP 503.

pradyunsg commented 4 years ago

> projects who have a name clash when it comes to name normalization for the URL

Well... PyPI doesn't allow such projects so we never hit that. :)

pradyunsg commented 4 years ago

I stand corrected! I just spent some time thinking about this, saw https://github.com/python-poetry/poetry/issues/1983 and realized that pip does indeed perform these checks, but only when the file extension is .whl. That's likely because we don't have any standard way to get the name out of the source distributions.

https://github.com/pypa/pip/blob/19665791fa81e3e540721928430e656d167fe307/src/pip/_internal/index/package_finder.py#L192-L195
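
As a rough sketch of the kind of check described above (not pip's actual implementation), the distribution name can be read from the start of a wheel filename and compared to the requested project after normalization. `wheel_matches_project` is a hypothetical helper, and the filenames are made-up examples. It relies on the wheel filename convention that the distribution name is the part before the first `-`.

```python
import re


def normalize(name):
    """Normalize a project name using the rule given in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def wheel_matches_project(filename, project):
    """Return True if the wheel's distribution name normalizes to the project's name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # The distribution name is everything before the first "-" in the filename.
    distribution = filename[: -len(".whl")].split("-", 1)[0]
    return normalize(distribution) == normalize(project)


print(wheel_matches_project(
    "tensorflow-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl", "tensorflow"))
print(wheel_matches_project(
    "tensorflow_gpu-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl", "tensorflow"))
```

No equivalent is sketched for source distributions, for the reason given above: there is no standard way to get the name out of them.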
