mozilla / pipstrap

INACTIVE - http://mzl.la/ghe-archive - A small script that can act as a trust root for installing pip 8
MIT License

pipstrap behind great firewall of china #8

Closed kevinkle closed 7 years ago

kevinkle commented 7 years ago

Moving the discussion of https://github.com/certbot/certbot/pull/4711 here, as suggested by @zjs and @bmw. @erikrose, the issue in certbot is that some users have http://pypi.python.org/ blocked. Is there a recommended way to handle this while still using pipstrap to hash-check pip?

erikrose commented 7 years ago

I think the best way to go may be an environment variable which contains the analogue of "https://pypi.python.org". This represents a change of mind from my previous opinion, which held env vars too invisible for such a security-sensitive application. However, the ease with which we could pass params into embedded runs of pipstrap (as certbot-auto uses) and the fact that we can mitigate the invisibility with a printed advisory ("Hey, you seem to be using an alternate PyPI address: www.smoo.com. If that isn't what you want, panic.") make them compelling.
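A minimal sketch of that idea, in Python: read an optional base URL from the environment and print an advisory whenever it overrides the default. The variable name `PIPSTRAP_INDEX_BASE` and the helper `get_index_base` are hypothetical, picked only for illustration; nothing in this thread fixes the actual names.

```python
import os
import sys

DEFAULT_INDEX_BASE = 'https://pypi.python.org'
# Hypothetical variable name, used here only for illustration.
ENV_VAR = 'PIPSTRAP_INDEX_BASE'


def get_index_base(environ=None, out=sys.stderr):
    """Return the base URL to download from, warning if it was overridden."""
    environ = os.environ if environ is None else environ
    override = environ.get(ENV_VAR, '').strip().rstrip('/')
    if not override or override == DEFAULT_INDEX_BASE:
        return DEFAULT_INDEX_BASE
    # Make the otherwise invisible setting visible to the user.
    out.write('You seem to be using an alternate PyPI address: %s. '
              'If that is not what you want, unset %s.\n'
              % (override, ENV_VAR))
    return override
```

Passing `environ` and `out` as parameters is just a convenience for testing; an embedded run (as in certbot-auto) could simply export the variable before invoking the script.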

erikrose commented 7 years ago

In summary, I think the env var, along with a helpful printed message (as in your https://github.com/certbot/certbot/pull/4711) when the pypi.python.org requests fail, would be a fine solution to this.

erikrose commented 7 years ago

Working on this now. But do you know, for testing purposes, of any good PyPI mirrors accessible from China? I've tried a few from https://pypi-mirrors.org/, but they seem to have empty dirs where even common packages like pip should be.

kevinkle commented 7 years ago

Hey Erik, it looks like folks are using http://mirrors.aliyun.com/pypi/simple/ (from https://github.com/certbot/certbot/issues/2516).

erikrose commented 7 years ago

Ah. I was hoping the download locations would be analogous, but https://mirrors.aliyun.com/pypi/packages/source/p/pip/pip-8.0.3.tar.gz (which parallels the address in PyPI proper) doesn't exist. pipstrap, in order to be short, simple, and auditable, doesn't implement the Package Index API; it doesn't follow the links and find the package in its actual location on the mirror. Hmm. No simple variable substitution will do here; it will require more thought about tradeoffs. Alternatives I can think of:

  1. Implement the client side of the Package Index API. Requires HTML parsing, etc. If we ripped it off from pip, it would be at least 300 lines. That's twice as long as the entirety of pipstrap now.
  2. Host the 4 files we need somewhere that isn't firewalled off. Leads to a cat-and-mouse game.
  3. Hard-code in the hashed paths that seem to be constant across all available mirrors listed at https://pypi-mirrors.org/.

erikrose commented 7 years ago

I'm curious what they're using to do the mirroring and what it's hashing. The hashes are the right length to be sha256s, but they don't appear to be hashing 'argparse', 'argparse-1.0.zip', or the contents of the file. I've read the source to bandersnatch and pypiserver and found no such hashing.
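One way to test guesses like these systematically is sketched below. It assumes the hex digest is spread across the directory segments of the path (everything before the filename), and the list of algorithms tried is my own assumption: SHA-256 because of the length, and BLAKE2b with a 32-byte digest because it also yields 64 hex characters. Nothing in this thread confirms which function produces the paths.

```python
import hashlib


def path_digest(hashed_path):
    """Join the hex directory segments of a path like 'aa/bb/cc.../file'."""
    parts = hashed_path.strip('/').split('/')
    return ''.join(parts[:-1])  # everything except the filename


def matching_candidates(hashed_path, candidates):
    """Return labels of candidate inputs whose digest equals the path digest.

    `candidates` maps a label to the bytes to hash.
    """
    target = path_digest(hashed_path)
    # Assumed set of algorithms; both produce 64 hex characters.
    algorithms = {
        'sha256': lambda data: hashlib.sha256(data).hexdigest(),
        'blake2b-256': lambda data: hashlib.blake2b(
            data, digest_size=32).hexdigest(),
    }
    return sorted(
        '%s(%s)' % (algo, label)
        for label, data in candidates.items()
        for algo, digest in algorithms.items()
        if digest(data) == target
    )
```

Calling it with candidates such as the package name, the filename, and the downloaded file's bytes returns an empty list when none of the guesses match.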

erikrose commented 7 years ago

Aha, PyPI proper comes up with the paths. The files are available both at those hashed paths and at the guessable ones I currently use. So we'll just switch over to the hashed ones, which are sure to be the same everywhere, for all cases, and the rest will be solvable through simple substitution of a base URL.
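Roughly, the substitution could look like the sketch below. The helper name `package_url` is hypothetical, and the path in the usage note is a made-up placeholder, not a real PyPI path.

```python
DEFAULT_INDEX_BASE = 'https://pypi.python.org'


def package_url(index_base, hashed_path):
    """Join an index base URL and a hashed package path.

    `hashed_path` is the part after 'packages/', assumed to be identical
    on PyPI proper and on every mirror.
    """
    return index_base.rstrip('/') + '/packages/' + hashed_path.lstrip('/')
```

With a placeholder path `aa/bb/<rest-of-digest>/pip-8.0.3.tar.gz`, `package_url(DEFAULT_INDEX_BASE, path)` and `package_url('https://mirrors.aliyun.com/pypi', path)` differ only in the base, so the hash check on the downloaded file stays exactly the same.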

erikrose commented 7 years ago

Thanks to @hannosch for noticing this!