Implement "hook" support for package signature verification.

nejucomo commented 11 years ago

Synopsis

Some people want package signature verification during their pip installs. Other people think relying on authenticated package repository connections (such as over TLS) is sufficient for their needs.

Among those who want package signature verification, there is disagreement about how to tell pip which signatures to trust (and how users will manage package-signing public keys).

Rationale

The rationale for this ticket is to provide a mechanism in mainline pip for signature verification enthusiasts to experiment with different approaches. If a particular approach becomes popular, pip could consider incorporating that particular approach.

In the meantime, rather than have endless committee-style-arguments about how to do package verification, we should have a system that lets users choose for themselves, but only if they opt in.

Also, it keeps package verification cleanly separate from the pip codebase.

Criteria

This ticket may be marked as wontfix, or some other status to indicate that the pip developers reject this proposal.

This ticket may be marked closed, only when these conditions are met:

A user can configure an arbitrary "hook" to process downloaded packages prior to proceeding.
- By default, this hook is configured to nothing, and pip with this feature behaves like pip without this feature.
- A hook provides an interface which:
  - takes two inputs:
    1. a path to a local file, which is a newly downloaded package
    2. a URL which the file was retrieved from
  - and returns a single boolean indicating "accept/reject".
- When a user has a hook configured, it is invoked on every downloaded package file before any other processing. If it returns true, pip proceeds as normal. If it returns false, pip logs an error and exits.
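The criteria above could be sketched in Python roughly as follows. This is only an illustration of the contract, not pip code: the names `VerifyHook`, `check_download`, and `https_only`, the logger name, and the exit code 1 are all assumptions made for the example.

```python
import logging
import sys
from pathlib import Path
from typing import Callable, Optional

logger = logging.getLogger("hook_sketch")  # hypothetical logger name

# Hypothetical type alias: (local package path, source URL) -> accept?
VerifyHook = Callable[[Path, str], bool]


def check_download(hook: Optional[VerifyHook], package_path: Path, url: str) -> None:
    """Run the configured hook, if any, before any other processing."""
    if hook is None:
        # Default: no hook configured, so behave as if the feature did not exist.
        return
    if not hook(package_path, url):
        # Rejected: log an error and exit (the exit code is an assumption).
        logger.error("Verification hook rejected %s (from %s)", package_path, url)
        sys.exit(1)


def https_only(package_path: Path, url: str) -> bool:
    """Toy hook for illustration only: accept files fetched over HTTPS."""
    return url.startswith("https://")
```

A real hook would inspect the file at `package_path` (for example, by checking a signature); the toy hook here only looks at the URL to keep the sketch short.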

Implementation Details

I prefer a hook api where the config specifies a path to an executable in pip's config file. The inputs are passed as commandline arguments to a subprocess which invokes that command. The hook's stdout & stderr are the same as the parent pip process. The exit status is 0 to indicate "accept package" and non-zero to indicate "reject package".

That said, I'd be happy with any system that fulfills the Criteria above.
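The subprocess variant described in this section might look something like the sketch below. The function name `run_hook` is hypothetical, and how the executable path would be read from pip's config file is deliberately left out, since no config key is specified here.

```python
import subprocess
from pathlib import Path


def run_hook(hook_executable: str, package_path: Path, url: str) -> bool:
    """Invoke the hook as a subprocess; exit status 0 means "accept package"."""
    # The two inputs are passed as command-line arguments. stdout and stderr
    # are not redirected, so the hook writes to the same streams as the parent.
    result = subprocess.run(
        [hook_executable, str(package_path), url],
        check=False,  # a non-zero status is a normal "reject", not an exception
    )
    return result.returncode == 0
```

Because the hook is just an executable, it can be written in any language and can wrap whatever verification tool the user prefers, which keeps the verification logic out of the pip codebase.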

Related Issues

Note, there is a less-well-specified ticket #425. I made this ticket because the vagueness of that ticket makes it difficult to close. (Is #425 satisfied by TLS authentication to package repositories based on a standard OS or user trust root? Does it imply or require package signature verification?)

rhuddleston commented 6 years ago

gpg checks are much better than checksums. A valid signature tells me that the person who published this package is the same person who published it when I first started using it. Even if only a minority of people downloaded the keys via a different channel and installed only packages correctly signed with those specific keys, that minority (e.g. security professionals) would notice when a popular package's key changes and would investigate. Basically it's a second factor.

For example, if I managed to log in to someone's PyPI account for a popular package, I could replace the package with my slightly modified version that includes an extra backdoor or malware, and no one would easily notice.

The thinking that GPG is not perfect, so it's better to have nothing, is misguided. For example, that is what happened here: https://github.com/conda/conda/issues/1395

It seems there has been discussion about including TUF since 2011, yet are there any imminent plans for when this might be released? If there is a commitment to implement TUF, are there any major roadblocks to getting this done?

Another issue with TUF is that it's not easy for many people to understand. The only place I've seen TUF implemented is with Notary and Docker. For example, all "official" repos on Docker Hub are signed, and anyone can use DOCKER_CONTENT_TRUST=1 to pull and validate these images. On the other hand, almost no one else signs images when uploading to Docker Hub. I think a big part of this is because it's a lot more complicated than gpg to set up and use. I'm hoping to create some easy step-by-step articles to make this easier so more people will use it.

I worry that even if TUF is implemented on PyPI, no one will bother if it's too difficult for developers to use. If I build a perfect security system and no one uses it, then overall we will not be better off. For example, if we can get a majority of maintainers to add gpg signatures, that will be a big improvement. If you required gpg or TUF before someone is allowed to publish to PyPI, this could be a big advantage.

When TUF is implemented on PyPI, will this be a requirement? How can we make sure this is a simple process so developers won't find it to be a burden? Basically, it will need to be easier than Docker Hub, if that is any indication of the adoption rate.

ncoghlan commented 6 years ago

On 6 October 2017 at 05:12, rhuddleston notifications@github.com wrote:

> gpg checks are much better than checksums. A valid signature tells me that the person who published this package is the same person who published it when I first started using it. Even if only a minority of people downloaded the keys via a different channel and installed only packages correctly signed with those specific keys, that minority (e.g. security professionals) would notice when a popular package's key changes and would investigate. Basically it's a second factor.
>
> For example, if I managed to log in to someone's PyPI account for a popular package, I could replace the package with my slightly modified version that includes an extra backdoor or malware, and no one would easily notice.

And if we were to support distributing signing keys through PyPI, then you'd just post a "Hey, I lost the old key, here's my new key" notice.

In both cases, the real-world defence is the same: the publisher notices that someone other than them made a new release and alerts the PyPI admins, the malicious package gets taken down, and a security notification gets issued by the PSRT. (The prohibition on replacing existing versions with artifacts that have a different hash already prevents silent replacement of previous releases.)

Adding 2FA support so publishers' accounts are harder to compromise in the first place will also be a useful hardening mechanism, and is one of the items high on the post-Warehouse-migration todo list.

> The thinking that GPG is not perfect, so it's better to have nothing, is misguided.

No, that's not the thinking. The thinking is that the folks who claim to see meaningful value in this approach are refusing to invest the time & energy to build it themselves (or pay a vendor to build & operate it for them), so they clearly don't actually care all that much. They just want to claim that someone else is preventing them from doing that work, rather than closely examining their real reasons for not wanting to build it.

Remember, the PyPA devs and PyPI administrators are potentially part of your threat model when it comes to end-to-end signing, so it's necessary to make sure that a single rogue admin or developer can't easily compromise whatever you decide to set up. That's a lot easier to do if the security mechanism is independent of the services we operate and the tools we provide. (This is in stark contrast to the Linux distro model, where GPG signatures are used to protect the distro->user link, nothing more, with each distro relying on its own internal mechanisms to ensure that what gets signed is what they intended to publish).

> For example, that is what happened here: https://github.com/conda/conda/issues/1395

Making misleading security claims is indeed worse than not making any security claims at all.

> It seems there has been discussion about including TUF since 2011, yet are there any imminent plans for when this might be released? If there is a commitment to implement TUF, are there any major roadblocks to getting this done?

I last summarised the major roadblocks to that here: http://www.curiousefficiency.org/posts/2016/09/python-packaging-ecosystem.html#making-pypi-security-independent-of-ssl-tls

Mozilla have now provided funding for the legacy PyPI shutdown effort through a Foundational MOSS grant, and the PSF staff will also be actively working on more sustainable operational funding models for PyPI as part of that effort.

> Another issue with TUF is that it's not easy for many people to understand. The only place I've seen TUF implemented is with Notary and Docker. For example, all "official" repos on Docker Hub are signed, and anyone can use DOCKER_CONTENT_TRUST=1 to pull and validate these images. On the other hand, almost no one else signs images when uploading to Docker Hub. I think a big part of this is because it's a lot more complicated than gpg to set up and use. I'm hoping to create some easy step-by-step articles to make this easier so more people will use it.

GPG is also incredibly complicated to set up and use, especially on Windows.

In most cases though, publishers aren't bothering with content signing on DockerHub for the same reason they don't bother with it on PyPI: because they don't personally care, and nobody else cares enough about it to pay them to care.

As a result, the only places we've seen signing work in practice are those where there's a trusted intermediary acting as a signing authority (e.g. Linux distros, verified images on container image registries, Microsoft-backed driver & installer signatures for Windows, the various mobile and desktop app stores). Even HTTPS relies on that trusted intermediary approach (by way of browser and operating system CA certificate bundles).

> I worry that even if TUF is implemented on PyPI, no one will bother if it's too difficult for developers to use. If I build a perfect security system and no one uses it, then overall we will not be better off. For example, if we can get a majority of maintainers to add gpg signatures, that will be a big improvement. If you required gpg or TUF before someone is allowed to publish to PyPI, this could be a big advantage.

Have you even read PEPs 458 and 480? The challenge of publisher adoption is discussed explicitly, and is in fact the entire reason there are two distinct PEPs (since we expect real-world adoption of the publisher-dependent end-to-end variant described in PEP 480 to be so close to zero as to be barely worth implementing).

pypa / pip

Implement "hook" support for package signature verification. #1035