erikrose / peep

A "pip install" that is cryptographically guaranteed repeatable
MIT License

integrity of main package? #110

Closed ThomasWaldmann closed 8 years ago

ThomasWaldmann commented 8 years ago

peep checks all the stuff it finds in requirements.txt, but how does it check that the main package is as expected (and thus, that the code in there or the hashes in requirements.txt have not been tampered with)?

So, if I haven't missed something, could it work like this for the main package?

peep install --sha256 <documented_hash_from_safe_source> mainpackage

erikrose commented 8 years ago

I have so far assumed that people have their own ways of trusting their source code. However, given that SHA1 is not very collision resistant anymore and git hashes are thus no longer sufficient, it wouldn't be a bad idea to deploy from a zip or PyPI checkout and check it against a local SHA256 that you computed before uploading the package. (Note that hashes of directories, like VCS checkouts, are unsupported at the moment: metadata is unstable, and temp files commonly worm their way in, so it's a tricky problem.) For now, you could make a one-package requirements file and run peep twice: once against that and again against the requirements file included in your top-level package (if, in fact, it ships with the sdist, which it typically doesn't).
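A minimal sketch of that workaround in Python, not taken from peep itself: it computes a local SHA256 for a package archive, checks an archive against a recorded digest, and writes a one-package requirements file. The function names, and the assumption that peep expects the digest as unpadded URL-safe base64 in a `# sha256:` comment above the requirement line, are illustrative and should be checked against the peep version in use.

```python
import hashlib
from base64 import urlsafe_b64encode
from pathlib import Path


def sha256_hex(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as archive:
        for chunk in iter(lambda: archive.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def peep_style_hash(path):
    """Return the digest as unpadded URL-safe base64 (assumed peep format)."""
    raw = bytes.fromhex(sha256_hex(path))
    return urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def verify_archive(path, expected_hex):
    """Raise ValueError if the archive does not match the recorded digest."""
    actual = sha256_hex(path)
    if actual != expected_hex:
        raise ValueError(f"hash mismatch for {path}: got {actual}")


def write_one_package_requirements(req_path, archive_path, requirement):
    """Write a requirements file that pins one requirement by hash.

    `requirement` is a hypothetical pin such as "mainpackage==1.0".
    """
    text = f"# sha256: {peep_style_hash(archive_path)}\n{requirement}\n"
    Path(req_path).write_text(text)
```

With a file like that in hand, the two-step install would be one `peep install -r` run against the one-package file, followed by a second run against the requirements file shipped with the top-level package, if there is one.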

I'll give this some thought. Peep's functionality is merging into pip 8, so I was already on the track of creating some kind of lightweight bootstrapper to get that whole heavier toolchain going without trusting virtualenv, pip, and setuptools.

Cheers!

ThomasWaldmann commented 8 years ago

Well, not everybody is installing from git. Normal users just do "pip install ..." (or "peep install ...").

BTW, please also consider that hashes are not the best way to assure authenticity (esp. if the hash comes from the same source as the package). In general, one needs a digital signature instead (like a gpg sig plus a way to trust the signing key). A 3rd party can easily regenerate hashes so that they match any tampered content. As pip already supports uploading gpg signed packages, that might be something to consider. Both ways could be combined.
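The point about regenerated hashes can be illustrated with a small Python sketch. The byte strings and variable names are made up for the example. It shows only the hash side of the argument: a hash served alongside a tampered package still matches, while a hash recorded independently does not. It does not implement signature verification.

```python
import hashlib


def digest(data):
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def matches(data, published_hash):
    """Check a package's bytes against a published hex digest."""
    return digest(data) == published_hash


original = b"print('hello')\n"
tampered = b"print('hello'); steal_credentials()\n"

# Hash recorded earlier, through a channel independent of the download server.
pinned_hash = digest(original)

# An attacker who controls the server replaces the package and its hash together.
served_package = tampered
served_hash = digest(tampered)

same_source_check = matches(served_package, served_hash)  # True: tampering goes unnoticed
pinned_check = matches(served_package, pinned_hash)       # False: tampering is detected
```

The hash only helps when it reaches the installer by a path the attacker does not control, which is the gap a signature with a trusted key is meant to close.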

anarcat commented 8 years ago

+1 for integrating pip's gpg authentication system in here... if all that was needed was the pubkey of the author of the software in requirements.txt, that would be great!

we could even just rely on the web of trust otherwise (or at least try to)...

erikrose commented 8 years ago

@ThomasWaldmann Peep has never been about authenticity. As you say, pip + wheels offers author trust already. But I don't have any interest in trusting authors, along with every possibly buggy release they make in the future. Nor do I trust their key management with Mozilla's production systems. In peep's model, you examine the code, decide if you trust that specific release, and then lock it down with locally generated hashes. It's about repeatability. Signatures solve a different problem. Now, signing your git commits or your top-level package is a good idea. When pip 8 comes out, you'll be able to have your signatures and your hashes too, assuming you ship a requirements file with your top-level package (not common practice at the moment).

erikrose commented 8 years ago

As for hash-verifying the top-level package, I've filed a ticket about it on pip, which is where I think we should solve it: https://github.com/pypa/pip/issues/3257.