l0kod closed 9 years ago
Following the RubyGems.org compromise, there are currently some discussions about protecting Python package downloads: Surviving a Compromise of PyPI (PEP 458 and PEP 480).
@l0kod what you're describing sounds a lot like you're trying to reinvent TUF, and PEP 458/PEP 480 is an implementation of TUF.
Specifically, TUF packages can be "unclaimed" and signed by an online key held by the packaging system itself, or "claimed" and signed by the author(s) of the packages (using a threshold signature scheme).
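To make the "threshold" idea concrete, here is a minimal sketch of the counting rule only: a message is accepted when at least `threshold` distinct trusted keys have produced a valid signature over it. The names `meets_threshold` and `stand_in_verify` are hypothetical, and the HMAC-based verifier is purely a placeholder so the example runs with the standard library; a real deployment would use public-key signatures, and this is not TUF's actual API.

```python
import hashlib
import hmac


def stand_in_verify(key: bytes, message: bytes, signature: bytes) -> bool:
    # Placeholder only: HMAC is a shared-secret MAC, NOT a public-key signature.
    # It is used here so the sketch runs without third-party libraries.
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)


def meets_threshold(message, signatures, trusted_keys, threshold,
                    verify=stand_in_verify):
    """Return True if at least `threshold` distinct trusted keys signed `message`.

    signatures:   dict mapping key id -> signature bytes
    trusted_keys: dict mapping key id -> key bytes
    """
    valid = set()
    for key_id, signature in signatures.items():
        key = trusted_keys.get(key_id)
        # Signatures from unknown keys are ignored rather than counted.
        if key is not None and verify(key, message, signature):
            valid.add(key_id)
    return len(valid) >= threshold


# Hypothetical owners and package metadata, for illustration only.
keys = {"alice": b"key-a", "bob": b"key-b", "carol": b"key-c"}
message = b"example-crate-1.0.0"
signatures = {
    name: hmac.new(keys[name], message, hashlib.sha256).digest()
    for name in ("alice", "bob")
}
two_of_three = meets_threshold(message, signatures, keys, threshold=2)
three_of_three = meets_threshold(message, signatures, keys, threshold=3)
```

With two of the three owners signing, the 2-of-3 check passes and the 3-of-3 check does not.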
If you don't intend to implement TUF wholesale, I'd suggest reading their paper, sketching out a threat model, designing a usability story, and coming up with something a bit more concrete:
I'll leave a note on this specifically:
The hard step: only allow crates with signed Git tags (from their owner)
The problem with using Git tags (beyond the actual mechanics of how you intend to use them) is they rely on the SHA1 hash function for integrity, as this is what's used for Git's Merkleized DAG.
SHA1 is already known to be problematic, and SHA256+ is recommended for new projects.
If a second preimage attack were found against SHA1, it would be catastrophic for anything using git tags in this capacity.
Granted, a second preimage attack has yet to be found even against MD5, which is known to be much more broken than SHA1 (it is trivial to collide MD5 using chosen prefixes). However, if we don't rely on git for computing digests, we can pick a much stronger hash function.
I'd also note that Linus does not consider Git's usage of SHA1 to be a "security feature", so it's questionable as to whether Git will end up using a stronger hash function in the future:
Nobody has been able to break SHA-1, but the point is the SHA-1, as far as Git is concerned, isn't even a security feature. It's purely a consistency check. The security parts are elsewhere, so a lot of people assume that since Git uses SHA-1 and SHA-1 is used for cryptographically secure stuff, they think that, OK, it's a huge security feature. It has nothing at all to do with security
-- Linus Torvalds
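As an illustration of computing digests independently of git, the following sketch hashes a package archive with SHA-256 and compares the result against a recorded value. The function names and the idea of a "recorded digest" stored alongside the package are assumptions for the example, not a description of how crates.io works.

```python
import hashlib
import hmac
import io


def sha256_hex(fileobj, chunk_size=65536):
    """Stream a binary file object through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def matches_recorded_digest(fileobj, recorded_hex):
    """Compare the computed digest with a recorded one."""
    return hmac.compare_digest(sha256_hex(fileobj), recorded_hex.lower())


# Hypothetical archive contents, held in memory for the example.
archive = b"pretend this is a .crate tarball"
recorded = hashlib.sha256(archive).hexdigest()
ok = matches_recorded_digest(io.BytesIO(archive), recorded)
tampered = matches_recorded_digest(io.BytesIO(archive + b"!"), recorded)
```

Swapping `hashlib.sha256` for `hashlib.sha512` or another function is a one-line change, which is the flexibility lost when the digest is whatever git happens to use.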
I'm going to close this in favor of https://github.com/rust-lang/crates.io/issues/75 in light of @tarcieri's comment. I also wouldn't mind creating a more general "securing crates.io" issue, but it sounds like https://github.com/rust-lang/crates.io/issues/75 is quickly becoming accepted practice and may be the best route for that regardless.
As an aside, I'd advocate using a memory-hard hash function like Argon2 for tools like git. If quantum computers are possible, then Grover's algorithm halves your security parameter. Just because you adopt 256+ bit hashes doesn't mean people use all the bits when discussing commits. Argon2 should protect against Grover unless an adversary builds a quantum computer with millions or billions of qubits, assuming no other quantum algorithm attacks it specifically.
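The remark about people not using all the bits can be shown with a purely classical experiment: when a digest is abbreviated to a few hex characters, a birthday search finds two inputs sharing that abbreviation almost immediately, whatever the full digest length. This is a minimal sketch with a hypothetical helper name; it says nothing about quantum attacks.

```python
import hashlib


def find_prefix_collision(prefix_hex_chars=4):
    """Find two distinct inputs whose SHA-256 hex digests share a prefix.

    With 4 hex characters (16 bits) a collision is guaranteed within
    16**4 + 1 attempts and is expected after a few hundred.
    """
    seen = {}
    i = 0
    while True:
        message = str(i).encode()
        prefix = hashlib.sha256(message).hexdigest()[:prefix_hex_chars]
        if prefix in seen:
            return seen[prefix], message, prefix
        seen[prefix] = message
        i += 1


first, second, shared_prefix = find_prefix_collision(4)
```

Each extra hex character multiplies the expected work by about four, so abbreviations that are comfortable for humans offer little collision resistance.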
@burdges that doesn't make any sense. Argon2 is a password hashing algorithm. Its construction is intended to mitigate time-memory tradeoffs when brute forcing low-entropy passwords. This is completely irrelevant for selecting the hash function for a content addressable storage system like git.
For a hash that tolerates Grover and targets a 128-bit security level, we:
This puts us at a 512-bit hash function. SHA2-512 should be sufficient.
Also it sounds like you're proposing a change to git. You should probably take that up with the git maintainers.
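The 512-bit figure above can be sanity-checked against the usual generic bounds for an ideal n-bit hash. The steps of the original calculation are not reproduced here, so this is only a sketch under stated assumptions: classical preimage search costs about 2^n, classical collisions about 2^(n/2) (birthday bound), Grover preimage search about 2^(n/2), and the Brassard-Høyer-Tapp collision algorithm about 2^(n/3) queries in an idealized cost model. The function name is hypothetical.

```python
def generic_security_bits(n):
    """Approximate generic security levels (in bits) for an ideal n-bit hash."""
    return {
        "classical_preimage": n,
        "classical_collision": n // 2,     # birthday bound
        "grover_preimage": n // 2,         # quadratic speedup from Grover
        "quantum_collision_bht": n // 3,   # Brassard-Hoyer-Tapp, idealized
    }


levels_256 = generic_security_bits(256)
levels_384 = generic_security_bits(384)
levels_512 = generic_security_bits(512)
weakest_512 = min(levels_512.values())
```

Under these assumptions a 512-bit output stays above 128 bits on every line, which is consistent with one conservative reading of the argument: start from 128 bits, double once for the birthday bound, and double again for Grover.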
Ain't so likely anyone will change git soon, but if one considers reinforcing it by keeping records in the package manager, then one needs a better hash function. Argon2 is susceptible to Grover like everything, but it'd still protect against time-memory trade-offs when brute forcing higher-entropy data. It follows that Grover cannot be used without a ridiculously large quantum computer. That'd double the security of the shorter substrings that humans sometimes handle. I suspect it's easier to protect against time-memory trade-offs in a quantum computer, so maybe you don't need anything as fancy as Argon2, but I haven't thought about it carefully.
Argon2 is susceptible to Grover like everything, but it'd still protect against time-memory trade offs when brute forcing higher-entropy data
A 128-bit post-quantum security level is outside the realm of brute force attacks. SHA2-512 is sufficient.
The NSA recently updated Suite B to harden it against potential future attacks by quantum computers. This is their recommendation for a hash function:
https://www.nsa.gov/ia/programs/suiteb_cryptography/index.shtml
Use SHA-384 to protect up to TOP SECRET.
Instead of Argon2, I think BLAKE2b would be a compelling alternative, but as @tarcieri notes, SHA2-384 and SHA2-512 are sufficient.
+1 for BLAKE2b-512. It is also a good candidate and performs very well in software.
Okay, this ticket is closed but I couldn't help myself from bike shedding on hashing algos. First of all, if we used Argon2 we would burn the sh!t out of our laptops because we would never finish hashing the files 😱!
BLAKE2 is a good replacement for SHA1, but we really shouldn't put all of our eggs in one basket as BLAKE2 has a ~112 bit security level and is too similar to SHA2. KangarooTwelve is looking like a good partner for BLAKE2 as it's very fast and very different from SHA2.
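The "eggs in one basket" point can be sketched by recording two structurally different digests per artifact and requiring both to match. KangarooTwelve is not in Python's standard library, so SHA3-256 stands in here as the Keccak-based function; that substitution, the algorithm labels, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac

# Label -> constructor. SHA3-256 is a stand-in for a Keccak-based hash
# such as KangarooTwelve, which hashlib does not provide.
ALGORITHMS = {
    "blake2b-512": lambda: hashlib.blake2b(digest_size=64),
    "sha3-256": hashlib.sha3_256,
}


def multi_digest(data: bytes) -> dict:
    """Compute one hex digest per configured algorithm."""
    digests = {}
    for name, factory in ALGORITHMS.items():
        h = factory()
        h.update(data)
        digests[name] = h.hexdigest()
    return digests


def verify_all(data: bytes, recorded: dict) -> bool:
    """Accept only if every configured digest is present and matches."""
    actual = multi_digest(data)
    return all(
        name in recorded and hmac.compare_digest(actual[name], recorded[name])
        for name in ALGORITHMS
    )


recorded = multi_digest(b"example artifact")
accepted = verify_all(b"example artifact", recorded)
rejected = verify_all(b"modified artifact", recorded)
```

An attacker would then need a simultaneous collision or second preimage in both functions, at the cost of hashing every artifact twice.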
There's really no reason to use off-the-rails hash functions (despite my +1 to BLAKE2 above).
SHA(2)-256 is used almost universally due to widespread hardware support, including but not limited to AVX2 implementations (not used in the BLAKE2 benchmark chart or benchmarks showing SHA-512 is faster than SHA-256, for example, but supporting CPUs dating back to 2008) as well as the new SHA Extensions and AVX-512 introduced in e.g. Skylake. SHA-256 is also pretty much going to be the only hash function you'll commonly find in hardware on MCUs.
Keccak-based hash functions have an internal state of 1600 bits, making them much less amenable to e.g. AVX-512 optimizations (which can run 4 SHA-256 rounds in parallel using simultaneous hashing), though yes, KangarooTwelve looks kind of neat here. The use of a sponge function makes them nicely distinct from the SHA2 family, however.
If you want a Keccak-based alternative to SHA2-256, there's little reason not to use SHA3-256. This is the only Keccak-based construction likely to be implemented in hardware at some point in the future (aside from the other SHA3 variants and possibly the SHAKEs), and also the one CPU manufacturers will be targeting for optimized software implementations.
There is also little reason to suspect quantum computers will be able to attack either SHA2-256 or SHA3-256, despite the NSA recommending a move to SHA2-384+. In a post-quantum scenario, both will still have 128 bits of preimage resistance, and quantum attacks are unlikely to improve the performance of collision-finding.
This is the only Keccak-based construction likely to be implemented in hardware at some point in the future
At this point it would make more sense to implement Keccak-f in hardware instead.
There is no signature verification to authenticate the crates.io database. An attacker who compromises the crates.io repository/server/GitHub account (as happened to kernel.org) could spread the infection to every Rust developer and application user.
Here are some complementary solutions:
The @bors' public key should be shipped with Cargo, which should be signed with a Rust core developer… Contrary to Rust, the Cargo repository isn't actually signed at all!