cosmos / iavl

Merkleized IAVL+ Tree implementation in Go
Apache License 2.0

Switch the internal hash function to SHA2 from RIPEMD160 #38

Closed: zmanian closed this issue 6 years ago

zmanian commented 6 years ago

I'm expecting that the auditors will eventually make the same recommendation.

But I don't think we should go to 1.0 with RIPEMD160 as our hash function.

RIPEMD160 is a bad choice for a few reasons.

  1. The birthday bound on RIPEMD160 is 2^80 for a second preimage. A second preimage would at the very least allow for state corruption in our database and possibly allow for forgeries.

  2. RIPEMD160 is roughly the same speed/throughput as SHA256 on current hardware. Future generations of Intel and ARM server hardware will have native SHA256 instructions that allow massive speedups. Here is how fast SHA256 is on Ryzen: https://crypto.stackexchange.com/questions/50618/how-fast-can-a-sha-256-implementation-go/50620

  3. RIPEMD160 is 10x more expensive to verify on the EVM than sha256. https://github.com/ethereum/go-ethereum/blob/master/params/protocol_params.go#L69-L72

Bikeshed

It's reasonable to ask whether we should adopt a modern hash function. The consensus among hash function experts is that SHA2 will never be broken and will likely be the most widely supported hash function for the rest of human existence.

BLAKE2b is almost 5x faster in software than RIPEMD160 but doesn't currently have ethereum ecosystem compatibility and is unlikely ever to be implemented in widely available hardware. Though hardware sha256 is only about 2x faster than blake2b.

Shake128 is kinda slow. Farfalle is very fast but too new.

Ethereum KECCAK is indefensible.

odeke-em commented 6 years ago

I'll page @veorq too for a security blessing and thoughts.

liamsi commented 6 years ago

I've added #49 which simply changes the occurrences of ripemd160 to sha256.

I was wondering whether we should either 1) have a central place where the hasher is defined (in case we want to experiment with other hashers, we can simply modify one line), or 2) allow the user to choose or even configure which hasher to use (which would make iavl more general purpose but potentially more difficult to use correctly). edit: related #11
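For option 1, a minimal sketch of what a central hasher definition could look like, assuming Go's standard crypto/sha256 (the names here are illustrative, not the current iavl API):

```go
package iavl

import (
	"crypto/sha256"
	"hash"
)

// newHasher is the single place the tree's hash function is defined;
// experimenting with another algorithm means changing only this line.
var newHasher func() hash.Hash = sha256.New

// hashBytes hashes data with the configured hasher.
func hashBytes(data []byte) []byte {
	h := newHasher()
	h.Write(data)
	return h.Sum(nil)
}
```

Exposing option 2 to users would then only need a way to override this one definition.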

ebuchman commented 6 years ago

Do we need to worry about length-extension attacks with SHA256 here ?

ebuchman commented 6 years ago

Also we discussed truncating the SHA256 hash to 20 bytes

veorq commented 6 years ago

Truncation to 20B would eliminate any length-extension risk. Or you can just use SHA-512/256, i.e. the NIST-official SHA-512 truncated to 256 bits. But SHA-256 would be slightly faster, in case performance matters.
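A quick sketch of the two options mentioned here, using only the Go standard library (illustrative, not the tree's actual hashing code):

```go
package main

import (
	"crypto/sha256"
	"crypto/sha512"
	"fmt"
)

func main() {
	data := []byte("some node bytes")

	// Option A: SHA-256 truncated to 20 bytes (160 bits).
	full := sha256.Sum256(data)
	short := full[:20]

	// Option B: SHA-512/256, which truncates SHA-512 to 256 bits by
	// design and is therefore not length-extendable.
	alt := sha512.Sum512_256(data)

	fmt.Printf("sha256 truncated: %x\n", short)
	fmt.Printf("sha512/256:       %x\n", alt)
}
```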

zmanian commented 6 years ago

Length extension is irrelevant. Length extension attacks operate in a context where the attacker does not know a secret input to the hash function. Here all the inputs are public.

Merkle trees in this context don't require any secret inputs.

Do we think it is safe to truncate our hashes to 160 bits?

it seems safe to me if we really want to optimize for proof size.

Even if finding a second preimage only costs 2^80 work, the second preimage isn't going to be semantically useful.

mappum commented 6 years ago

Or you can just use SHA-512/256, i.e. the NIST-official SHA-512 truncated to 256 bits. But SHA-256 would be slightly faster, in case performance matters.

Wouldn't SHA-512/256 be faster on 64-bit hardware? (Which I assume most nodes are)

zmanian commented 6 years ago

@mappum SHA-512/256 is faster on 64-bit hardware, but Intel SHA extension instructions are becoming more and more widely available, which makes SHA256 faster than blake2b, and those extensions don't support SHA512.

Generally sha256 hardware is becoming widely available and sha512 is not.

This is largely because most cryptographers believe that sha256 will never be broken.

ebuchman commented 6 years ago

Still unsure if we should use the 256-bit output or truncate to 20 bytes.

Maybe we can benchmark it to see if the difference is worth it, but this might require us to change up the tree structure a bit so the hash function is configurable.

Also longer hashes will mean longer proofs ...
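A sketch of the kind of micro-benchmark suggested here, comparing the cost of hashing an inner node built from 32-byte versus 20-byte child hashes. The node layout (one prefix byte plus two child hashes) and the names are assumptions for illustration, not iavl's actual encoding:

```go
package iavl

import (
	"crypto/sha256"
	"testing"
)

// benchInner hashes a made-up inner node: a one-byte prefix plus two
// child hashes of the given size.
func benchInner(b *testing.B, childSize int) {
	node := make([]byte, 1+2*childSize)
	b.SetBytes(int64(len(node)))
	for i := 0; i < b.N; i++ {
		sha256.Sum256(node)
	}
}

func BenchmarkInnerNode32ByteChildren(b *testing.B) { benchInner(b, 32) }
func BenchmarkInnerNode20ByteChildren(b *testing.B) { benchInner(b, 20) }
```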

ValarDragon commented 6 years ago

I think we should go with Blake2b, since blake2b is 5x faster than Sha2 in software. (Source: https://blake2.net/blake2.pdf) The Sha256 hardware isn't very widespread right now to my knowledge. (I am only aware of Intel creating them) I don't think we should plan around hardware implementations until they are widespread across multiple hardware vendors, to avoid centralization in any one type of hardware. (To avoid another situation like meltdown).

Regarding blake2b being 2x slower than a Sha2 hardware implementation, does this account for the parallelization used in the blake2bp variant? If we used blake2bp, that would give another 3-4x speed increase, I think? Blake2bp just assumes a multicore cpu, or SIMD. This seems much more reasonable to me to optimize for (since basically all computers have multiple cores, and I foresee that this trend will continue since we've hit the wall on the number of transistors we can fit per unit area). This should also make the speed rival or even beat SHA2's hardware implementation.
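For reference, if the hasher were centralized as sketched earlier, trying BLAKE2b would be a small change; a minimal example using golang.org/x/crypto/blake2b (this uses plain BLAKE2b, not the parallel blake2bp variant discussed above):

```go
package main

import (
	"fmt"

	"golang.org/x/crypto/blake2b"
)

func main() {
	// One-shot BLAKE2b-256 digest; same 32-byte output size as SHA-256,
	// but with no EVM precompile support.
	digest := blake2b.Sum256([]byte("some node bytes"))
	fmt.Printf("blake2b-256: %x\n", digest)
}
```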

Also can we prioritize moving away from RipeMD to either Sha2 or Blake2? On the profiler for tmbench, 12% of the time was spent on RipeMD hashes. RipeMD definitely shouldn't be here at launch.

zmanian commented 6 years ago

So while I love modern hash functions like blake2 and kangaroo12, they still don't match the ecosystem support for sha2.

Sha2 is fast, secure, widely supported in hardware, supported in environments like EVM etc.

At the union of all concerns, sha2 is difficult to beat. If sha2 hardware wasn't available, faster hash functions would be more interesting. Given imminent hardware, it's hard to justify the tradeoffs of a more modern hash function.

ValarDragon commented 6 years ago

You're right. In the future, if there was greater kangaroo12 support, would we consider switching?

Also I was doing some benchmarks in software, and I found that Sha512 was faster than Sha256 for sufficiently large inputs, while Sha256 was faster at small inputs.

openssl speed sha256 sha512
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha256           83710.65k   189726.85k   352530.26k   443103.57k   482623.49k
sha512           58343.51k   233275.43k   397977.94k   579351.55k   685501.10k

Since Bucky's comment was that we should truncate the hashes to 20 bytes, I ran the hash functions in golang to test this. I found that Sha256 was faster at 20-byte input sizes (by about 25%). I couldn't figure out how to get openSSL's speed tests to use 20-byte sizes, so I couldn't run the openSSL benchmarks at 20 bytes.
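A sketch of the kind of Go comparison described here, benchmarking SHA-256 against SHA-512/256 on 20-byte inputs with the standard testing package (results will vary by machine; this is not the original benchmark code):

```go
package hashbench

import (
	"crypto/sha256"
	"crypto/sha512"
	"testing"
)

// input models the 20-byte (truncated child hash) case discussed above.
var input = make([]byte, 20)

func BenchmarkSHA256_20B(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sha256.Sum256(input)
	}
}

func BenchmarkSHA512_256_20B(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sha512.Sum512_256(input)
	}
}
```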

So could we use Sha512/256 on the leaves of the IAVL tree (since I anticipate them to be larger than 64 bytes, where openssl shows that Sha512 is faster) and use Sha256 on all the inner nodes of the IAVL tree, since Sha256 is faster at 20-byte inputs? I understand if the answer is no due to complexity, but this should minimize computation time on 64-bit systems with no hardware implementation.

Note I don't know how these speed relations will translate to hardware. These speed differences are due to sha512 using 64-bit words and sha256 using 32-bit words, and I'm running this on a 64-bit machine. I'm not sure how the speed difference will be for hardware implementations. I haven't found specs on sha2 hardware implementation speeds.

The above is an unhelpful point since it appears that there will only be sha256 hardware implementations.
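For illustration, a rough sketch of the leaf/inner split proposed above, using hypothetical helpers rather than iavl's actual node hashing:

```go
package iavl

import (
	"crypto/sha256"
	"crypto/sha512"
)

// hashLeaf uses SHA-512/256 for leaf data, which is expected to be
// larger than 64 bytes, where SHA-512-based hashing wins on 64-bit CPUs.
func hashLeaf(keyValue []byte) []byte {
	sum := sha512.Sum512_256(keyValue)
	return sum[:]
}

// hashInner uses SHA-256 for inner nodes, whose inputs are short
// concatenations of child hashes, where SHA-256 benchmarks faster.
func hashInner(left, right []byte) []byte {
	sum := sha256.Sum256(append(append([]byte{}, left...), right...))
	return sum[:]
}
```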

Chjango commented 6 years ago

Andrew Poelstra seems to have a differing view about truncated SHA256. He says, I quote:

The Bitcoin mining network churns through 80 bits every couple weeks (16 days by my estimation). And that's 2^80 double-SHA2 hashes which I expect take more than 3 times the time/power vs blake2. It's expensive, sure, but it's far from impossible and caps our security. There was a mailing list thread a couple years ago about Bitcoin getting burned by using 20-byte hashes. And they have only a theft risk, not a consensus failure risk.

mdyring commented 6 years ago

My $0.02 is that, even though Blake2b is nice, SHA256 enjoys wide support in both software and hardware and seems a sensible choice.

Regarding the truncation, @zmanian hits the nail on the head:

it seems safe to me if we really want to optimize for proof size.

I think for light client proofs, which mainly incur bandwidth costs as they are likely not persisted on disk, size should not be a concern. And in this case I would argue to keep it simple and avoid surprises, so SHA256 without truncation.

It might matter a bit more for IBC, but I don't recall if proofs will actually need to be persisted there? But even if that is the case, security should come before attempts to reduce storage costs, which are decreasing over time anyway.

To play the devil's advocate, it might be healthy to turn the discussion around: assuming an existing standard SHA256 implementation, what sensible arguments are there for introducing truncation? And if so, why 20 bytes specifically?

zmanian commented 6 years ago

IBC transactions are only persisted in the state for a few blocks, and reducing proof size is a significant motivation here.

20 bytes (160 bits) is generally considered the lower limit for collision resistance.

Chjango commented 6 years ago

So what is really alarming to me about using a truncated 20-byte SHA256 hash throughout our entire codebase along with the IAVL tree is that finding a collision is a lot easier once there is a pre-existing set of 500m-1b hashes available. If we assume 2^29 leaves are present (roughly 500m), we've essentially reduced our security against birthday attacks by 30 bits (counting the internal node hashes as well puts it at 1b total hashes to find a collision with). At that point we might as well be using a 128-bit hash.

Jihan alone can find an 80-bit collision today. How will this thing be secure 5 years from now? I do not agree that optimizing for proof size is worth the risk of a systemic consensus failure of our protocol.

CC @ebuchman @Liamsi @zmanian @jaekwon @ValarDragon

zmanian commented 6 years ago

Because we prefix the inner nodes of the tree, the inner nodes don't count.

ValarDragon commented 6 years ago

finding a collision is a lot easier once there is a pre-existing set of 500m-1b hashes available.

How does collision resistance decrease multiplicatively once there are already some hashes computed? The birthday bound is based on the number of hashes computed. Computing hashes only decreases the number of hashes that remain to be computed additively.

If we assume 2^29 leaves are present (roughly 500m), we've essentially reduced our security to birthday attacks by 30 bits (counting the internal node hashes as well puts it at 1b total hashes to find a collision with).

This isn't true. The birthday bound says that the probability of getting a collision once you've computed 2^29 hashes is (2^29)*(2^29 - 1)*(2^{-1})*(2^{-256}). You still have to compute ~2^128 - 2^29 hashes to get to the 50% probability. (The number is a bit off by a factor of sqrt(3)/2, iirc.) The 2^29 you've already computed don't make a significant difference. If the multiplicative claim were true, then since every computation of 2^29 hashes would reduce security by 30 bits, you would only have to compute 9*(2^29) ≈ 2^32 hashes to break full sha256.
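For reference, the approximation being used here, assuming an ideal $b$-bit hash and $n$ hash evaluations:

$$p_{\text{collision}} \approx \frac{n(n-1)}{2} \cdot 2^{-b}$$

With $n = 2^{29}$ and $b = 256$ this is roughly $2^{57} \cdot 2^{-256} = 2^{-199}$, so about $2^{128}$ evaluations are still needed to reach $p \approx \tfrac{1}{2}$; for a 160-bit truncation the corresponding figure is about $2^{80}$. The precomputed $2^{29}$ digests shift that work only additively.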

That being said, I generally agree with increasing the 20-byte mark. I've made an issue in tendermint, https://github.com/tendermint/tendermint/issues/1990. We should do a cost analysis to determine what the best size will be. There's a whole spectrum between 20 bytes and 32 bytes.

liamsi commented 6 years ago

I'm still not convinced that this poses an immediate risk. That said, I would always favour security (especially in this context) over saving a few bytes. So, I tend to agree with @Chjango here.

Let's see what @ValarDragon's analysis in tendermint/tendermint#1990 will reveal and take it from there.