mit-dci / utreexo

accumulator for bitcoin utxo set
MIT License

Safer merkle hashing: commit to row number in hash #183

Open adiabat opened 4 years ago

adiabat commented 4 years ago

Right now the parent hashing function is just sha256(left, right) (where the comma means the two hashes are concatenated).

Many attacks can be prevented by committing to more data in the hash. We can't commit to the leaf position, since that changes over time, but we can commit to the height (row number) at which the hash is being performed. It shouldn't be a huge code change to make the parent hash function

sha256(left, right, rownum)

This only adds an extra byte, and that byte is known & easily accessible. The attack I know of (CVE-2017-12842) may not apply here, but there could certainly be other attacks like it, and committing to the row doesn't hurt anything, so it's safer to switch.

kcalvinalvin commented 4 years ago

I did some benchmarking and it seems like the performance is the same.

The function I implemented was:

// parentHashWithRow gives you the merkle parent of l and r, committing
// to the row (height) at which the hash is performed via the trailing byte.
// An empty child hash means something went wrong upstream, so panic.
func parentHashWithRow(l, r Hash, nRow uint8) Hash {
        var empty Hash
        if l == empty || r == empty {
                panic("parentHashWithRow: got an empty child hash")
        }
        toHash := make([]byte, 65)
        copy(toHash[:32], l[:])
        copy(toHash[32:64], r[:])
        toHash[64] = nRow // nRow is already a byte, so no conversion needed
        return sha256.Sum256(toHash)
}

Benchmark was:

[N] calvin@bitcoin ~/b/u/g/s/g/m/u/accumulator> go test -run=XXX -bench=BenchmarkParentWithRowHash
goos: linux
goarch: amd64
pkg: github.com/mit-dci/utreexo/accumulator
BenchmarkParentWithRowHash_Thou-4       1000000000               0.000632 ns/op
BenchmarkParentWithRowHash_HunThou-4    1000000000               0.0522 ns/op
BenchmarkParentWithRowHash_Mil-4        1000000000               0.539 ns/op
PASS
ok      github.com/mit-dci/utreexo/accumulator  14.561s
[I] calvin@bitcoin ~/b/u/g/s/g/m/u/accumulator> go test -run=XXX -bench=BenchmarkParentHash
goos: linux
goarch: amd64
pkg: github.com/mit-dci/utreexo/accumulator
BenchmarkParentHash_Thou-4              1000000000               0.000551 ns/op
BenchmarkParentHash_HunThou-4           1000000000               0.0572 ns/op
BenchmarkParentHash_Mil-4               1000000000               0.561 ns/op
PASS
ok      github.com/mit-dci/utreexo/accumulator  17.394s

I think the new function may even be slightly faster since the slice is pre-allocated. Either way, I don't see any reason not to add this.

adiabat commented 4 years ago

Yeah, I'm not 100% sure on this, but I think with sha256, hashing 64 or 65 bytes costs the same because of padding. Even though sha256 "eats" 64 bytes at a time, the padding has to append at least 9 bytes (a 0x80 marker plus the 8-byte total length), so a 64-byte input gets padded out to 128 bytes and the compression function has to run twice anyway -- same as for 65 bytes.

Which of course leads to the idea -- hey, we don't need all 32 bytes, right? Slightly shorter hashes are basically just as good, and if the whole preimage fit in a single 64-byte block the hash might go 2X faster. (Though with the 9-byte minimum padding, only inputs of 55 bytes or less fit in one block, so two 31-byte hashes plus a row byte -- 63 bytes -- would still take two compression runs; the hashes would have to shrink to about 27 bytes each to get the speedup.)
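The padding arithmetic in the last two comments is easy to check. A small sketch, assuming only SHA-256's standard padding rule (the helper name is made up):

```go
package main

import "fmt"

// sha256Blocks returns the number of 64-byte compression rounds needed
// for an n-byte message: SHA-256 appends a 0x80 marker plus an 8-byte
// length (at least 9 bytes of padding), then zero-fills up to a
// multiple of 64 bytes.
func sha256Blocks(n int) int {
	return (n + 1 + 8 + 63) / 64
}

func main() {
	fmt.Println(sha256Blocks(64)) // 2: two 32-byte hashes
	fmt.Println(sha256Blocks(65)) // 2: plus a row byte -- same cost
	fmt.Println(sha256Blocks(63)) // 2: two 31-byte hashes + row byte, still two rounds
	fmt.Println(sha256Blocks(55)) // 1: only inputs of <= 55 bytes fit in one round
}
```

So adding the row byte is free, and the one-compression-round speedup would only kick in once the whole preimage drops to 55 bytes or less.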