btcsuite / btcd

An alternative full node bitcoin implementation written in Go (golang)
https://github.com/btcsuite/btcd/blob/master/README.md
ISC License

chainhash, wire, btcutil, main: Memory efficient txhash #1978

Closed · kcalvinalvin closed this 6 months ago

kcalvinalvin commented 1 year ago

When profiling IBD (initial block download), I noticed that TxHash() is allocating quite a bit of memory.

[allocation profile screenshot: IMG_0153]

I also noticed this with utreexod in the past and resolved it by creating a new double hash function. I've benchmarked it and made it more efficient in this PR.

Benchstat for BenchmarkTxHash() shows a slight increase in sec/op, which is likely due to the overhead of serializing into a hash.Hash instead of the bytes.Buffer used by the previous TxHash(). However, in real-life cases the new TxHash() will be faster, since we save the unnecessary serialization into a bytes.Buffer: sha256.Sum256() will call Write() into the hash anyway.

The memory allocation savings are ~37% compared to the old TxHash(). This matters because TxHash() gets called very frequently.

goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/wire
cpu: AMD Ryzen 5 3600 6-Core Processor              
          │   old.txt   │              new.txt               │
          │   sec/op    │   sec/op     vs base               │
TxHash-10   1.347µ ± 1%   1.411µ ± 2%  +4.79% (p=0.000 n=10)

          │  old.txt   │              new.txt               │
          │    B/op    │    B/op     vs base                │
TxHash-10   256.0 ± 0%   160.0 ± 0%  -37.50% (p=0.000 n=10)

          │  old.txt   │            new.txt             │
          │ allocs/op  │ allocs/op   vs base            │
TxHash-10   2.000 ± 0%   2.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal
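
To make the comparison concrete, the shape of the change is roughly the following. This is a simplified sketch written against the DoubleHashRaw helper this PR introduces, not the exact diff; txHashOld and txHashNew are illustrative stand-ins for the old and new TxHash():

package txhashsketch

import (
	"bytes"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
	"github.com/btcsuite/btcd/wire"
)

// txHashOld mirrors the previous approach: serialize the transaction into a
// bytes.Buffer sized to its stripped serialization, then double-SHA256 the
// buffered bytes.
func txHashOld(msg *wire.MsgTx) chainhash.Hash {
	buf := bytes.NewBuffer(make([]byte, 0, msg.SerializeSizeStripped()))
	_ = msg.SerializeNoWitness(buf)
	return chainhash.DoubleHashH(buf.Bytes())
}

// txHashNew mirrors the new approach: pass the serialization function to
// DoubleHashRaw so the transaction bytes are written directly into the
// sha256 state, skipping the intermediate bytes.Buffer.
func txHashNew(msg *wire.MsgTx) chainhash.Hash {
	return chainhash.DoubleHashRaw(msg.SerializeNoWitness)
}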

For bigger transactions, the memory savings are even greater. I performed the same test but with multiTx instead of genesisCoinbaseTx:

diff --git a/wire/bench_test.go b/wire/bench_test.go
index b9f18b83..e4cb9c88 100644
--- a/wire/bench_test.go
+++ b/wire/bench_test.go
@@ -598,7 +598,7 @@ func BenchmarkDecodeMerkleBlock(b *testing.B) {
 // transaction.
 func BenchmarkTxHash(b *testing.B) {
        for i := 0; i < b.N; i++ {
-               genesisCoinbaseTx.TxHash()
+               multiTx.TxHash()
        }
 }

Below is the benchstat for that test. There's less of a speed penalty, and ~41% savings in memory allocated.

goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/wire
cpu: AMD Ryzen 5 3600 6-Core Processor              
          │ old-multitx.txt │          new-multitx.txt           │
          │     sec/op      │   sec/op     vs base               │
TxHash-10       1.507µ ± 0%   1.543µ ± 2%  +2.39% (p=0.000 n=10)

          │ old-multitx.txt │          new-multitx.txt           │
          │      B/op       │    B/op     vs base                │
TxHash-10        272.0 ± 0%   160.0 ± 0%  -41.18% (p=0.000 n=10)

          │ old-multitx.txt │        new-multitx.txt         │
          │    allocs/op    │ allocs/op   vs base            │
TxHash-10        2.000 ± 0%   2.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

The benchmarks for BenchmarkDoubleHash* show a significant speedup for DoubleHashRaw(). This is likely because the other variants go through sha256.Sum256(), which has to call digest.Write() internally, while DoubleHashRaw() gets to skip that.

[I] calvin@bitcoin ~/b/g/u/g/s/g/b/b/wire (memory-efficient-txhash) [1]> go test -run=XXX -bench=BenchmarkDoubleHash -benchmem
goos: linux
goarch: amd64
pkg: github.com/btcsuite/btcd/wire
cpu: AMD Ryzen 5 3600 6-Core Processor              
BenchmarkDoubleHashB-10          1581974               758.1 ns/op            32 B/op          1 allocs/op
BenchmarkDoubleHashH-10          1600311               750.5 ns/op             0 B/op          0 allocs/op
BenchmarkDoubleHashRaw-10        3185388               370.1 ns/op            32 B/op          1 allocs/op
PASS
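
For reference, the two existing helpers in that comparison both double-hash an already-serialized byte slice by calling sha256.Sum256() twice. Simplified sketches of their shape (paraphrased, not copied from chainhash; doubleHashB and doubleHashH are stand-in names) also explain the allocation numbers above:

package chainhashsketch

import (
	"crypto/sha256"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
)

// doubleHashB sketches chainhash.DoubleHashB: two rounds of SHA-256 over a
// pre-serialized slice, returned as a new byte slice. Returning the slice
// forces the 32-byte result onto the heap, matching the 32 B/op and
// 1 allocs/op reported for BenchmarkDoubleHashB.
func doubleHashB(b []byte) []byte {
	first := sha256.Sum256(b)
	second := sha256.Sum256(first[:])
	return second[:]
}

// doubleHashH sketches chainhash.DoubleHashH: the same two rounds, but the
// result is returned by value as a Hash, so nothing escapes, matching the
// 0 B/op reported for BenchmarkDoubleHashH.
func doubleHashH(b []byte) chainhash.Hash {
	first := sha256.Sum256(b)
	return chainhash.Hash(sha256.Sum256(first[:]))
}

Both require the caller to hold the full serialization in memory; the raw variant benchmarked above instead streams the serialization into a single digest for the first round, which per the explanation above is where the savings come from.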

I can also post some before and after ibd profiling if requested.

coveralls commented 1 year ago

Pull Request Test Coverage Report for Build 7197319343


Files with Coverage Reduction:
File           New Missed Lines   %
peer/peer.go   1                  74.52%
Totals Coverage Status
Change from base Build 7168733027: 0.0%
Covered Lines: 27977
Relevant Lines: 49897

💛 - Coveralls
kcalvinalvin commented 1 year ago

The last 2 force pushes change the function signature of DoubleHashRaw so that it accepts a closure and returns a hash. This was done because I found that the

buf := make([]byte, 0, 32)

escaped to the heap with how it was implemented in the previous commits. That version still allocated less memory than current master, but it did 2 allocations per hash instead of 1. The new signature lets the compiler keep buf on the stack, so only 1 allocation is done instead of 2.
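
Read together with the description above, one plausible shape for the end result looks like this. It is a sketch of the described fix rather than the exact merged code; doubleHashRaw is an illustrative stand-in:

package chainhashsketch

import (
	"crypto/sha256"
	"io"

	"github.com/btcsuite/btcd/chaincfg/chainhash"
)

// doubleHashRaw illustrates the closure-based signature: the serialize
// callback writes the raw bytes directly into the sha256 state, the first
// round is summed into a small local buffer, and the second round runs over
// that 32-byte intermediate digest.
func doubleHashRaw(serialize func(w io.Writer) error) chainhash.Hash {
	h := sha256.New()

	// Writing into an in-memory sha256 digest never fails, so the error
	// is ignored in this sketch.
	_ = serialize(h)

	// Sum the first round into a small buffer. Per the comment above,
	// the point of the closure-based signature is to let the compiler
	// keep this buffer on the stack instead of heap-allocating it.
	buf := make([]byte, 0, sha256.Size)
	firstHash := h.Sum(buf)

	// Second round over the 32-byte intermediate result.
	return chainhash.Hash(sha256.Sum256(firstHash))
}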

Roasbeef commented 8 months ago

I think this just needs a rebase!

kcalvinalvin commented 8 months ago

Rebased

For IBD and block verification, I think https://github.com/btcsuite/btcd/pull/2023 is going to be better, but that one's still in the works.

kcalvinalvin commented 6 months ago

Removed commits that were changing the go.mod files

Roasbeef commented 6 months ago

Will tag the new version, then make a new PR updating relevant go.mods, before a final PR that swaps this in everywhere.