nimiq / core-rs-albatross

Rust implementation of the Albatross protocol
https://nimiq.com
Other
160 stars 61 forks source link

History store performance improvements #2692

Closed paberr closed 2 months ago

paberr commented 3 months ago

What's in this pull request?

Optionally:

⚠️ This PR changes the database layout. Restarting your client with an existing database will create issues!

Pull request checklist

paberr commented 3 months ago

Statistics

Here are some statistics for the improvements of this PR (gathered on a Macbook Pro with M1 CPU). All statistics are for 720 batches, 700 tpb, rounds: 360, total_txns: 30240000

Pushing one epoch

Commit With Indexing (time) No Indexing (time) With Indexing (space) No Indexing (space)
Prior to PR 1259.94s 1083.88s 13017.30Mb 9677.65Mb
Improved DB layout but no DUP keys (commit 1ab89269907ca9348ed367b4014a2281acc3d511) 162.72s 116.79s 8902.33Mb 6471.89Mb
DUP keys (commit ac390f04d02866963048fd31c1fa8230b5d25ed9) 198.85s 146.07s 7714.10Mb 5283.63Mb

Push and commit time per two batches

When looking at the times per "round", we can see that the old performance deteriorates quickly. Performance per two batches

Pushing multiple epochs

Improved DB layout but no DUP keys (commit 1ab89269907ca9348ed367b4014a2281acc3d511)

Epoch No Indexing (time) No Indexing (space)
1 116.79s 6471.89Mb
2 131.24s 12597.94Mb
3 142.02s 18724.01Mb

DUP keys (commit ac390f04d02866963048fd31c1fa8230b5d25ed9)

Epoch No Indexing (time) No Indexing (space)
1 146.07s 5283.63Mb
2 158.99s 10225.80Mb
3 166.11s 15167.93Mb

Pruning one epoch

Commit No Indexing
Prior to PR > 2h*
Improved DB layout but no DUP keys (commit 1ab89269907ca9348ed367b4014a2281acc3d511) 27.50s
DUP keys (commit ac390f04d02866963048fd31c1fa8230b5d25ed9) 0.28s

*aborted after passing time of push/commit DUP keys already get us a >25,000x improvement

Rationale for choosing DUP

While non-DUP has faster insertion times (roughly ~30s faster than DUP), the pruning is roughly ~30s slower. Since insertion times are usually distributed over an epoch (12h) except for the sync, we opted for the DUP approach, which gives us fast pruning times after an epoch. The sync time still improved by an order of magnitude and the DUP approach also results in smaller database sizes.

paberr commented 3 months ago

Here's a graph of 36 epochs with indexing on a different machine. The final database size for this is 270851.91Mb. Time per Epoch We can see that the insertion/commit time stays almost constant.

sisou commented 3 months ago

Looks like 370-380 seconds per epoch, which is 6.5 minutes. One epoch being 12 hours = 720 minutes, this is a factor of 110. Sounds fine. Plus, this is for full epochs, which we'll likely won't have, so real write will be much faster (per epoch).

Whats a realistic TPB? And how fast would it be then?

viquezclaudio commented 3 months ago

Also, I think it is worth favoring DUP keys to get fast prune times, which are necessary for full nodes

viquezclaudio commented 3 months ago

I think we should change the name of this PR =)