Closed paberr closed 2 months ago
Here are some statistics for the improvements of this PR (gathered on a Macbook Pro with M1 CPU). All statistics are for 720 batches, 700 tpb, rounds: 360, total_txns: 30240000
Commit | With Indexing (time) | No Indexing (time) | With Indexing (space) | No Indexing (space) |
---|---|---|---|---|
Prior to PR | 1259.94s | 1083.88s | 13017.30Mb | 9677.65Mb |
Improved DB layout but no DUP keys (commit 1ab89269907ca9348ed367b4014a2281acc3d511) | 162.72s | 116.79s | 8902.33Mb | 6471.89Mb |
DUP keys (commit ac390f04d02866963048fd31c1fa8230b5d25ed9) | 198.85s | 146.07s | 7714.10Mb | 5283.63Mb |
When looking at the times per "round", we can see that the old performance deteriorates quickly.
Epoch | No Indexing (time) | No Indexing (space) |
---|---|---|
1 | 116.79s | 6471.89Mb |
2 | 131.24s | 12597.94Mb |
3 | 142.02s | 18724.01Mb |
Epoch | No Indexing (time) | No Indexing (space) |
---|---|---|
1 | 146.07s | 5283.63Mb |
2 | 158.99s | 10225.80Mb |
3 | 166.11s | 15167.93Mb |
Commit | No Indexing |
---|---|
Prior to PR | > 2h* |
Improved DB layout but no DUP keys (commit 1ab89269907ca9348ed367b4014a2281acc3d511) | 27.50s |
DUP keys (commit ac390f04d02866963048fd31c1fa8230b5d25ed9) | 0.28s |
*aborted after passing time of push/commit DUP keys already get us a >25,000x improvement
While non-DUP has faster insertion times (roughly ~30s faster than DUP), the pruning is roughly ~30s slower. Since insertion times are usually distributed over an epoch (12h) except for the sync, we opted for the DUP approach, which gives us fast pruning times after an epoch. The sync time still improved by an order of magnitude and the DUP approach also results in smaller database sizes.
Here's a graph of 36 epochs with indexing on a different machine. The final database size for this is 270851.91Mb. We can see that the insertion/commit time stays almost constant.
Looks like 370-380 seconds per epoch, which is 6.5 minutes. One epoch being 12 hours = 720 minutes, this is a factor of 110. Sounds fine. Plus, this is for full epochs, which we'll likely won't have, so real write will be much faster (per epoch).
Whats a realistic TPB? And how fast would it be then?
Also, I think it is worth favoring DUP keys to get fast prune times, which are necessary for full nodes
I think we should change the name of this PR =)
What's in this pull request?
Vec
return values byOption
since transaction hashes are assumed to be uniqueValidityStore
can always enforce the full validity window (fixes #2688)Blockchain
configurableOptionally:
LightHistoryStore
if no longer neededPull request checklist
clippy
andrustfmt
warnings.