Closed mike76-dev closed 1 year ago
Ambitious! I would definitely like to see some hard numbers here. If this speeds up IBD by like, 10%, then it's a no-brainer. If it's more like 3-5%, I'd be on the fence. Below that, not worth it. Ideally you would test this by syncing all the way from genesis to tip (and ideally, multiple times!), but if that's too onerous perhaps you could start with just the first 50,000 blocks. You'll also want to sync from another locally-running node, to control for peer quality.
Unfortunately, I don't think the block ID cache is viable. Using the nonce as a key is interesting, but with only 64 bits of entropy, even accidental collisions are too likely, never mind intentional ones -- and consensus code absolutely cannot ever be confused about which block has which ID.
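The failure mode can be sketched as follows. This is not the code from this PR: the `header` type and `nonceCache` are hypothetical simplifications, and SHA-256 stands in for the real hash function only to keep the example within the standard library.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

type blockID [32]byte

// header is a hypothetical, simplified block header.
type header struct {
	ParentID  blockID
	Nonce     uint64
	Timestamp uint64
}

// id hashes every header field. SHA-256 is an assumption made for this
// sketch, not the hash used by the actual consensus code.
func (h header) id() blockID {
	var buf [48]byte
	copy(buf[:32], h.ParentID[:])
	binary.LittleEndian.PutUint64(buf[32:40], h.Nonce)
	binary.LittleEndian.PutUint64(buf[40:48], h.Timestamp)
	return sha256.Sum256(buf[:])
}

// nonceCache is a hypothetical block ID cache keyed only by the nonce.
type nonceCache map[uint64]blockID

// idOf returns the cached ID for the header's nonce, computing and
// storing it on a miss. It never checks the other header fields.
func (c nonceCache) idOf(h header) blockID {
	if id, ok := c[h.Nonce]; ok {
		return id
	}
	id := h.id()
	c[h.Nonce] = id
	return id
}

func main() {
	cache := nonceCache{}
	a := header{Nonce: 42, Timestamp: 1}
	b := header{Nonce: 42, Timestamp: 2} // different block, same nonce

	cache.idOf(a) // populates the cache entry for nonce 42
	fmt.Println("cache returns the right ID for b:", cache.idOf(b) == b.id())
}
```

Because the lookup ignores everything except the nonce, the second block is handed the first block's ID. An attacker does not need luck to trigger this, since the nonce is a field the block's creator chooses.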
Well, having tried numerous cache configurations and run multiple tests, I can only say that the gain in syncing time is quite marginal and may well be within experimental error. First of all, the caches only seem to be effective on slow hardware. On an SSD and a decent CPU, caches can even slow syncing down. It looks like reads can be accelerated by the cache, but then writes become the bottleneck, and while that could theoretically be solved, I've already spent too much time on this and can't afford to spend any more. So, I'll close this PR out for now. Regarding the block ID cache, I also had some doubts about it but wanted to hear a professional opinion first.
Ah, that's disappointing. Thanks for taking the time to explore this, though! I would definitely like to do some aggressive benchmarking+optimization of the core syncing process, so this result is helpful.
I implemented a few caches in the consensus set, which should speed up the initial syncing. These caches include: