zcash / zcash

Zcash - Internet Money
https://z.cash/

UX: enabling lightwalletd index (indices) makes initial sync (IBD) very slow #6904

Open LarryRuane opened 3 months ago

LarryRuane commented 3 months ago

TL;DR

Enabling lightwalletd in the zcashd configuration makes the initial sync impractically slow. Anyone who needs this configuration would probably be better off copying an existing data directory. (I do have such a copy, if anyone needs it and trusts me.)

Also, set dbcache as large as possible when running IBD.

Details:

I lost my data directory and had to start zcashd syncing from scratch (IBD). Progress was extremely slow; at the rate I was seeing, it would have taken weeks. The cause turned out to be that my zcashd configuration specified lightwalletd=1, which enables several indices. I suspect the same performance problem exists with the insightexplorer=1 config option. This issue documents what I saw.

I observed that blocks would be added (UpdateTip messages in the log) fairly quickly for a while, and then the zcashd process would pause, sometimes for minutes, with nothing apparently happening. Then the block download would resume and run normally for a while. These pauses make overall progress extremely slow. I ran zcashd under the debugger (gdb) and interrupted it during one of these pauses; the stack trace showed that one of the lightwalletd indices was being written (or txindex, which I had enabled too). This write (a flush, actually) happens down within LevelDB, so I have no idea why it takes so long. A single write can take minutes. During this time cs_main is held, which isn't a bug, but it makes the node unresponsive to most RPCs; you can also see that the time-since-startup value in the metrics display is frozen.
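For anyone who wants to measure these pauses rather than eyeball them, here is a rough Python sketch that looks for long gaps between consecutive UpdateTip lines in a log. The log format is an assumption on my part (each line starting with an ISO-8601 timestamp with no timezone suffix), and `find_pauses` and the 60-second threshold are made-up names and values for illustration; adjust the parsing to whatever your debug.log actually contains.

```python
from datetime import datetime, timedelta


def find_pauses(log_lines, threshold=timedelta(seconds=60)):
    """Return (previous, current, gap) tuples for consecutive UpdateTip
    lines whose timestamps are more than `threshold` apart."""
    stamps = []
    for line in log_lines:
        if "UpdateTip" not in line:
            continue
        # ASSUMPTION: the first whitespace-separated token is an
        # ISO-8601 timestamp such as 2023-10-01T12:00:00.
        token = line.split(maxsplit=1)[0]
        try:
            stamps.append(datetime.fromisoformat(token))
        except ValueError:
            continue  # skip lines that don't match the assumed format

    pauses = []
    for prev, cur in zip(stamps, stamps[1:]):
        gap = cur - prev
        if gap > threshold:
            pauses.append((prev, cur, gap))
    return pauses


# Hypothetical sample lines, not real zcashd output.
sample = [
    "2023-10-01T12:00:00 UpdateTip: height=100",
    "2023-10-01T12:00:01 UpdateTip: height=101",
    "2023-10-01T12:03:31 UpdateTip: height=102",
    "2023-10-01T12:03:32 some other message",
]
pauses = find_pauses(sample)
for prev, cur, gap in pauses:
    print(f"pause of {gap.total_seconds():.0f}s between {prev} and {cur}")
```

Run against a real log, you would pass the lines of the file instead of `sample`. It only reports where the gaps are; it says nothing about what the node was doing during them, which is what the gdb stack trace was for.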

This is a separate observation, but I also discovered that even without lightwalletd enabled, IBD is vastly faster with a larger-than-default dbcache setting. With the default setting (450; the units are MB), it took 6.5 hours to reach height 283k. With dbcache increased to 8000, it reached that height in 42 minutes. If you're not memory-constrained, it's always better to increase dbcache, with or without lightwalletd enabled.
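To put the two dbcache timings side by side, here is the arithmetic as a small Python sketch. The inputs are just the figures quoted above; "283k" is taken as 283,000 blocks, which is an approximation, so treat the per-minute rates as rough.

```python
# Figures quoted above: time to reach height ~283k during IBD.
default_minutes = 6.5 * 60  # dbcache=450 (default)
large_minutes = 42          # dbcache=8000
height = 283_000            # "283k", approximate

speedup = default_minutes / large_minutes
default_rate = height / default_minutes  # blocks per minute
large_rate = height / large_minutes      # blocks per minute

print(f"speedup: {speedup:.1f}x")
print(f"default dbcache: ~{default_rate:.0f} blocks/min")
print(f"dbcache=8000:    ~{large_rate:.0f} blocks/min")
```

That works out to a bit more than a 9x speedup over this range of heights. It is one data point from one machine, so the ratio will likely differ elsewhere.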

I was curious if enabling indices on bitcoind caused this same degree of slowdown. Adding txindex to bitcoind's configuration increased its reindex time by a factor of 3, so that's pretty significant. (Reindexing is a subset of IBD.) But it's still far less than the slowdown we see in zcashd when enabling lightwalletd. It could be because lightwalletd enables multiple indices, and they are written more often (for each transaction input and output, rather than just once per transaction as with txindex). Or maybe we're not writing to these indices in the most efficient way. Both zcashd and bitcoind are using the same release of LevelDB.
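To illustrate the "once per transaction" versus "once per input and output" guess, here is a toy Python model that just counts index writes for a made-up block. The `Tx` class, the function names, and the sample transactions are all hypothetical; this does not model zcashd's actual index layout, the number of indices lightwalletd enables, or how LevelDB batches writes.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    n_inputs: int
    n_outputs: int


def per_tx_writes(txs):
    """One index entry per transaction (the txindex-style pattern)."""
    return len(txs)


def per_io_writes(txs):
    """One index entry per input and per output (a simplified stand-in
    for an index keyed on individual inputs and outputs)."""
    return sum(tx.n_inputs + tx.n_outputs for tx in txs)


# Made-up block contents, for illustration only.
block = [Tx(1, 2), Tx(3, 2), Tx(2, 5)]

tx_writes = per_tx_writes(block)
io_writes = per_io_writes(block)
print(f"per-transaction writes: {tx_writes}")
print(f"per-input/output writes: {io_writes}")
print(f"ratio: {io_writes / tx_writes:.1f}x")
```

Even in this tiny example the per-input/output pattern does several times as many writes as the per-transaction one, and that is before multiplying by the number of such indices. Whether that accounts for the slowdown actually observed is still an open question.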

I'm opening this as an issue because there may be ways to speed this up, for example (as mentioned) writing to LevelDB more efficiently, or upgrading our LevelDB version (if there have been any performance improvements). It might also be worth verifying that lightwalletd and the light wallets themselves need all of the indices that we're storing; perhaps some can be eliminated.

dismad commented 1 month ago

A suggested dbcache size for particular memory configs would be helpful!