lbryio / lbrycrd

The blockchain that provides the digital content namespace for the LBRY protocol
https://lbry.com
MIT License
2.58k stars 178 forks source link

customize the wallet flush #358

Closed BrannonKing closed 4 years ago

BrannonKing commented 4 years ago

Add parameters to change the wallet flush rate. Separate out the disk sync from the lsn_reset.

Questions:

  1. Is there a newer version of BerkDB that doesn't rewrite the entire DB file when lsn_reset is called?
  2. Or can we put it in a "don't need lsn_reset" mode?
  3. Should we call the vacuum command at some point to keep the file small?
  4. We could drop BerkDB and ship a copy of db_dump to migrate the old data into an Sqlite DB. Is there some kind of obfuscation out of the box?
BrannonKing commented 4 years ago

BerkDB does obfuscate the file out of the box, but lbrycrd/bitcoin runs lsn_reset every time it flushes, which effectively resets/disables the machine-specific tie.

tiger5226 commented 4 years ago

@BrannonKing I think this is the real solution https://stackoverflow.com/a/23475187/2038176

DbEnv::lsn_reset() is probably not what you want. That function rewrites every single page in the database, so that you can close the databases out and open them in a different environment. It's going to write out at least 2.1 GiB, and pretty slowly.

If you're just shutting the application down to be started back up sometime later, you may simply just want to do a DbEnv::txn_checkpoint() to flush the database log and insert a checkpoint record. Though, this isn't required either. As long as you have the logs committed to stable storage, you can simply exit your application.

I think that call is just really overzealous. As long as we have the log files we don't need to rewrite the whole database with an lsn_reset.

Berkeley DB has consistency checks that may be triggered if an application calls DB_ENV->lsn_reset() on a database in a new environment when the database LSNs still reflect the old environment.

So the moral to the story is don't allow lsn_reset when LSNs don't match. lsn_reset should probably only be called when the dump wallet command is called, so that it can be moved.

tiger5226 commented 4 years ago

Also I would probably never switch from BDB to SQLite. SQLite is not really meant for large amounts of data whereas BDB is. https://www.quora.com/What-is-the-difference-between-SQLite-and-Berkeley-DB

BrannonKing commented 4 years ago

As I understand it, if we remove lsn_reset from the flush and you have a hardware failure then that wallet will be lost forever. Is that acceptable? Can we set up some other kind of wallet export & backup that runs once an hour?

tiger5226 commented 4 years ago

If you review the documentation that is not what it means at all. Thats what someone said on the bitcoin forums. The lsn stands for log sequence numbers, and it uses that to know what additions there are for the for the file, to exactly avoid our situation. People use BDB to store terabytes of data, if they called lsn_reset it would take forever. You call lsn_reset when you want to move the file and not take the log files with it. We can also add some preventative code to avoid making changes with the LSNs dont match. The important thing to do is to make sure there are no uncommitted transactions, but that takes microseconds.

Feel free to do your own research on it but from mine, I don't think that statement is true.

kauffj commented 4 years ago

@BrannonKing, @tiger5226 seems relatively confident this lsn_reset change would solve this issue and not affect existing behavior. What is the work load to either make this change or ship a version with a flag so it can be tested?

If we're uncertain about safety, it seems fine to run a test build in production on internal APIs before properly releasing.

bvbfan commented 4 years ago

Possible solution in #360 I research a bit more about lsn_reset, it actually randomise the position of private key bdb might keep bits of the unencrypted private key in slack space in the database file as we can see in wallet.cpp. In other hand txn_checkpoint is need to be called at periodic time to ensure all changes since last checkpoint are flushed, in case of crash unflushed data will be lost, that's pretty unwanted behaviour at wallet side and it can cause transaction loss. In this case -flushwallet looks pretty dangerous parameter since it'll skip txn_checkpoint call.

BrannonKing commented 4 years ago

Fixed in 17.4.1