btcsuite / btcd

An alternative full node bitcoin implementation written in Go (golang)
https://github.com/btcsuite/btcd/blob/master/README.md
ISC License
6.19k stars 2.34k forks source link

[database] Remove Db.DropAfterBlockBySha and Db.RollbackClose #257

Closed davecgh closed 8 years ago

davecgh commented 9 years ago

Issue by jongillham Friday Oct 24, 2014 at 10:57 GMT Originally opened as https://github.com/btcsuite/btcdb/issues/9


I would like to remove Db.DropAfterBlockBySha and Db.RollbackClose as it prohibits several NoSQL datastores from using btcd due to their lax transaction/atomicity guarantees and no concept of rollback.

I just wanted to start a discussion on how feasible this would be and what the implications are before I dive in and break everything.

davecgh commented 9 years ago

Comment by davecgh Friday Oct 24, 2014 at 17:40 GMT


So, ultimately, I think this is the right idea and it ties in with #6. However, it is going to be a lot of work to get there. The issue is that right now, btcdb guarantees that it always represents the main (best) chain. As a result, much of the code in various places such as btcchain, which uses this guarantee to build its logic on top of it for reorgs, and btcd which frequently queries NewestSha to determine the best chain tip.

Ultimately, and it is discussed in #6, I think it would be better to modify things such that the db is not quite as smart about what it's storing. Then everything should be querying btcchain for the latest best chain.

There are several things that need to happen though for this to become a reality:

1) btcchain needs to be made safe for concurrent access. Currently it's not and everything chain-related is piped through the BlockManager in btcd while everything queries btcdb for the tip (because it is concurrent safe). 2) btcchain can no longer use the fact that the database is always the best chain which means it needs logic to handle this fact. Since that is consensus code, it has to be painstakingly accurate. Any mistake there would mean a fork. As a result, this is touching on what is quite possibly the most important thing to get right in the entirety of the suite and is super non-trivial work. 3) Since the database always currently guarantees it has a view of the best chain, it always means all of the spend information is accurate as of the current view of the best chain. This means when reorgs happen and blocks are dropped, the spentness in the database follows. Removing Db.DropAfterBlockBySha has some extremely significant implications in regards to this 4) There is at least one utility which currently relies on Db.DropAfterBlockBySha that would need to be updated accordingly. This might be challenging since it's entire purpose is to put the database back to a state that it existed at a given block, including the spentness of each transaction This naturally means "unwinding" each block backwards.

davecgh commented 9 years ago

Comment by davecgh Friday Oct 24, 2014 at 17:45 GMT


All of the above said, I think using NoSQL as a datastore (or any datastore with lax atomicity) is an extremeley bad idea for financial based systems that must be accurate. What would happen if you have added a new block and updated the resulting spentness change (utxo updates), but because of lack of real atomicity you end up only actually storing half the spend changes? That would ultimately guarantee you end up forking sometime in the future because your view of the utxo set is inaccurate.

davecgh commented 9 years ago

Comment by jongillham Sunday Oct 26, 2014 at 19:22 GMT


Thanks for laying this out. It feels much more intuitive to me to have the best chain logic in btcchain. Do you think these changes are best done with a slow evolution of the codebase/APIs or by several big commits that change the functionality straight away?

I misspoke when I mentioned lax transaction/atomicity guarantees of NoSQL datastores. I should have written limited transaction/atomicity functionality. For example some datastores only allow locking of a couple of 'rows' at a time - which makes implementing Db.DropAfterBlockBySha no-trivial.

You are absolutely right that financial based systems must be accurate. I remember reading of at least one large exchange that lost a lot of bitcoins by people exploiting their wallet which didn't use transactional support correctly.

davecgh commented 9 years ago

Comment by davecgh Sunday Oct 26, 2014 at 20:21 GMT


It can be somewhat evolutionary, but the second and third items above are very closely related and will likely end up being largish changes.

However, I did try to lay the points out in the order that they can be addressed. The first point can (and should) be done independently. I'd say the first step can be broken down into the following manageable pieces:

a) Make btcchain safe for concurrent access b) Expose a BestTip() function c) Migrate all callers (outside of btcchain and perhaps the initial db load in btcd that reports where the db is) that call btcdb.NewestSha to use the new BestTip function d) Migrate btcd code base to use the now (after step a) concurrent-safe chain by pulling out the additional plumbing that the block manager has for things like CalcNextRequiredDifficulty and CheckConnectBlock and calling them directly on the chain instance

davecgh commented 9 years ago

Comment by davecgh Tuesday Dec 23, 2014 at 03:28 GMT


@jongillham Please see my latest comments in #6 about the currently proposed interface overhaul. While the particulars are slightly different than those discussed here, the end result should be the ability to more easily create a BigTables backend. I'd like your thoughts on the proposed design since you've expressed interest in the ability to write said backend.