[database] Support multiple (non-best) chains, concurrent reads on different forks.

Issue by jrick Thursday Jul 17, 2014 at 04:34 GMT Originally opened as https://github.com/btcsuite/btcdb/issues/6

The database currently only saves the best chain. If a fork extends beyond the current best chain, the reorged blocks will be removed from the database. This makes the database unsafe for multiple callers who concurrently expect their given block range to be valid.

For example, when btcd must rescan some range of blocks, and the tip of the range is removed from the best chain due to a new best block, db lookups will begin failing for missing the block for some hash, or worse (if only heights are used), lookups will silently continue with no indication that previously processed blocks may not be on the same fork.

A better solution would be to use an iterator which allows walking through some range of blocks in a fork. Even if the best chain is switched, these blocks can still be iterated over. After the iteration is finished, the caller can finish (it returned results about the chain at the time the request was made) or a switch back to the main chain and/or best block can be made if more blocks were attached.

Comment by jongillham Friday Sep 19, 2014 at 11:28 GMT

I don't fully understand the issue here but would it mean that Db.DropAfterBlockBySha(*btcwire.ShaHash) (err error) would no longer be needed? If so, it would have massive positive implications for a whole host of NoSQL key/value stores that are currently unable to meet the transactional requirements of DropAfterBlockBySha.

Comment by davecgh Friday Sep 19, 2014 at 13:56 GMT

In short, the ask here is for the database to support storage of orphan blocks/side chains. Currently, it only ever stores blocks which are on the main chain. Changing this will require a pretty significant amount of changes in other packages because they all work off the documented assumption that asking the database for blocks and transactions are only available if they have already been fully validated and are in the current best (main) chain.

The description here is extremely misleading because it's referring to btcd when it should be referring to RPC. The database has a lock which prevent reorgs out from under callers in btcd itself, so the assertion that "this makes the database unsafe for multiple callers who concurrently expect their given block range to be valid" is not accurate for btcd. What this is referring to is RPC, which is always a snapshot of the current state. In that case, the statement is correct.

That said, we're ultimately going to want to allow storage of the side chain blocks as it will allow a few more advanced techniques to be used for optimizations and allow RPC access to information about reorged chains which is really what this issue is asking for.

Comment by jongillham Friday Sep 19, 2014 at 14:29 GMT

Ok thanks for the explanation I understand now.

I completely agree that storing side chains is a good aim. Being able to detect and monitor side chains as they develop on the Bitcoin network also has potential security benefits for clients of btcd who want to limit their exposure to chain reorgs.

A consequence of storing side chains might make it possible for the btcdb.Db interface to be completely non-destructive to blocks and transactions. It would therefore make it much easier/possible to implement btcdb.Db drivers for the NoSQL databases that have limited transaction support.

Comment by davecgh Tuesday Dec 23, 2014 at 03:21 GMT

As an update to this, the current plan is to start tackling this within the next few weeks. We have been discussing several of the design implications and the current plan is as follows:

The btcdb interface is going to change quite significantly. The current interface and approach has a few undesirable properties such as the following:

The problem this issue discusses
The current interface is based on a fairly specific interface for block storage with an assumed high level of intelligence built-in to each backend driver.
This leads to all back-end implementations (ldb and memdb, currently) needing to repeat a lot of logic that is non-trivial to get right and really should only be done once at a higher layer
- As a concrete example, adding new functionality, such as the optional address index work being done by @Roasbeef, currently has to be implemented in each backend driver.

To resolve the above issues (and others not covered), the current plan is to redesign the interface so it provides better scalability, more easily facilitates new backends such as BigTables for GAE, and opens the door for supporting more advanced features such as block pruning. To accomplish this, the idea is to have the new interface provide two major categories as follows:

Atomic operations for metadata such as the the block and transaction location metadata, the utxo set, and an optional address index.
- The interface for this functionality is currently planned to be very similar to the recently added walletdb interface.
Raw block storage
- Since the metadata and block data will ultimately not necessarily live in the same place, the interface for raw block storage needs to be separate from the metadata storage mechanism in the previous point
- This will NOT contain additional logic to ensure the block connects to an existing one or track spends like the current interface

Finally, these changes mean that most of the current Bitcoin-specific intelligence, like spend tracking, ensuring the block being inserted extends the main chain, and unspending transactions when blocks are dropped, will no longer live in btcdb and instead will be moved to a higher layer where they belong.

Comment by Roasbeef Tuesday Dec 23, 2014 at 06:22 GMT

Overall, I'm very pleased with @davecgh's proposed changes to btcdb. We've been discussing/brainstorming the potential API changes via IRC over the past fews days as I've started work on adding the optional address index to btcd.

Some of the design flaws/quirks with the current interface have been made rather apparent as I've been taking a more critical look at btcdb's codebase in order to add the new functionality along with complete test coverage.

IMO the new separation of concerns (metadata vs blocks) that the proposed changes would bring about are much needed. The old interface appears to have been designed primarily with LevelDB in mind, as well as storage of metadata and blocks stuffed into the same location (which forces all drivers to have DropAfterBlockBySha implemented). btcchain would also need to be altered behavior wise to use the new interface. Having a set of core, atomic operations for implementing metadata related functionality (indexes, spend data, etc.) will make such features much easier to add in the future. Additionally, it should also allow various drivers a higher degree of freedom implementation wise. As of now, adding a new btcdb feature entails adding a new general, opaque function to the db interface, tightly coupling the various database drivers together. As @davecgh said, this process results in unnecessary duplication when targeting new features for a specific driver.

I look forward toward to the ultimate interface re-write whatever shape it may take. A more flexible interface a may attract new contributors seeking to add alternative databases for btcd's backend store.

Comment by jongillham Monday Dec 29, 2014 at 04:35 GMT

It was great to read @davecgh's btcdb proposal above. I think it's the right direction to go and I have found that this separation of concerns can work well with NoSQL datastores.

Here's the type of architecture that, from my experience, is compatible with concurrent block chain reads and NoSQL datastores:

type Db interface {
    FetchBlockByHash(blockHash *btcwire.ShaHash) (*btcutil.Block, error)
    FetchTxByHash(txHash *btcwire.ShaHash) (*btcwire.MsgTx, error)

    // There can be more than one block at a certain height.
    FetchBlocksByHeight(height int64) ([]*btcutil.Block, error)

    FetchHeighestBlocks() ([]*btcutil.Block, error)

    // A transaction can be in more than one block if a side branch contains 
    // it as well as the main branch.
    FetchTxBlocks(txHash *btcwire.ShaHash) ([]*btcutil.Block, error)

    // Returns a list of transaction hashes that reference txOut with their inputs.
    // Note that the presence of side branches means that TxOut can be
    // referenced in more than one transaction.
    FetchTxOutReferences(txOut *btcwire.TxOut) ([]*btcwire.ShaHash, error)

    InsertBlock(block *btcutil.Block) error
}

There are a couple of points to make with the above interface:

It embraces the fact that if we store side branches then multiple blocks can exist at the same height, one transaction can be in multiple blocks and one TxOut can be in multiple transactions.
InsertBlock needs to do an awful lot of work however it does not need to be done atomically and can be done in a roll forward idempotent way. This is especially important when considering NoSQL datastores where table locking is not an option. The following is the work InsertBlock needs to do and the order it could be done:
1. Insert each transaction into a tx hash index.
2. Iterate through all the new TxOuts for each block transaction and add them to a TxOut index.
3. Iterate through all the new TxIns and update previous TxOuts with a reference for FetchTxOutReferences.
4. Insert the block into a block hash index.
5. Insert the block into a height index.

The following is the functionality that should be calculated at a higher level by something like btcchain and not be pre calculated or stored in the datastore:

BestTip() (*btcutil.Block, error)
IsMainBranch(blockHash *btcwire.ShaHash) (bool, error)
IsUnspent(txOut *btcwire.TxOut) (bool, error)

BestTip() is calculated by iterating through the block chain and using the rules already in btcchain to select a winning block if there are two competing blocks at the same height. IsMainBranch() returns true for blockHash if it is the only block at that height. Otherwise it must iterate up the chain until it determines if blockHash is on the longest branch. If we reach the tip of the block chain then BestTip() can be used if multiple blocks are still competing. IsUnspent is calculated by calling Db.FetchTxOutReferences() for the specified TxOut and iterating through each referenced transaction to see if any of them are on a block on the main branch. If any are then the TxOut is considered spent.

Judicious use of caching for the above three functions could be possible so long as the cache is wiped as soon as InsertBlock updates the block height index.

Anyway, I am not sure this solution exactly addresses @davecgh's proposal however I am looking forward to see how it all evolves.

btcsuite / btcd

[database] Support multiple (non-best) chains, concurrent reads on different forks. #255