Block verification logic is commingled with data persistence logic

When manually inspecting block validation code, we noticed that the block validation logic is done in pieces, with a sqlite database as an intermediary. Or in other words, the logic for data validation and the logic for data persistence are commingled.

In particular, there is a "staging" sqlite database that appears to be a scratch space for blocks that haven't yet been incorporated into the main block store. The process_next_staging_block function is eventually called to process and (partially) validate aspects of a staged block:

https://github.com/trailofbits/x-audit-blockstack-core/blob/e2d3d5bae539d242851620e28129af6c4a9de642/src/chainstate/stacks/db/blocks.rs#L2066-L2073

However, note that in the implementation of that function, it is assumed that what is coming out of the staging database can be trusted, and that it's already been (partially) validated:

https://github.com/trailofbits/x-audit-blockstack-core/blob/e2d3d5bae539d242851620e28129af6c4a9de642/src/chainstate/stacks/db/blocks.rs#L2142-L2144

There are two downsides to this design (where data validation logic and data persistence logic are commingled):

It makes it difficult to reason about the correctness of the data validation logic.
It trusts what comes out of the persistence layer — though this may be unfounded! An attacker with access to the filesystem of an elected Blocks leader could modify data in between the time it is written to the persistence layer and the time when the Blocks node re-reads that data. This vector could be used, for example, to modify the contents of microblocks that the leader will commit to the blockchain. (Note that this seems like a high-difficulty, sophisticated attack.)

Our recommendation is that the block & transaction validation logic be cleanly separated from the data persistence logic, and that the validation logic is performed after data is retrieved from the persistence layer.

In short: do not trust what you're reading from the persistence layer to have integrity or authenticity; instead, verify those properties as data is retrieved.

diwakergupta / stacks-blockchain-tob-audit

Block verification logic is commingled with data persistence logic #9