eqlabs / pathfinder

A Starknet full node written in Rust
https://eqlabs.github.io/pathfinder/

Decouple trie updates from write transactions #1934

Open Mirko-von-Leipzig opened 6 months ago

Mirko-von-Leipzig commented 6 months ago

Our trie update logic is currently tied directly to a database write transaction: updating the state trie for a given block can only happen in the same transaction that inserts the block into storage. As a consequence, the next block's trie update can only begin once the current block has been inserted, which prevents pipelining of block processing - each block waits for its parent's state trie update to land in storage. Currently this is achieved by simply performing the trie update in the same process as inserting the rest of the block data - which kind of sucks because the trie update is also quite expensive.

Basically, for every other bit of data we can separate the processing from the writing to storage. Except for state trie updates. Your job is to fix this :)

Why is it like this

When a trie is updated, it produces a set of new nodes that must be inserted into storage. Each node requires a unique node ID, which the trie graph uses to reference that node. These IDs are generated by storage when the nodes are inserted.

This means the next update must first wait for those IDs to be generated and for those nodes to be stored - otherwise the next update might attempt to read one of these new nodes, which would fail.

Proposed solution

Add a new method to storage which returns a set of new node indices for the trie update to use. Importantly, this would not rely on a db transaction, but rather on something like an `Arc<Mutex<u64>>` (one per trie table).

This relies on indices only ever increasing - i.e. if nodes get deleted, we never reuse the fragmented holes, since that would require db transactions, which is no bueno.
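A minimal sketch of such a counter-backed allocator, assuming it is seeded from the table's current maximum index at startup. `NodeIndexAllocator` and its method names are hypothetical, not pathfinder's actual API:

```rust
use std::ops::Range;
use std::sync::{Arc, Mutex};

/// Hypothetical allocator handing out trie node indices without a db
/// transaction. One instance per trie table; the counter is seeded from
/// the table's current max index at startup.
#[derive(Clone)]
struct NodeIndexAllocator {
    next: Arc<Mutex<u64>>,
}

impl NodeIndexAllocator {
    fn new(next_free_index: u64) -> Self {
        Self {
            next: Arc::new(Mutex::new(next_free_index)),
        }
    }

    /// Reserve `count` consecutive indices. Indices only ever increase;
    /// holes left by deletions are never reused.
    fn reserve(&self, count: u64) -> Range<u64> {
        let mut next = self.next.lock().unwrap();
        let start = *next;
        *next += count;
        start..*next
    }
}
```

Because the allocator is `Clone` (it just clones the `Arc`), every updater can hold a handle to the same counter without touching the database.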

This will enable the following pattern:

  1. Trie updater has N new nodes
  2. Request and receive N indices
  3. Assign indices to the new nodes
  4. Trie update result can then be passed on to the actual writer task

Crucially, steps 1-3 do not have to be part of the writer process.
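The steps above could be wired up roughly as follows - a sketch only, with hypothetical types (`NewNode`, `TrieUpdate`) standing in for the real ones, a bare `Arc<Mutex<u64>>` as the shared counter, and an mpsc channel as the handoff to the writer task:

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical types - names are illustrative, not pathfinder's actual API.
struct NewNode {
    hash: [u8; 32],
    index: Option<u64>,
}

struct TrieUpdate {
    nodes: Vec<NewNode>,
}

/// Steps 2 and 3: reserve N consecutive indices from the shared counter
/// and assign them to the update's new nodes. No db transaction involved.
fn assign_indices(update: &mut TrieUpdate, next_index: &Arc<Mutex<u64>>) {
    let n = update.nodes.len() as u64;
    let start = {
        let mut next = next_index.lock().unwrap();
        let start = *next;
        *next += n;
        start
    }; // lock is released here, before the assignment loop
    for (i, node) in update.nodes.iter_mut().enumerate() {
        node.index = Some(start + i as u64);
    }
}

/// Step 4: completed updates are handed to a dedicated writer task over a
/// channel; only that task ever touches the write transaction.
fn run_pipeline(mut updates: Vec<TrieUpdate>) -> Vec<Vec<u64>> {
    let next_index = Arc::new(Mutex::new(0u64));
    let (tx, rx) = mpsc::channel::<TrieUpdate>();

    let writer = thread::spawn(move || {
        let mut written = Vec::new();
        for update in rx {
            // In the real node this is where the db insert would happen.
            written.push(update.nodes.iter().map(|n| n.index.unwrap()).collect());
        }
        written
    });

    for update in &mut updates {
        assign_indices(update, &next_index);
    }
    for update in updates {
        tx.send(update).unwrap();
    }
    drop(tx); // close the channel so the writer task terminates
    writer.join().unwrap()
}
```

The key property is that `assign_indices` never blocks on the writer: the next block's trie update can start as soon as indices are reserved, regardless of how far behind the writer is.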

Additional considerations

Because the updater is now decoupled from the writer task, the updater no longer knows when its new nodes will actually enter storage. It therefore has to keep a local cache of the nodes it has emitted for storage, until it is certain they have been stored. One way to do this is to check the database at the start of each update: if the cached nodes are in storage by then, they can be dropped from the cache.

This caching gets a bit trickier if you consider that contract tries can be updated concurrently - so they must share a cache.
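A sketch of what such a shared cache might look like, assuming nodes are keyed by their reserved index and that the caller can ask the database whether an index has been persisted. `PendingNodes` and its methods are hypothetical names:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical shared cache of trie nodes that have been handed to the
/// writer task but may not be in storage yet. Cloned handles share the
/// same map, so concurrent contract-trie updaters all see the same cache.
#[derive(Clone, Default)]
struct PendingNodes {
    inner: Arc<Mutex<HashMap<u64, Vec<u8>>>>, // index -> serialized node
}

impl PendingNodes {
    /// Record a node that has been emitted for storage but not yet stored.
    fn insert(&self, index: u64, node: Vec<u8>) {
        self.inner.lock().unwrap().insert(index, node);
    }

    /// Read path: consult the cache first; on a miss, fall back to storage.
    fn get(&self, index: u64) -> Option<Vec<u8>> {
        self.inner.lock().unwrap().get(&index).cloned()
    }

    /// Called at the start of each update: drop every cached node that the
    /// database now reports as persisted.
    fn prune(&self, is_persisted: impl Fn(u64) -> bool) {
        self.inner.lock().unwrap().retain(|&idx, _| !is_persisted(idx));
    }
}
```

`prune` takes a callback here purely to keep the sketch database-free; in practice it would query storage directly.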

... there was another point but it has slipped my mind...

kkovaacs commented 5 months ago

FYI: SQLite insert performance is significantly worse when using an externally generated integer primary key. The reason for this is that SQLite needs to check the uniqueness of the data we're passing in -- it can skip this check when generating a unique id on its own.

Mirko-von-Leipzig commented 5 months ago

> FYI: SQLite insert performance is significantly worse when using an externally generated integer primary key. The reason for this is that SQLite needs to check the uniqueness of the data we're passing in -- it can skip this check when generating a unique id on its own.

Hmm. That's a bummer but must surely be solvable somehow..

`pragma_ignore_check_constraints` looks like the thing to use?

kkovaacs commented 1 month ago

That pragma seems to affect only CHECK constraints in the schema definition according to the docs:

> CHECK constraints are only verified when the table is written, not when it is read. Furthermore, verification of CHECK constraints can be temporarily disabled using the "PRAGMA ignore_check_constraints=ON;" statement. Hence, it is possible that a query might produce results that violate the CHECK constraints.