IntersectMBO / cardano-db-sync

A component that follows the Cardano chain and stores blocks and transactions in PostgreSQL
Apache License 2.0

Query and insert hashes in parallel #1739

Open kderme opened 2 weeks ago

kderme commented 2 weeks ago

Most of the work in db-sync is done sequentially. We could increase its parallelism by running queries on a separate thread that has its own, second connection to the db. For each list of blocks, this thread could extract the stake_address, multiasset, pool_hash and other hash keys that need to be resolved and that never roll back. It can look them up in the cache or resolve them, and it records each result in a shared Map. The main thread also has access to this Map and, for each key, waits in STM until the key has been resolved. The main thread will no longer use the cache.
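A minimal sketch of the shared Map idea, to make the waiting behaviour concrete. It is not db-sync code: `HashKey`, `DbId`, `Resolved`, `resolveKeys`, `waitFor` and `demo` are hypothetical names, and the `queryOrInsert` argument stands in for whatever the second thread would do on its own connection (cache lookup, query or insert). It only assumes the `stm` and `containers` packages.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM (STM, TVar, atomically, modifyTVar', newTVarIO, readTVar, retry)
import Control.Monad (forM_)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for a hash key (e.g. a stake address hash)
-- and the database id it resolves to.
type HashKey = String
type DbId = Int

-- The Map shared between the resolver thread and the main thread.
type Resolved = TVar (Map HashKey DbId)

-- Resolver thread: resolve each key with the supplied action (an assumed
-- placeholder for "look up in cache, else query or insert on the second
-- connection") and publish the result in the shared Map.
resolveKeys :: (HashKey -> IO DbId) -> Resolved -> [HashKey] -> IO ()
resolveKeys queryOrInsert var keys =
  forM_ keys $ \k -> do
    dbId <- queryOrInsert k
    atomically $ modifyTVar' var (Map.insert k dbId)

-- Main thread: block (via STM retry) until the key has been published.
waitFor :: Resolved -> HashKey -> STM DbId
waitFor var k = do
  m <- readTVar var
  maybe retry pure (Map.lookup k m)

-- Demo with a fake resolver that uses the key length as the id.
demo :: IO [DbId]
demo = do
  var <- newTVarIO Map.empty
  let keys = ["stake1", "pool"]
  _ <- forkIO (resolveKeys (pure . length) var keys)
  mapM (atomically . waitFor var) keys
```

With `retry`, the main thread is woken only when the Map changes, so it does not need to poll. A real version would also need to evict entries once a list of blocks has been processed, which this sketch leaves out.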

Having two open transactions raises the question of what happens if there is a crash after one has committed but not the other. We can make sure that the main thread waits for the second thread to commit before committing itself. If there is a crash after that and before the main thread has committed, it shouldn't cause problems, since the tables the second thread populates never roll back.
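The commit ordering could be enforced with a simple signal between the two threads. The sketch below is an assumption about how that might look, not an existing db-sync API: `runResolver`, `commitAfter` and `demoOrder` are hypothetical, and the `IO ()` arguments stand in for the real insert and commit actions on each connection.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar

-- Resolver thread: do its inserts, commit its own transaction,
-- and only then signal the main thread.
runResolver :: MVar () -> IO () -> IO () -> IO ()
runResolver done insertHashes commitResolver = do
  insertHashes
  commitResolver
  putMVar done ()

-- Main thread: wait for the resolver's commit before committing.
commitAfter :: MVar () -> IO () -> IO ()
commitAfter done commitMain = do
  readMVar done -- blocks until the resolver has committed
  commitMain

-- Demo that records the order in which the placeholder actions run.
demoOrder :: IO [String]
demoOrder = do
  logVar <- newMVar []
  done <- newEmptyMVar
  let record s = modifyMVar_ logVar (pure . (++ [s]))
  _ <- forkIO (runResolver done (record "resolver: insert hashes") (record "resolver: commit"))
  commitAfter done (record "main: commit")
  readMVar logVar
```

In this ordering, a crash can only leave the resolver's transaction committed without the main one, which is the case described above as harmless.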