Open pipermerriam opened 4 years ago
As a rough baseline, I am currently importing the chain at around block 4 million. At this height I'm seeing roughly 4 blocks per second and 700 rows per second (a row being a single database row written while storing the full block data).
What was wrong?
The ORM data model is set up with the following loose constraints:
Header: has a nullable foreign key to its parent
Block: must point to a header
Transaction: optionally points to a block
Receipt: must point to a transaction
Log: must point to a receipt

Currently, to import a block we build and bulk save this entire hierarchy for a single block. Blocks are imported sequentially and cannot be imported concurrently because of the foreign key constraint to the parent block.
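To make the constraints above concrete, here is a minimal sketch using Python's built-in `sqlite3` module rather than the project's actual ORM. All table and column names are hypothetical; only the nullable/required relationships are taken from the description above.

```python
import sqlite3

# Hypothetical schema: names are illustrative, only the nullability
# of each foreign key mirrors the constraints listed in the issue.
SCHEMA = """
CREATE TABLE header (
    hash        TEXT PRIMARY KEY,
    parent_hash TEXT REFERENCES header(hash)               -- nullable
);
CREATE TABLE block (
    header_hash TEXT PRIMARY KEY NOT NULL REFERENCES header(hash)
);
CREATE TABLE "transaction" (
    hash              TEXT PRIMARY KEY,
    block_header_hash TEXT REFERENCES block(header_hash)   -- nullable
);
CREATE TABLE receipt (
    transaction_hash TEXT PRIMARY KEY NOT NULL REFERENCES "transaction"(hash)
);
CREATE TABLE log (
    id           INTEGER PRIMARY KEY,
    receipt_hash TEXT NOT NULL REFERENCES receipt(transaction_hash)
);
"""


def make_db():
    """Create an in-memory database with foreign key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def rejected(conn, sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


conn = make_db()

# Allowed: a header with no parent and a transaction with no block.
null_parent_ok = not rejected(
    conn, "INSERT INTO header (hash, parent_hash) VALUES ('h1', NULL)"
)
null_block_ok = not rejected(
    conn, "INSERT INTO \"transaction\" (hash, block_header_hash) VALUES ('t1', NULL)"
)

# Rejected: a header whose non-null parent has not been stored yet.
# This is the ordering dependency that forces sequential imports.
missing_parent_rejected = rejected(
    conn, "INSERT INTO header (hash, parent_hash) VALUES ('h3', 'h2')"
)

# Rejected: a block without a header.
block_without_header_rejected = rejected(
    conn, "INSERT INTO block (header_hash) VALUES (NULL)"
)
```

The third insert is the interesting one: as long as the parent pointer is populated at insert time, a header can only be written after its parent.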
However, since Headers can have a null parent and Transactions can have a null block, we should be able to add a level of concurrency to make data loading more efficient.
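One way the nullable parent could be exploited is a two-phase load: insert headers in any order with a null parent, then fill in the parent pointers once all rows exist. The sketch below uses Python's built-in `sqlite3` module; the `header` table, its column names, and the fix-up pass are assumptions made for illustration, not something the issue specifies.

```python
import sqlite3


def load_headers_two_phase(conn, headers):
    """Load (hash, parent_hash) pairs that may arrive in any order.

    Hypothetical helper: the fix-up pass is an assumed design.
    """
    headers = list(headers)

    # Phase 1: insert every header with a NULL parent. No row depends on
    # another, so this phase could be split across concurrent workers.
    conn.executemany(
        "INSERT INTO header (hash, parent_hash) VALUES (?, NULL)",
        [(hash_,) for hash_, _ in headers],
    )

    # Phase 2: set parent pointers now that every referenced row exists.
    conn.executemany(
        "UPDATE header SET parent_hash = ? WHERE hash = ?",
        [(parent, hash_) for hash_, parent in headers if parent is not None],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE header ("
    " hash TEXT PRIMARY KEY,"
    " parent_hash TEXT REFERENCES header(hash))"
)

# Children are deliberately listed before their parents.
load_headers_two_phase(conn, [("c", "b"), ("a", None), ("b", "a")])

rows = conn.execute(
    "SELECT hash, parent_hash FROM header ORDER BY hash"
).fetchall()
```

The example runs both phases on one connection for simplicity. With a real database, phase 1 is the part that could be parallelised, since none of its inserts reference another row.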
How can it be fixed?
We should be able to adjust our pipeline such that:
null parent pointer.

Before doing this we need some benchmarks in place to measure performance. I would suggest we benchmark against a wide range of real mainnet blocks.
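As a starting point for such a benchmark, here is a minimal sketch that reports the two figures quoted in the baseline, blocks per second and rows per second. The `import_block` callable and the `Throughput` type are hypothetical: the sketch assumes the importer returns the number of database rows it wrote for each block.

```python
import time
from typing import Callable, Iterable, NamedTuple


class Throughput(NamedTuple):
    blocks: int
    rows: int
    seconds: float

    @property
    def blocks_per_second(self) -> float:
        return self.blocks / self.seconds

    @property
    def rows_per_second(self) -> float:
        return self.rows / self.seconds


def benchmark_import(
    blocks: Iterable[object],
    import_block: Callable[[object], int],
    clock: Callable[[], float] = time.perf_counter,
) -> Throughput:
    """Time an import run.

    `import_block` is a hypothetical callable that stores one block and
    returns how many database rows it wrote.
    """
    n_blocks = 0
    n_rows = 0
    start = clock()
    for block in blocks:
        n_rows += import_block(block)
        n_blocks += 1
    elapsed = clock() - start
    return Throughput(n_blocks, n_rows, elapsed)


# Illustration with made-up numbers: a fake importer that writes 175 rows
# per block, and a fake clock that reports a 2 second run.
ticks = iter([0.0, 2.0])
result = benchmark_import(
    blocks=range(8),
    import_block=lambda block: 175,
    clock=lambda: next(ticks),
)
```

In a real run the `clock` argument would be left at its default and `blocks` would be a sample of mainnet blocks drawn from different heights, since block sizes vary widely across the chain's history.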