Closed · mattsse closed this 3 years ago
Just tested this, for 100 blocks it took me 196 seconds on master (from a remote node), whereas with 10 tasks it took 23s and with 25 tasks 20s. This PR seems to do >90% of the work towards parallelizing the network layer / block processing. Sick work, thank you.
A next step here would be for us to further improve the database layer by batch inserting. I think the way to do this is by storing N `Evaluation`s in `MevDB` in-memory, and then `COPY`ing each batch of N evaluations to the DB instead of `INSERT`ing each time, as the latter can be inefficient.
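As a rough sketch of the batching idea, here is a hypothetical in-memory buffer that accumulates `Evaluation`s and flushes them in batches of N; the `Evaluation` struct, `BatchBuffer` type, and the `flushed` vector (standing in for the `COPY` target) are all illustrative, not the real `MevDB` API:

```rust
/// Illustrative stand-in for the real `Evaluation` type.
#[derive(Debug, Clone, PartialEq)]
struct Evaluation {
    block: u64,
}

/// Hypothetical buffer: collect evaluations and flush them `capacity`
/// at a time, instead of issuing one insert per evaluation.
struct BatchBuffer {
    capacity: usize,
    pending: Vec<Evaluation>,
    flushed: Vec<Vec<Evaluation>>, // stands in for batches sent via COPY
}

impl BatchBuffer {
    fn new(capacity: usize) -> Self {
        Self { capacity, pending: Vec::new(), flushed: Vec::new() }
    }

    /// Queue one evaluation; flush as soon as a full batch is ready.
    fn push(&mut self, eval: Evaluation) {
        self.pending.push(eval);
        if self.pending.len() >= self.capacity {
            self.flush();
        }
    }

    /// In a real DB layer this would issue a single COPY for the batch.
    fn flush(&mut self) {
        if !self.pending.is_empty() {
            self.flushed.push(std::mem::take(&mut self.pending));
        }
    }
}

fn main() {
    let mut buf = BatchBuffer::new(10);
    for block in 0..25u64 {
        buf.push(Evaluation { block });
    }
    buf.flush(); // flush the final partial batch
    let sizes: Vec<usize> = buf.flushed.iter().map(|b| b.len()).collect();
    println!("{:?}", sizes); // [10, 10, 5]
}
```

The trade-off is the usual one: larger batches mean fewer round-trips to the DB, at the cost of more evaluations held in memory (and lost on a crash) before they are persisted.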
Amazing! Thank you @mattsse!
Motivation
This attempts to fix #24 by adding support for spawning the processing of blocks onto separate tasks.
Solution
Introduced new `Stream` types:

- `BatchEvaluator`: a `Stream: Send + Sync` that processes multiple blocks and their inspections and yields the `Evaluation`s. `dyn Inspector`s and `dyn Reducer`s are therefore also required to be `Send + Sync`.
- `BatchInserts`: takes a `Stream` of `Evaluation`s and puts them in the DB.

The `BatchEvaluator` can be spawned onto a new task. A new `task` option in the `BlockOpts` controls how many `BatchEvaluator`s should be spawned; the range of blocks is then divided equally among the `BatchEvaluator`s, which all pipe their `Evaluation`s via channels to the `BatchInserts`. Right now a single `MevDB` handle is used, but it would be possible to add more.

Since I don't have access to an archive node, I wasn't able to test that yet. Any tips on how I can test this would be appreciated 🙃
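The fan-out/fan-in shape described above can be sketched with plain threads and channels; this is not the PR's actual code (which uses async `Stream`s), just a minimal model of splitting a block range evenly across workers that all pipe results to a single consumer, the way the `BatchEvaluator`s feed one `BatchInserts`:

```rust
use std::sync::mpsc;
use std::thread;

fn main() {
    // Illustrative parameters: blocks [0, 100) split across 4 workers.
    let (start, end, tasks) = (0u64, 100u64, 4u64);
    let (tx, rx) = mpsc::channel::<u64>();

    let chunk = (end - start + tasks - 1) / tasks; // ceiling division
    for i in 0..tasks {
        let tx = tx.clone();
        let lo = start + i * chunk;
        let hi = (lo + chunk).min(end);
        // Each worker stands in for one BatchEvaluator processing its slice.
        thread::spawn(move || {
            for block in lo..hi {
                // Stand-in for evaluating the block and its inspections.
                tx.send(block).unwrap();
            }
        });
    }
    drop(tx); // close the channel; workers hold their own clones

    // Single consumer, standing in for BatchInserts writing to one MevDB.
    let mut received: Vec<u64> = rx.iter().collect();
    received.sort();
    assert_eq!(received, (start..end).collect::<Vec<u64>>());
    println!("processed {} blocks across {} tasks", received.len(), tasks);
}
```

Because every worker sends into the same channel, the single DB handle never sees concurrent writers, which is why one `MevDB` handle suffices here.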
Two more `batch` subcommand options are added:

- `tasks`: how many tasks should be used for fetching all the info
- `max-requests`: how many requests each task is allowed to execute concurrently
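A hedged sketch of the `max-requests` idea: a channel pre-filled with permits can act as a semaphore, so a task never has more than `max_requests` requests in flight at once. All names here are illustrative, and the sleeps stand in for real RPC calls:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let max_requests = 3usize;

    // A channel holding `max_requests` permits acts as a counting semaphore.
    let (permit_tx, permit_rx) = mpsc::channel::<()>();
    for _ in 0..max_requests {
        permit_tx.send(()).unwrap();
    }
    let permit_rx = Arc::new(Mutex::new(permit_rx));

    // Track how many "requests" run at once, to verify the limit holds.
    let in_flight = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let mut handles = Vec::new();
    for _block in 0..20 {
        let permit_rx = Arc::clone(&permit_rx);
        let permit_tx = permit_tx.clone();
        let in_flight = Arc::clone(&in_flight);
        let peak = Arc::clone(&peak);
        handles.push(thread::spawn(move || {
            permit_rx.lock().unwrap().recv().unwrap(); // acquire a permit
            let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
            peak.fetch_max(now, Ordering::SeqCst);
            thread::sleep(Duration::from_millis(5)); // stand-in "request"
            in_flight.fetch_sub(1, Ordering::SeqCst);
            permit_tx.send(()).unwrap(); // release the permit
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    assert!(peak.load(Ordering::SeqCst) <= max_requests);
    println!("peak concurrency: {}", peak.load(Ordering::SeqCst));
}
```

In an async codebase a `tokio::sync::Semaphore` or `StreamExt::buffer_unordered(max_requests)` would express the same bound more directly; the permit channel above is just the dependency-free version of that idea.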