fungibit / chainscan

Feel the blockchain, one transaction at a time.
MIT License

Prefetch txs in a background thread #11

Open fungibit opened 7 years ago

fungibit commented 7 years ago

In "tight" iter_txs() loops, much of the time is spent deserializing the transactions.

In cases like this ...

for tx in iter_txs():
    some_io_bound_task(tx)

... it is useful to deserialize the txs of the next block in the bg, while the IO task runs.

Even in cases like this ...

for tx in iter_txs():
    some_cpu_bound_task(tx)

... it can also help, because some of the deserialization code runs with nogil, so prefetching can use more than one core concurrently. (It is hard to estimate whether the benefit is significant in this case, though. My intuition is that it can give up to a 20-30% speedup with little effort.)

This option should be enabled by default in iter_txs(), and disabled by default in iter_blocks().
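A minimal sketch of the idea, not chainscan's actual implementation: a generic wrapper that drains any iterator in a background thread and hands items to the consumer through a bounded queue. The names `prefetch` and `max_items` are hypothetical and chosen only for illustration. A real version inside iter_txs() would more likely prefetch one block's worth of txs at a time.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the source iterator


def prefetch(iterable, max_items=1000):
    """Yield items from `iterable`, producing them in a background thread.

    `max_items` (hypothetical parameter) bounds how far the producer may
    run ahead of the consumer, which caps memory use.
    """
    q = queue.Queue(maxsize=max_items)

    def worker():
        try:
            for item in iterable:
                q.put((item, None))
        except BaseException as exc:
            # Forward the error so it is re-raised in the consumer thread.
            q.put((None, exc))
        else:
            q.put((_DONE, None))

    # Daemon thread: it must not keep the process alive if the consumer
    # stops iterating early and the producer is blocked on a full queue.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        item, exc = q.get()
        if exc is not None:
            raise exc
        if item is _DONE:
            return
        yield item
```

With this wrapper, the first loop above would become `for tx in prefetch(iter_txs()): some_io_bound_task(tx)`. The overlap only helps to the extent that the producer releases the GIL (nogil sections or IO). Pure-Python work in the producer still contends with the consumer.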

fungibit commented 7 years ago

I profiled the time the GIL is held:

for tx in iter_txs(): pass                        -- 73%
for tx in iter_txs(track_spending=True): pass     -- 82%
for tx in iter_txs(track_scripts=True): pass      -- 86%

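The GIL is released for the remaining 14%-27% of the run time, and only that part can overlap with other Python code. This means that, for CPU-bound tasks, we can gain up to a 14%-27% speedup by prefetching in a bg thread. For IO-bound tasks, the gain can be higher, depending on the task.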
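The arithmetic behind that upper bound can be sketched as follows. It assumes an idealized model in which all GIL-free work overlaps perfectly with the GIL-holding work. The function name `max_time_saved` is hypothetical, and the fractions are the ones measured above.

```python
def max_time_saved(gil_held_fraction):
    """Upper bound on the fraction of run time that prefetching can hide.

    Idealized model: everything that runs without the GIL overlaps
    perfectly with the consumer, so only the GIL-held part remains.
    """
    if not 0.0 < gil_held_fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return 1.0 - gil_held_fraction


# GIL-held fractions from the profiling results above.
measured = {
    "iter_txs()": 0.73,
    "iter_txs(track_spending=True)": 0.82,
    "iter_txs(track_scripts=True)": 0.86,
}

for call, held in measured.items():
    print(f"{call}: up to {max_time_saved(held):.0%} of run time saved")
```

In practice the gain will be lower, because thread hand-off and queueing add overhead of their own.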