Closed moodyjon closed 2 years ago
I tried the prefetching approach, and did not get significant improvement. I was only able to write prefetching logic (pre_advance_block()
) for a handful of tables, as the logic of advance_block()
is complicated enough to make it very difficult to replicate.
I also tried multi_put/multi_delete for selected tables. The code is in pull request #88. I didn't see significant performance improvement in a 1hr run, though @jackrobison may be able to measure the effect.
Finally, I tried disabling the expensive part of RevertableOpStack where it fetches prior values from the DB to create undo records. For test purposes, I disabled the fetching when --reorg_limit=0
, though in any real scenario we would want to compare the DB height with LBCD height to decide whether reorg is a practical concern and the checks should be turned on because we're close enough to the tip of chain.
Result over 1hr runs showed...
Safe: (931553-930114)=1439 blocks Unsafe: (932559-930114)=2445 blocks (70% more blocks)
The baseline and unsafe scenarios were both evaluated on top of PR #88.
py-spy top (baseline):
%Own %Total OwnTime TotalTime Function (filename)
0.00% 74.00% 0.000s 2425s run (threading.py)
0.00% 74.00% 0.000s 2425s _bootstrap_inner (threading.py)
0.00% 74.00% 0.000s 2425s _bootstrap (threading.py)
0.00% 74.00% 0.070s 2425s _worker (concurrent/futures/thread.py)
0.00% 74.00% 0.060s 2425s run (concurrent/futures/thread.py)
1.00% 74.00% 5.01s 2425s advance_block (hub/scribe/service.py)
1.00% 55.00% 13.73s 1471s _get_takeover_ops (hub/scribe/service.py)
45.00% 47.00% 1187s 1243s iterate (hub/db/interface.py)
1.00% 47.00% 6.18s 1223s _cached_get_active_amount (hub/scribe/service.py)
0.00% 46.00% 3.06s 1173s _get_pending_claim_amount (hub/scribe/service.py)
0.00% 45.00% 33.68s 1165s get_active_amount_as_of_height (hub/db/db.py)
14.00% 15.00% 542.3s 563.0s get (hub/db/interface.py)
7.00% 7.00% 352.2s 353.9s multi_get (hub/db/interface.py)
0.00% 7.00% 0.210s 342.4s stage_multi_put (hub/db/interface.py)
0.00% 7.00% 4.86s 337.6s multi_put (hub/db/revertable.py)
0.00% 6.00% 8.84s 323.9s append_op (hub/db/revertable.py)
0.00% 6.00% 2.63s 278.3s stage_put (hub/db/interface.py)
0.00% 9.00% 0.730s 265.6s _add_claim_or_support (hub/scribe/service.py)
0.00% 9.00% 1.19s 248.1s _add_claim_or_update (hub/scribe/service.py)
0.00% 6.00% 1.34s 177.8s get_add_claim_utxo_ops (hub/scribe/service.py)
0.00% 0.00% 0.070s 126.8s _get_update_hashX_histories_ops (hub/scribe/service.py)
0.00% 0.00% 0.260s 123.0s _spend_claim_or_support_txo (hub/scribe/service.py)
0.00% 4.00% 2.56s 93.21s get_claim_txo (hub/db/db.py)
0.00% 1.00% 0.300s 88.75s _get_pending_effective_amount (hub/scribe/service.py)
0.00% 1.00% 0.150s 78.73s <dictcomp> (hub/scribe/service.py)
0.00% 0.00% 0.490s 78.65s get_delayed_activate_ops (hub/scribe/service.py)
0.00% 0.00% 1.30s 74.84s _get_cumulative_update_ops (hub/scribe/service.py)
0.00% 1.00% 1.33s 68.13s spend_utxo (hub/scribe/service.py)
0.00% 0.00% 0.290s 67.32s get_activate_ops (hub/scribe/service.py)
0.00% 0.00% 0.490s 63.68s _spend_support_txo (hub/scribe/service.py)
0.00% 0.00% 0.520s 59.09s _spend_claim_txo (hub/scribe/service.py)
0.00% 0.00% 0.750s 58.37s stage_delete (hub/db/interface.py)
0.00% 1.00% 0.530s 53.20s _get_pending_supported_amount (hub/scribe/service.py)
0.00% 2.00% 5.10s 52.38s <lambda> (hub/db/interface.py)
0.00% 1.00% 0.270s 52.26s _get_active_amount (hub/db/db.py)
1.00% 1.00% 1.54s 51.99s <genexpr> (hub/db/db.py)
0.00% 2.00% 2.23s 41.21s get_future_activated (hub/db/db.py)
1.00% 2.00% 31.22s 39.16s unpack_value (hub/db/prefixes.py)
0.00% 0.00% 0.370s 33.88s get_pending_tx_num (hub/scribe/service.py)
0.00% 0.00% 0.620s 33.51s get_tx_num (hub/db/db.py)
0.00% 0.00% 29.52s 29.52s iterator (hub/db/interface.py)
0.00% 0.00% 0.170s 29.22s get_remove_activate_ops (hub/scribe/service.py)
0.00% 0.00% 27.23s 28.46s get_pending (hub/db/interface.py)
0.00% 0.00% 0.340s 27.53s stage_multi_delete (hub/db/interface.py)
0.00% 0.00% 0.070s 25.99s get_cached_claim_exists (hub/db/db.py)
0.00% 0.00% 0.360s 25.92s get_cached_claim_hash (hub/db/db.py)
0.00% 0.00% 3.36s 25.20s multi_delete (hub/db/revertable.py)
0.00% 2.00% 0.060s 24.84s get_activation (hub/db/db.py)
0.00% 0.00% 0.660s 24.43s get_claims_for_name (hub/db/db.py)
0.00% 0.00% 0.600s 22.55s get_url_effective_amount (hub/db/db.py)
0.00% 0.00% 18.03s 21.15s unpack_key (hub/db/prefixes.py)
0.00% 0.00% 15.59s 20.14s unsafe_commit (hub/db/interface.py)
0.00% 1.00% 0.880s 19.70s add_utxo (hub/scribe/service.py)
0.00% 0.00% 0.300s 19.46s get_remove_claim_utxo_ops (hub/scribe/service.py)
0.00% 1.00% 0.860s 18.82s hashX_from_txo (hub/scribe/network.py)
1.00% 1.00% 11.14s 18.36s pack_key (hub/db/prefixes.py)
0.00% 0.00% 0.560s 16.48s _add_support (hub/scribe/service.py)
0.00% 0.00% 0.320s 13.41s __get__ (functools.py)
0.00% 0.00% 0.230s 13.21s get_supported_claim_from_txo (hub/db/db.py)
0.00% 0.00% 0.160s 13.09s metadata (hub/scribe/transaction/__init__.py)
0.00% 0.00% 0.160s 13.00s get_controlling (hub/scribe/service.py)
0.00% 0.00% 0.240s 12.93s from_bytes (hub/schema/claim.py)
0.00% 0.00% 0.100s 12.84s get_controlling_claim (hub/db/db.py)
0.00% 0.00% 0.330s 12.69s from_bytes (hub/schema/base.py)
0.00% 0.00% 0.310s 11.34s get_claim_txos_for_name (hub/db/db.py)
0.00% 0.00% 0.140s 10.91s ParseFromString (protobuf/message.py)
0.00% 0.00% 0.000s 10.77s run (hub/service.py)
0.00% 0.00% 0.000s 10.77s _run_code (runpy.py)
0.00% 0.00% 0.000s 10.77s <module> (hub/scribe/__main__.py)
0.00% 0.00% 0.000s 10.77s main (hub/scribe/__main__.py)
py-spy top (unsafe):
%Own %Total OwnTime TotalTime Function (filename)
0.00% 23.00% 0.000s 2924s _bootstrap (threading.py)
0.00% 23.00% 0.110s 2924s _worker (concurrent/futures/thread.py)
0.00% 23.00% 0.000s 2924s run (threading.py)
0.00% 23.00% 0.000s 2924s _bootstrap_inner (threading.py)
0.00% 23.00% 0.140s 2924s run (concurrent/futures/thread.py)
0.00% 23.00% 6.11s 2923s advance_block (hub/scribe/service.py)
0.00% 22.00% 22.41s 2385s _get_takeover_ops (hub/scribe/service.py)
21.00% 21.00% 2039s 2137s iterate (hub/db/interface.py)
0.00% 21.00% 10.07s 2094s _cached_get_active_amount (hub/scribe/service.py)
0.00% 21.00% 5.23s 2002s _get_pending_claim_amount (hub/scribe/service.py)
0.00% 21.00% 58.02s 1987s get_active_amount_as_of_height (hub/db/db.py)
2.00% 2.00% 356.6s 389.9s get (hub/db/interface.py)
0.00% 1.00% 0.420s 163.8s _get_pending_effective_amount (hub/scribe/service.py)
0.00% 2.00% 4.01s 152.4s get_claim_txo (hub/db/db.py)
0.00% 0.00% 0.320s 146.6s <dictcomp> (hub/scribe/service.py)
0.00% 1.00% 2.25s 120.0s _get_cumulative_update_ops (hub/scribe/service.py)
0.00% 0.00% 1.20s 111.6s spend_utxo (hub/scribe/service.py)
0.00% 0.00% 0.210s 106.6s _spend_claim_or_support_txo (hub/scribe/service.py)
0.00% 0.00% 0.570s 105.0s _add_claim_or_support (hub/scribe/service.py)
0.00% 0.00% 1.65s 100.6s _add_claim_or_update (hub/scribe/service.py)
0.00% 0.00% 1.10s 98.94s _get_pending_supported_amount (hub/scribe/service.py)
0.00% 0.00% 0.730s 96.99s _get_active_amount (hub/db/db.py)
0.00% 0.00% 3.38s 96.26s <genexpr> (hub/db/db.py)
0.00% 0.00% 8.25s 91.40s <lambda> (hub/db/interface.py)
0.00% 0.00% 4.35s 75.15s get_future_activated (hub/db/db.py)
0.00% 0.00% 50.56s 63.86s unpack_value (hub/db/prefixes.py)
0.00% 0.00% 0.560s 60.42s _spend_claim_txo (hub/scribe/service.py)
0.00% 0.00% 52.56s 52.56s iterator (hub/db/interface.py)
0.00% 0.00% 0.660s 50.19s get_pending_tx_num (hub/scribe/service.py)
0.00% 0.00% 0.800s 49.53s get_tx_num (hub/db/db.py)
0.00% 0.00% 1.17s 46.49s get_claims_for_name (hub/db/db.py)
0.00% 0.00% 0.740s 45.95s _spend_support_txo (hub/scribe/service.py)
0.00% 0.00% 1.33s 43.75s get_url_effective_amount (hub/db/db.py)
0.00% 0.00% 37.64s 39.23s get_pending (hub/db/interface.py)
0.00% 0.00% 32.16s 38.42s unpack_key (hub/db/prefixes.py)
0.00% 0.00% 28.71s 36.79s unsafe_commit (hub/db/interface.py)
0.00% 0.00% 0.100s 36.13s get_cached_claim_exists (hub/db/db.py)
0.00% 0.00% 0.340s 36.03s get_cached_claim_hash (hub/db/db.py)
0.00% 0.00% 0.200s 35.28s get_activation (hub/db/db.py)
0.00% 0.00% 19.89s 29.40s pack_key (hub/db/prefixes.py)
0.00% 0.00% 0.960s 23.54s add_utxo (hub/scribe/service.py)
0.00% 0.00% 0.760s 22.58s hashX_from_txo (hub/scribe/network.py)
0.00% 0.00% 0.790s 21.92s get_delayed_activate_ops (hub/scribe/service.py)
0.00% 0.00% 2.64s 19.31s stage_put (hub/db/interface.py)
0.00% 0.00% 0.580s 18.57s __get__ (functools.py)
0.00% 0.00% 0.120s 17.99s metadata (hub/scribe/transaction/__init__.py)
0.00% 0.00% 0.130s 17.87s from_bytes (hub/schema/claim.py)
0.00% 0.00% 0.370s 17.74s from_bytes (hub/schema/base.py)
0.00% 0.00% 0.270s 17.64s get_controlling (hub/scribe/service.py)
0.00% 0.00% 0.090s 17.37s get_controlling_claim (hub/db/db.py)
0.00% 0.00% 0.760s 17.24s _read_tx_parts (hub/scribe/transaction/deserializer.py)
0.00% 0.00% 0.020s 16.87s run_forever (asyncio/base_events.py)
0.00% 0.00% 0.000s 16.87s _run_code (runpy.py)
0.00% 0.00% 0.000s 16.87s run (hub/service.py)
0.00% 0.00% 0.000s 16.87s main (hub/scribe/__main__.py)
0.00% 0.00% 0.000s 16.87s <module> (hub/scribe/__main__.py)
0.00% 0.00% 0.000s 16.87s run_until_complete (asyncio/base_events.py)
0.00% 0.00% 0.000s 16.87s _run_module_as_main (runpy.py)
0.00% 0.00% 0.070s 16.85s _run_once (asyncio/base_events.py)
0.00% 0.00% 0.300s 16.84s get_supported_claim_from_txo (hub/db/db.py)
0.00% 0.00% 0.040s 16.76s _run (asyncio/events.py)
0.00% 0.00% 0.340s 16.32s get_claim_txos_for_name (hub/db/db.py)
0.00% 0.00% 0.280s 16.09s <listcomp> (hub/scribe/transaction/deserializer.py)
0.00% 0.00% 0.070s 15.38s ParseFromString (protobuf/message.py)
0.00% 0.00% 0.230s 15.06s MergeFromString (protobuf/internal/python_message.py)
0.00% 0.00% 1.42s 14.83s InternalParse (protobuf/internal/python_message.py)
0.00% 0.00% 2.68s 14.78s <listcomp> (hub/scribe/service.py)
0.00% 0.00% 0.070s 14.76s to_message_bytes (hub/schema/base.py)
0.00% 0.00% 0.170s 14.69s SerializeToString (protobuf/internal/python_message.py)
0.00% 0.00% 0.170s 13.97s stage_multi_put (hub/db/interface.py)
0.00% 0.00% 0.700s 13.33s process_blocks_and_mempool_forever (hub/scribe/service.py)
0.00% 0.00% 0.010s 12.63s check_and_advance_blocks (hub/scribe/service.py)
0.00% 0.00% 0.350s 12.53s SerializePartialToString (protobuf/internal/python_message.py)
0.00% 0.00% 0.190s 12.36s _read_outputs (hub/scribe/transaction/deserializer.py)
0.00% 0.00% 0.460s 12.34s verify_signed_metadata (hub/scribe/network.py)
0.00% 0.00% 1.72s 12.18s InternalSerialize (protobuf/internal/python_message.py)
0.00% 0.00% 0.470s 12.14s pay_to_address_script (hub/scribe/network.py)
0.00% 0.00% 0.000s 12.10s read_tx_block (hub/scribe/transaction/deserializer.py)
0.00% 0.00% 0.000s 12.10s block (hub/scribe/network.py)
0.00% 0.00% 0.030s 12.04s read_tx_and_hash (hub/scribe/transaction/deserializer.py)
0.00% 0.00% 0.570s 11.91s _read_output (hub/scribe/transaction/deserializer.py)
Jack says the revert-ability checks could be reduced to constant time complexity if we want
I'm running
scribe
to do a full build of the hub databases. (Yes I know that the result may be obsoleted soon by addition of new tables.)What I see is that
BlockchainProcessorService.advance_block()
is IO-bound by the need to read from random records. It reads the old values for keys before they are deleted/modified to make RevertableDelete and RevertablePut ops. It may also read rows for other purposes.Since I'm running both
scribe
andlbcd
in ~10GB memory, there is very little left over for filesystem caching. Moreover, there's no way to anticipate what rows of the 100GB RocksDb will be needed.Prefetcher
is currently prefetching the raw block contents fromlbcd
. This zeros out the cost to contactlbcd
for the next block(s). We should be unpacking the blocks, and pre-fetching the RocksDb rows which will (probably) be needed later inadvance_block()
. The row values don't matter, I just want to pull these into the FS cache or RocksDB cache.Data collected with "py-spy top":