Closed insipx closed 4 years ago
Solved by re-executing the block and collecting storage changes that way. We no longer query for storage from stored changes in RocksDB. This proved to be a faster way to index storage, and less intensive than modifying the RocksDB stack starting with rust-rocksdb. It was also unclear whether MultiGet would actually offer improved speeds over gets via iteration.
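To make the re-execution idea concrete, here is a minimal sketch of collecting storage changes by running a block against a write-recording overlay. It is a toy model, not Substrate's API: `Overlay`, `Extrinsic`, `execute_block`, `bump_counter` and `clear_flag` are hypothetical names, and an "extrinsic" is assumed to be nothing more than a function that mutates state.

```rust
use std::collections::BTreeMap;

type Key = Vec<u8>;
type Value = Vec<u8>;
/// A storage change: `Some(value)` for a write, `None` for a deletion.
type Change = Option<Value>;

/// Toy write-recording overlay on top of a read-only parent state.
struct Overlay<'a> {
    parent: &'a BTreeMap<Key, Value>,
    changes: BTreeMap<Key, Change>,
}

impl<'a> Overlay<'a> {
    fn new(parent: &'a BTreeMap<Key, Value>) -> Self {
        Overlay { parent, changes: BTreeMap::new() }
    }

    /// Reads see pending changes first, then fall back to the parent state.
    fn get(&self, key: &[u8]) -> Option<Value> {
        match self.changes.get(key) {
            Some(change) => change.clone(),
            None => self.parent.get(key).cloned(),
        }
    }

    fn set(&mut self, key: &[u8], value: &[u8]) {
        self.changes.insert(key.to_vec(), Some(value.to_vec()));
    }

    fn remove(&mut self, key: &[u8]) {
        self.changes.insert(key.to_vec(), None);
    }
}

/// Hypothetical stand-in for an extrinsic: any function that mutates state.
type Extrinsic = fn(&mut Overlay<'_>);

/// Re-executes a "block" against the parent state and returns every change it made.
fn execute_block(parent: &BTreeMap<Key, Value>, block: &[Extrinsic]) -> BTreeMap<Key, Change> {
    let mut overlay = Overlay::new(parent);
    for extrinsic in block {
        extrinsic(&mut overlay);
    }
    overlay.changes
}

/// Example extrinsic: increments a one-byte counter (missing counter counts as 0).
fn bump_counter(state: &mut Overlay<'_>) {
    let current = state.get(b"counter").and_then(|v| v.first().copied()).unwrap_or(0);
    state.set(b"counter", &[current + 1]);
}

/// Example extrinsic: deletes a key.
fn clear_flag(state: &mut Overlay<'_>) {
    state.remove(b"flag");
}
```

The change set falls out of execution itself, so nothing has to be read back from the database afterwards; the parent state is only borrowed and stays untouched.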
There are some optimizations which can be done in order to speed up current storage indexing.
Ideas:
- Could subscribe to storage in conjunction with `query_storage` from the client.
- Rework `query_storage`
  - [Currently working on] Don't use the Substrate Client to query for storage. Instead, create a thin wrapper around the State Backend, `ReadOnlyDatabase`, acting directly with the RocksDB Secondary Instance. This requires research around how Substrate chooses keys/prefixes in RocksDB.
- `MultiGet`. Async RocksDB Gets have the potential to significantly decrease the latency of requests.

From RocksDB Wiki:
There is a lot of complexity in the underlying RocksDB implementation to lookup a key. The complexity results in a lot of computational overhead, mainly due to cache misses when probing bloom filters, virtual function call dispatches, key comparisons and IO. Users that need to lookup many keys in order to process an application level request end up calling Get() in a loop to read the required KVs. By providing a MultiGet() API that accepts a batch of keys, it is possible for RocksDB to make the lookup more CPU efficient by reducing the number of virtual function calls and pipelining cache misses. Furthermore, latency can be reduced by doing IO in parallel.
Let us consider the case of a workload with good locality of reference. Successive point lookups in such a workload are likely to repeatedly access the same SST files and index/data blocks. For such workloads, MultiGet provides the following optimizations -
https://github.com/facebook/rocksdb/wiki/MultiGet-Performance
This seems to fit the use case of storage queries particularly well. We fetch a large number of keys and then look up the data for each key individually.
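The difference between calling a point lookup in a loop and issuing one batched lookup can be sketched with a toy store. This is only an illustrative model, not RocksDB's implementation or the rust-rocksdb API: `BlockStore`, its fixed-size "blocks", and the `block_reads` counter are all assumptions made up for the example, with the counter standing in for the repeated index/data block probing the wiki describes.

```rust
use std::cell::Cell;

type Pair = (Vec<u8>, Vec<u8>);

/// Toy store: sorted key/value pairs split into fixed-size "blocks".
struct BlockStore {
    blocks: Vec<Vec<Pair>>,   // blocks are in key order, each block is sorted
    block_reads: Cell<usize>, // counts simulated block probes
}

impl BlockStore {
    fn new(mut pairs: Vec<Pair>, block_size: usize) -> Self {
        assert!(block_size > 0, "block_size must be non-zero");
        pairs.sort();
        let blocks = pairs.chunks(block_size).map(|c| c.to_vec()).collect();
        BlockStore { blocks, block_reads: Cell::new(0) }
    }

    /// Index of the first block whose last key is >= `key`, if any.
    fn locate(&self, key: &[u8]) -> Option<usize> {
        let idx = self
            .blocks
            .partition_point(|b| b.last().map_or(true, |(k, _)| k.as_slice() < key));
        if idx < self.blocks.len() { Some(idx) } else { None }
    }

    fn search(block: &[Pair], key: &[u8]) -> Option<Vec<u8>> {
        block
            .binary_search_by(|(k, _)| k.as_slice().cmp(key))
            .ok()
            .map(|i| block[i].1.clone())
    }

    /// Point lookup: probes one block per call.
    fn get(&self, key: &[u8]) -> Option<Vec<u8>> {
        let idx = self.locate(key)?;
        self.block_reads.set(self.block_reads.get() + 1);
        Self::search(&self.blocks[idx], key)
    }

    /// Batched lookup: groups keys by block so each block is probed once per batch.
    /// Results are returned in the caller's key order.
    fn multi_get(&self, keys: &[&[u8]]) -> Vec<Option<Vec<u8>>> {
        let mut out = vec![None; keys.len()];
        let mut located: Vec<(usize, usize)> = keys
            .iter()
            .enumerate()
            .filter_map(|(i, k)| self.locate(k).map(|b| (b, i)))
            .collect();
        located.sort();
        let mut last_block = None;
        for (b, i) in located {
            if last_block != Some(b) {
                self.block_reads.set(self.block_reads.get() + 1);
                last_block = Some(b);
            }
            out[i] = Self::search(&self.blocks[b], keys[i]);
        }
        out
    }
}
```

In this model, keys that fall into the same block cost one probe per batch instead of one probe per key, which is the locality effect the quoted passage refers to. Whether the real `MultiGet` gives a comparable gain for storage queries would still need to be measured.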
Blocked on:
- `rust-rocksdb` doesn't yet wrap `MultiGet`, so modifying the rust-rocksdb wrapper, and then the kvdb-rocksdb Parity wrapper over rust-rocksdb, is required.
- `get_iter` instead, since `Iterator->Next` in RocksDB executes in constant time
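The iterator alternative amounts to one seek to a key prefix followed by sequential steps, rather than a separate lookup per key. Below is a minimal sketch of that access pattern using a standard-library `BTreeMap` as a stand-in for an ordered key-value store. It does not use the kvdb or rust-rocksdb API; `iter_prefix` is a hypothetical helper, and the example keys in the usage note are made up.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Yields every entry whose key starts with `prefix`:
/// one seek to the first candidate key, then sequential steps until the prefix no longer matches.
fn iter_prefix<'a>(
    db: &'a BTreeMap<Vec<u8>, Vec<u8>>,
    prefix: &'a [u8],
) -> impl Iterator<Item = (&'a Vec<u8>, &'a Vec<u8>)> + 'a {
    db.range::<[u8], _>((Bound::Included(prefix), Bound::Unbounded))
        .take_while(move |(k, _)| k.starts_with(prefix))
}
```

For example, with keys `balances:alice`, `balances:bob` and `system:number` in the map, `iter_prefix(&db, b"balances:")` yields the two `balances:` entries in key order and stops at the first key outside the prefix. This only helps when the wanted keys share a prefix; scattered keys would still need individual lookups.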