paritytech / substrate-archive

Blockchain Indexing Engine
GNU General Public License v3.0

Speed up Storage Indexing #52

Closed insipx closed 4 years ago

insipx commented 4 years ago

There are some optimizations that could be made to speed up the current storage indexing.

Ideas:

From RocksDB Wiki:

There is a lot of complexity in the underlying RocksDB implementation to lookup a key. The complexity results in a lot of computational overhead, mainly due to cache misses when probing bloom filters, virtual function call dispatches, key comparisons and IO. Users that need to lookup many keys in order to process an application level request end up calling Get() in a loop to read the required KVs. By providing a MultiGet() API that accepts a batch of keys, it is possible for RocksDB to make the lookup more CPU efficient by reducing the number of virtual function calls and pipelining cache misses. Furthermore, latency can be reduced by doing IO in parallel.

Let us consider the case of a workload with good locality of reference. Successive point lookups in such a workload are likely to repeatedly access the same SST files and index/data blocks. For such workloads, MultiGet provides the following optimizations -

https://github.com/facebook/rocksdb/wiki/MultiGet-Performance

This seems to fit the use case of storage queries particularly well: we first collect a large set of keys, and then look up the data for each key individually.

Blocked on:

insipx commented 4 years ago

Solved by re-executing the block and obtaining the storage changes that way. We no longer query RocksDB for stored changes. This proved to be a faster way to index storage, and less invasive than modifying the RocksDB stack starting with rust-rocksdb. It was also unclear whether MultiGet would actually be faster than individual Get() calls in a loop.