[RPC-Spec-V2] Storage: Add support for storage closest merkle descendant

lexnv commented 1 year ago

Fetch the Merkel value of the provided key when using the closest-descendant-merkle-value param. When the key is not present in the trie, the Merkle value of the closest descendant must be returned.

From spec: https://github.com/paritytech/json-rpc-interface-spec/blob/main/src/api/chainHead_unstable_storage.md

If the type of an item is closest-descendant-merkle-value, then the so-called trie Merkle value of the key can be found in the result. If key doesn't exist in the trie, then the Merkle value of the closest descendant of key (including branch nodes) is provided. If key doesn't have any descendant in the trie, then the result will not contain any relevant item.

// @paritytech/subxt-team

rphmeier commented 1 year ago

(merkle as in Ralph Merkle, not Angela Merkel)

lexnv commented 1 year ago

We could first check if the provided key exists in the trie-db: https://github.com/paritytech/substrate/blob/25d9dd17bf0410682151c2c629fe538287fbc8f0/primitives/state-machine/src/backend.rs#L208-L220

If the key does not exist in storage, we'd need to iterate over the next key available:

https://github.com/paritytech/substrate/blob/25d9dd17bf0410682151c2c629fe538287fbc8f0/primitives/state-machine/src/backend.rs#L222-L230

And finally, we can fetch the proof by providing the key found above

https://github.com/paritytech/substrate/blob/25d9dd17bf0410682151c2c629fe538287fbc8f0/client/api/src/proof_provider.rs#L27-L41

I do have a few questions about this approach since I don't have much context wrt our trie-db implementation:

Is the provided StorageProof the merkle value we wish to return?
Would iterating over the keys vianext_storage_key produce the closest descendant?
Could we eliminate the calls to exists_storage and rely only on read_proof? I believe using the read_proof directly with our key would have a similar effect here. The exists_storage method uses storage_hash under the hood to check if the database returns some value. While the read_proof uses storage, I do wonder if there is a performance hit on using the read_proof directly?

Would love to get some feedback on this 🙏 (@cheme @arkpar @bkchr)

cheme commented 1 year ago

I got no idea what this closest-descendant-merkle-value is meant for. If it is to do query for a given node (not all trie storage design can do that), it would also require the actual depth of the merkle value, otherwhise we cannot get correct key for content bellow. Really not sure why it is needed (kind of mix technical content with storage content).

But to reply:

Is the provided StorageProof the merkle value we wish to return?

The merkle value is the hash of a merkle node (branch, or leaf), no idea if it can include inline nodes though: probably can return the sized encoded node in this case, but depending on the purpose, maybe it is the parent hash that we want???

So the storage proof does include this info, but this is not really how one should access it: producing the storage proof is costy and the info is just something we query while accessing the trie. So this needs to be implemented in the trie crate. Note that without changing the trie crate, one can use triedbnodeiterator to get it: the iterator return all trie nodes so we can get the hash of the node from the parent node after doing a seek operation. The issue is that if we seek the key and the hash is at the key then we can only return the children node (and the hash is from the parent), so I guess it may be easier to do it at trie level. But since I don't know why it is usefull, cannot say if it is worth implementing.

Would iterating over the keys via next_storage_key produce the closest descendant?

Next storage only access storage content, since it is a storage rpc, it seems appropriate, but merkle value is an info that we fetch at trie level.

Note that in practice, using triedb iterator on the block state is more efficient than using next_storage to access content.

But generally our subscription are running over the change of storage value seen after each block processing. These changes are pure storage change (no trie level info).

Getting change in merkle value for subscription (done efficiently, I mean not querying at each block) would mean having it plugged on the trie root processing which is quite not easy to do (we can read the delta of all trie node change but these do not have structural information so no way to say which is the right merkle value unless rebuilding the trie and accessing it which is not as bad as doing query on state but still an unneeded overhead).

So I don't have a good solution (costy or breaking badly the architecture by leaking rpc subscription into block processsing). Ok correct solution would be to attach the key (prefix that we have currently in storage but that I wish we could remove) to the trie delta and pass the trie delta to the rpc subscription layer, as we currrently do with the storage delta).

link to the storage change currently accessible from the rpc subscription: https://github.com/paritytech/substrate/blob/74ae05142849bdf2a816df38e80f419741b10dc6/client/db/src/lib.rs#L822C19-L822C36
the one with the trie content: https://github.com/paritytech/substrate/blob/74ae05142849bdf2a816df38e80f419741b10dc6/client/db/src/lib.rs#L821 Then in the memorydb the key matching can be extract by knowing the memorydb key is "key node" ++ "MerkleValue". But as I mention, we want to get rid of this construction (needed by rocksdb but not by paritydb). And generally passing trie change to rpc subscription really need to have a meaning (it is a clone).

Could we eliminate the calls to exists_storage and rely only on read_proof?

if it is just for query, I would say yes, but I am not sure if a read_proof is needed here, and if IIUC the target is subscription, so there should be no call to exists_storage (just compare subscribed item against each block storage changes).

The exists_storage method uses storage_hash under the hood to check if the database returns some value. While the read_proof uses storage, I do wonder if there is a performance hit on using the read_proof directly?

Using the read proof is better (we access trie content in memory when storage_hash may hit the db), but producing the read proof is more costy in the first place.

cheme commented 1 year ago

Thinking of closest-descendant-merkle-value, it can be use to know if any change happens bellow a given prefix key: if hash did change (and without returning all the changes). If it is the primary use, it would be easier to just return a boolean indicating something did change and directly doable with current subscription code.

(but maybe there is the intention to directly query content from the return merkle value, but this is something that limit how to store the trie (currently we can somehow do it but some future design would break this possibility, but would just mean accessing the key from root which may not be that bad (but then just a boolean is enough)).

bkchr commented 1 year ago

I got no idea what this closest-descendant-merkle-value is meant for.

https://github.com/paritytech/json-rpc-interface-spec/pull/37#discussion_r1205331883 maybe this helps

cheme commented 1 year ago

Thanks, got my replies: https://github.com/paritytech/json-rpc-interface-spec/pull/37#discussion_r1205371379 for inline value consideration and more importantly (from issue description):

The closest-ancestor-merkle-value query type makes it possible to know when the content of a map has changed. It can be seen as similar to hash, except that hash is only the hash of the value while closest-ancestor-merkle-value is the hash of the value plus the hash of all the descendants in the trie. Basically, in order to watch a map, a JSON-RPC client would periodically query the closest-ancestor-merkle-value of the map, then redownload the list of keys if this Merkle value changes

So if the intention is to check for a change, then I think we should just return event with key indicating a change did occur at or under the key. This would make things easier to implement (in the subscription from substrate we just check if one of the changed value from the block starts with this key) and simplier (returning a hash at a different depth than the one queried should be attached with its depth to be potentially used).

jsdw commented 1 year ago

@tomaka, just pinging you as this conversation is relevant re the closest-descendant-merkle-value impl/shape ^

tomaka commented 1 year ago

The closest-descendant-merkle-value isn't meant to be interpreted by the JSON-RPC client (after all, it's a hash most of the time). JSON-RPC clients are supposed to treat it as an opaque value.

I can't just say "an opaque value" in the specification, as the JSON-RPC client might want to disconnect and reconnect and still have the same value. In other words, the meaning of this opaque value had to be defined so that all server implementations return the same.

lexnv commented 1 year ago

Considering the above it would be simpler from the substrate point of view to have an event-based notification regarding storage changes (similar what we have today).

One downside of this would be indeed that clients may disconnect and reconnect. In this window of time, the storage might change without the client knowing it. That could be mitigated by the user fetching-refetching the hash of the value from storage on a fresh boot/disconnection.

I'm not sure how expensive that might be from the light-client perspective. @tomaka would love to hear your thoughts about the event-based storage changes 🙏

tomaka commented 1 year ago

If you're talking about changes to the JSON-RPC API, please let's not discuss this in the Substrate repo. The reason why there's a JSON-RPC API repo is specifically so that everyone can follow discussions, rather than look around for random bits of discussions in Substrate issues like happens often.

lexnv commented 1 year ago

From the spec conversation here:

events might not be communicated by substrate or they might arrive with significant delay or smoldot might be overwhelmed
light-client might disconnect/reconnect and miss the storage-changed notification

This makes me lean towards implementing a hashing ("merkle value hash"), but as mentioned above it might be tricky to implement.

It is my understanding that the https://github.com/paritytech/parity-db/issues/199 will add support for MultiTrees and each individual Trie will have a dedicated RootKey (separate from the current single root key and child keys).

That would imply:

the closest-descendant-merkle-value would be local to the MultiTree the key identifies
chainHead_storage parameter childTrie would be renamed to trieKey to identify the multi trie

childTrie: null for main storage look-ups, or a string containing the hexadecimal-encoded key of the child trie of the "default" namespace.

The following change could keep the RPC spec agnostic of the multi-tree implementation:

trieKey: A hexadecimal-encoded string representing the root key of the trie for which storage look-ups are performed

That makes me wonder:

would we still be able to identify nodes of a multiTrie with prefixed key (root key ++ [node address u64])?
could the trie delta approach described above be extended for the multitrie?

does the NewNode'value of the multi trie identify the merkle hash or its something we could add on that structure?

struct NewNode {
data: Vec<u8>,

// Add a new hash field updated to the top of the trie with every storage change
merkle_value: Hash,

children: Vec<NodeRef>, // May reference the node in the same commit or an existing node.
}

Would love to hear your thoughts on this 🙏

cheme commented 1 year ago

I realize I did not think about:

events might not be communicated by substrate or they might arrive with significant delay or smoldot might be overwhelmed light-client might disconnect/reconnect and miss the storage-changed notification

thus I did propose to just emit a boolean event when something under the prefix did change, but that do not cover these two cases.

But if the light client is missing events or disconnecting would querying a storage proof at disconnected block and a storage proof at reconnected block be enough. I mean the second one would be needed to observe the change that we miss and the first one might end up being less costly than multiple 32 byte hash update notifications.

would we still be able to identify nodes of a multiTrie with prefixed key (root key ++ [node address u64])? could the trie delta approach described above be extended for the multitrie?

I don't remember too well all the design. cc @arkpar (to sum-up: the change notification is only looking at key value change of each imported blocks for notification, but we could also pass the trie nodes block modificatio and use the fact that the key is build from "trienodekeyprefix"++"trienodehash" to return the new trienodehash on change for the first trienodekeyprefix of a prefix we listen at).

But removing this key prefixing scheme is something that would be good for the trie crate (requires removing deps on rocksdb first).

arkpar commented 1 year ago

MultiTrees feature in parity-db does not really change anything regarding three traversal. It just optimizes the underlying node storage and makes it aware of the tree structure.

For the closest-descendant-merkle-value it looks like we'd need to support it in trie_db::TrieDB first. Geting the node by prefix is not realy feasible in parity-db as it does not use prefixes at all. And for child keys it gets even more complicated. So traversing the tree looks like the most straightforward solution here.

paritytech / substrate

[RPC-Spec-V2] Storage: Add support for storage closest merkle descendant #14550