xlc opened 1 year ago
The `chainHead_storage` method with query type `closestDescendantMerkleValue` returns an opaque hash.
Comparing two results of the same method call can be used to determine whether any storage change happened below the provided prefix. With this approach, users are responsible for making the storage call themselves, as opposed to the legacy RPC, which pushes notifications back to the user.
Would you also be interested in the RPC providing the key that was changed / added?
Yeah I still need a way to figure out the added/removed/modified keys.
Implementing this with the current API might be complicated, although not impossible.

I would start by constructing a `closestDescendantMerkleValue` query. Then, construct a `descendantsHashes` query to obtain all the keys under the provided prefix. Because the current API does not offer support for your use-case at the moment, you'd need to make another `closestDescendantMerkleValue` query to ensure that keys were not added in the meanwhile. If the second `closestDescendantMerkleValue` is different from the first one, you'd need to repeat this process. Whenever a different hash is reported by `closestDescendantMerkleValue`, another `descendantsHashes` query must follow to compare and detect any changed keys.
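The compare-and-detect step at the end boils down to diffing two `descendantsHashes` snapshots, i.e. two key-to-hash maps taken before and after the merkle value changed. A minimal sketch in TypeScript, assuming both snapshots have already been fetched (the types and function name here are illustrative, not part of any client API):

```typescript
// A snapshot of one `descendantsHashes` result: storage key -> hash.
type Snapshot = Map<string, string>;

interface Diff {
  added: string[];
  removed: string[];
  modified: string[];
}

// Keys only in `prev` were removed, keys only in `next` were added,
// and keys present in both with different hashes were modified.
function diffSnapshots(prev: Snapshot, next: Snapshot): Diff {
  const added: string[] = [];
  const removed: string[] = [];
  const modified: string[] = [];
  for (const [key, hash] of next) {
    const old = prev.get(key);
    if (old === undefined) added.push(key);
    else if (old !== hash) modified.push(key);
  }
  for (const key of prev.keys()) {
    if (!next.has(key)) removed.push(key);
  }
  return { added, removed, modified };
}
```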
Even though the API supports batch requests via the `items` parameter, the order in which the RPC server handles requests is not imposed by the spec. This leads to at least 3 RPC calls for the initialization routine, then at least one periodic call to verify the merkle value.

Considering the fact that `descendantsHashes` queries have support for pagination, you'd also need to drive the responses with the `chainHead_continue` method.
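Driving a paginated operation could look roughly like the loop below. The event names are modeled on the `chainHead_storage` operation notifications (`operationStorageItems`, `operationWaitingForContinue`, `operationStorageDone`), but `nextEvent` and `continueOperation` are stand-ins for whatever transport the client uses, not real API names:

```typescript
// Events emitted for one storage operation. `operationWaitingForContinue`
// signals that chainHead_continue must be called to receive more items.
type StorageEvent =
  | { event: "operationStorageItems"; items: { key: string; hash: string }[] }
  | { event: "operationWaitingForContinue" }
  | { event: "operationStorageDone" };

// Collect every item of a paginated operation, invoking `continueOperation`
// (i.e. chainHead_continue) whenever the server pauses the stream.
async function collectItems(
  nextEvent: () => Promise<StorageEvent>,
  continueOperation: () => Promise<void>,
): Promise<{ key: string; hash: string }[]> {
  const items: { key: string; hash: string }[] = [];
  for (;;) {
    const ev = await nextEvent();
    if (ev.event === "operationStorageItems") items.push(...ev.items);
    else if (ev.event === "operationWaitingForContinue") await continueOperation();
    else return items; // operationStorageDone
  }
}
```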
This sounds indeed complicated enough and might not be feasible for smaller prefix keys, or prefix keys that have multiple storage entries below them.
Considering the light client, I don't have enough context, but I would expect this to be even more difficult to implement on the server side.
I would be interested to hear more about your use-case, since it sounds like a complicated thing to implement. How are you handling this use-case at the moment? Using the legacy APIs?
For reference, @tomaka might have more insights into this.
Here are some of my use cases.
1) Subscribe to all storage changes for new blocks. This is useful for indexers. The legacy trace RPC is currently used for this.
2) Monitor a storage map and perform various checks/tasks, e.g. assert that the total issuance is equal to the sum of all the balances. We currently use events to trigger a re-iteration of the whole map for checking. This is super inefficient.
3) Subscribe to a storage map and update the UI when a new item is added. Right now the dApp just can't handle it and requires users to do a refresh to see the new item.
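Use case 2 is essentially an invariant check over the entire map. A minimal sketch of what is being re-computed on every trigger (the function and names are illustrative; iterating `balances` is the part that requires re-reading the whole storage map today):

```typescript
// Invariant for use case 2: the total issuance must equal the sum of
// every account balance. `balances` stands in for a fully iterated
// storage map, which is the expensive part to obtain over RPC.
function checkTotalIssuance(
  totalIssuance: bigint,
  balances: Map<string, bigint>,
): boolean {
  let sum = 0n;
  for (const balance of balances.values()) sum += balance;
  return sum === totalIssuance;
}
```

With per-key change notifications, the sum could instead be maintained incrementally from deltas rather than recomputed from scratch.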
FWIW I'm working on integrating a new API into `@polkadot-api/client`. This API is designed to monitor changes in a storage map. The goal is to make it easy to automatically update the UI whenever there are additions, deletions, and/or modifications.
However, this task is turning out to be more complex than I first thought.
One important thing to note is that the updates provided by this new API won't necessarily cover every single change as it happens on every finalized block. Ideally, it would inform the user of changes after each finalized block. But, there's no guarantee of this happening every time. This means sometimes the updates might include changes from some previous blocks all at once, rather than just the most recent one. So, if a user gets an update at block 'x', it might include all the changes that happened since block 'x-y'. This limitation makes the feature less suitable for those who need to track every single change in real-time for each finalized block.
Additionally, there's another aspect to consider: the more storage map "watchers" a user sets up, the more they might experience delays in getting updates.
I've thought about introducing an overload in the API that would force the library to evaluate the deltas in every single finalized block. However, this approach has a significant downside. It greatly increases the chance of triggering a 'stop' event.
In conclusion, I find myself agreeing with @xlc's viewpoint: it would be really beneficial to have an API specifically designed to identify deltas for a given "partial trie path". This would be particularly useful even if it could only provide the changes for a given pinned block.
Otherwise, I would greatly appreciate some advice on the best approach to querying data changes. In the background, my current process involves the following steps:

Detecting changes: First, I use `closestDescendantMerkleValue` to check if there have been any changes on the storage map.

Identifying specific changes: Next, I use `descendantsHashes` to figure out which specific entries have been added, deleted, or updated. After identifying these changes, I need to actually retrieve the updated values.

Here's where I face a dilemma: what's the most efficient way to retrieve these values? Should I use `descendantsValues` to get them all at once, or should I make a query with many `value` items in the storage request?

The challenge is that each `value` query uses one of the limited "operation slots" available. On one hand, if the storage map contains many large items, using `descendantsValues` to fetch everything at once seems excessive and could be a waste of bandwidth. On the other hand, each individual `value` query also consumes these operation slots.

Currently, I'm using `descendantsValues` only to pull the initial list of values, and then a list of `value` items to query the changes using just one request (which takes several operation slots). But I think there's room for optimization here. Perhaps in the future, I could develop a heuristic to decide when it's more efficient to use `descendantsValues`.
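Such a heuristic could start as something very simple: compare the number of changed keys against the operation slots you are willing to spend, and against how much of the map changed. A sketch with made-up thresholds (nothing here is prescribed by the spec or by any client library):

```typescript
// Decide how to fetch updated values once a change has been detected.
// `changedKeys`: entries that need re-fetching.
// `totalKeys`:   total entries under the prefix.
// `freeSlots`:   operation slots the client can spare right now.
// Thresholds are illustrative assumptions, not spec-mandated values.
type FetchStrategy = "descendantsValues" | "individualValues";

function pickFetchStrategy(
  changedKeys: number,
  totalKeys: number,
  freeSlots: number,
): FetchStrategy {
  // Not enough slots for one `value` query per changed key:
  // fall back to a single bulk `descendantsValues` query.
  if (changedKeys > freeSlots) return "descendantsValues";
  // If most of the map changed anyway, one bulk query wastes little
  // bandwidth and keeps slot usage at one.
  if (changedKeys > totalKeys / 2) return "descendantsValues";
  return "individualValues";
}
```

A real heuristic would probably also weigh average value size, since that is what drives the bandwidth cost of `descendantsValues`.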
Like @xlc mentioned, there really should be a simpler and more efficient method to handle this process.
cc: @tomaka
Previous issue https://github.com/paritytech/substrate/issues/5790
I would like to subscribe by key prefix. This will allow me to subscribe to new entries in a map / double map.