Provide storage read method for enumerator return

hal0x2328 commented 5 years ago

Returning an enumerator type is a powerful way to allow developers to filter results of a query of a very large array, but as a standalone return value, it lacks the ability to select and include related data, leading to more complex contracts with dozens of custom operations, or inefficient queries requiring multiple requests and large amounts of unnecessary data transfer.

For example, if I have an NFT contract with 100K tokens each representing a unique physical property deed, and I want to query the contract for all deeds in a certain zip code, I have to write an internal contract function to return that result, one for each possible permutation of other query selectors (e.g. property acreage, building type, etc). Either that or return the entire list of tokens IDs and query their properties one at a time which is wildly inefficient.

But, if I could simply call the tokens or tokensOf methods and get an enumerator, while iterating over the set I could request related data (e.g. token.properties.zipcode) in the same contract (or even an external contract) and use it as a selector, returning just the desired result set - but only if I can make a storage read while running outside of the contract, e.g.

Storage.Read(scripthash, key);

or

ctx = GetReadOnlyContext(scripthash);`
Storage.Get(ctx, key);

or some similar method to get read-only access to a specific contract's storage.

igormcoelho commented 5 years ago

From my understanding of NFT discussion, I think that returning the enumerator is already the wisest choice, like you guys pointed out. In terms of data filtering, I think it's more related to the Enumerator than to Storage itself. If you want to process data off-chain, that's easy, you get all data displayed by RPC node when "deserializing" the returned enumerator. So you could return the whole thing, and filter it off-chain. If you want to filter data on the contract, perhaps it would be the case to have another interop Enumerator.Filter, that receives some sort of "lambda function", and applies over your enumerator, returning another enumerator. Could you clarify if you need off-chain or on-chain processing Hal?

hal0x2328 commented 5 years ago

In terms of data filtering, I think it's more related to the Enumerator than to Storage itself. If you want to process data off-chain, that's easy, you get all data displayed by RPC node when "deserializing" the returned enumerator. So you could return the whole thing, and filter it off-chain.

The problem is when the data you want isn't contained in that enumerator but instead in other related storage keys. From my interpretation the tokens operation in NEP-11 just returns token IDs and nothing else. So you could write another function like tokensProperties to return an enumerator containing all the properties of all tokens, but what if the properties you want to use as a query selector are in an external contract?

If you want to filter data on the contract, perhaps it would be the case to have another interop Enumerator.Filter, that receives some sort of "lambda function", and applies over your enumerator, returning another enumerator.

That is an interesting idea but I don't know how practical it is for all the currently supported languages.

Could you clarify if you need off-chain or on-chain processing Hal?

It would be useful either way. It would all be running in the VM for maximum efficiency, regardless if it was done inside an appcall from another contract or a post-appcall function.

shargon commented 5 years ago

With the last json feature, it could be easier if you store your data in a json format

{
"balance":0,
"lastUpdate":"2019-01-01'
}

We could have a syscall like this....

var iterator = Storage.FindJson(u => u.balance > 0);

igormcoelho commented 4 years ago

@hal0x2328 I think that this direction may be possible: https://github.com/neo-project/neo/issues/914

If we create some delegate function, to pass this as a lambda, and also provide a ReadOnlyStorageContext... user may possibly to any crazy things he wants. Yesterday I talked to a really clever guy from language design field, and he emphasized to me that these kind of iterators shouldn't update data.. otherwise you could create some infinite loops. Let's see how this evolves.

lock9 commented 4 years ago

@igormcoelho is this something we can start working on? @hal0x2328 how important is this for your dApp? In a scale 0 to 10?

hal0x2328 commented 4 years ago

It's not critical for my dApp because I use a neo-cli plugin providing a custom RPC method to iterate over storage keys. But it could be very useful in the future for devs wanting to use the stock neo-cli RPC service.

lock9 commented 4 years ago

So what you mean is that "it is important" but you found another way to do it?

hal0x2328 commented 4 years ago

Exactly.

igormcoelho commented 4 years ago

It's probably good for onchain usages of NFT tokens. And for other things that keep arising... looks promising.

lock9 commented 4 years ago

@neo-project/core should we include this in the roadmap? This seems important. I'm not sure about "performance impacts". Not sure if we should add stuff that can be done without smart contracts, to the blockchain. It seems a good feature, let's evaluate the side effects.

shargon commented 4 years ago

is a good feature, but it must have a cost acording to the performance impact

lock9 commented 4 years ago

This is the same case as the json serialization / deserialization. We can either add it to NEO and make users pay for this feature, or we have them using offchain tools.

It makes more sense, as a product, that this is added in neo and people pay for this usage.

lock9 commented 4 years ago

@hal0x2328 is it important for you to know transactions that are happening in the same block? Let's say you are running 100 calls to your smart contract, does the order of these transactions impact the final result?

hal0x2328 commented 4 years ago

@hal0x2328 is it important for you to know transactions that are happening in the same block? Let's say you are running 100 calls to your smart contract, does the order of these transactions impact the final result?

The post-invocation code which is iterating over the enumerator is only reading the storage state of the node at that point in time, so the results could be different if the same invocation is ran again later, the same as if you had just returned an array on the stack. This is simply about being able to filter the results and "join" values from other storage keys.

igormcoelho commented 4 years ago

Hal, offchain usage is enough for you? I agree, we would need an immutable storage snapshot (or a clone) to iterate over. A MPT would help too, as it is time synced.

Could you put (offchain) label on this title? So we discuss offchain solutions here, and onchain on the other thread.

hal0x2328 commented 4 years ago

I'm not sure why an immutable storage snapshot is needed, it's already possible to get a read-only storage context object in the VM. The iterator is still running in the VM, it's just that the currently executing script value has transitioned from the smart contract back to the entry script post-appcall so we aren't able to request the storage context at this point - but it's still in memory.

erikzhang commented 4 years ago

var entries = Storage.Find("Hello");
return entries.Where(p => p.Length < 10);

We need to implement something like this.

lock9 commented 4 years ago

@erikzhang what does "Storage.Find" does? What is the difference between "Find" and "Get"? It is used for prefixes only? If we have a "Potato" key, Find("P") will return "Potato" while Get("P") returns null?

igormcoelho commented 4 years ago

@hal0x2328 that's the whole point of dividing discussion in offchain an onchain. For offchain, an immutable storage snapshot is needed, since it may change during operation. For onchain, a readonlystorage is desired to prevent updates during Where processing (and no snapshot is needed, you're right, since we are all on same block).

A helpful thing to implement this is like Shargon said, a Json parser, so we can apply this Where on a implicit conversion to json from the object, applying some delegated filtering... let's keep discussing, things will be clearer.

shargon commented 4 years ago

If we have a "Potato" key, Find("P") will return "Potato" while Get("P") returns null?

Yes @lock9

hal0x2328 commented 4 years ago

@hal0x2328 that's the whole point of dividing discussion in offchain an onchain. For offchain, an immutable storage snapshot is needed, since it may change during operation.

Ok, maybe that's where I'm getting lost. How would you iterate over an enumerator returned by a smart contract offchain? Is this related to some Neo 3.0 code I haven't seen yet?

steven1227 commented 4 years ago

var entries = Storage.Find("Hello");
return entries.Where(p => p.Length < 10);
We need to implement something like this.

Is this feature going to be implemented ? @erikzhang

erikzhang commented 4 years ago

Is this feature going to be implemented?

Yes, if it is feasible.

igormcoelho commented 4 years ago

Guys, pehaps we have a solution (a safe one)! Let's try to discuss here a little bit: https://github.com/neo-project/neo-vm/issues/190

igormcoelho commented 4 years ago

@erikzhang I'll reopen this. Finally doable with https://github.com/neo-project/neo-vm/issues/190

neo-project / neo

Provide storage read method for enumerator return #880