f-o-a-m / kepler

A Haskell framework that facilitates writing ABCI applications
https://kepler.dev
Apache License 2.0
34 stars 10 forks source link

Iteration in Store Map #254

Open UnitylChaos opened 3 years ago

UnitylChaos commented 3 years ago

There is currently no way to get all the keys or values from a Store Map. A simple workaround is to use both a Map and a Var, to track the set of keys used in the Map. This is rather inefficient, both for the extra storage and since extracting all the values from the map requires repeated lookup operations. Plus the extra overhead/mistake potential of keeping the key Set synced with the Map.

It should be possible to directly support this. IAVL is designed specifically with this use case in mind, keys are stored sequentially and there are operations to iterate over all keys/values in a tree or within a key range. This behavior is exposed through the gRPC layer with the List procedure.

I would propose adding methods such as: keys and toMap/toList to the Map Store interface which would need to be backed by an iterate method in the RawStore/IAVLStore to get all keys/values under a particular prefix/substore.

I think this could also be used to improve the efficiency of operations like foldl in List and Array since they are currently using repeated lookups to access sequentially stored values.

martyall commented 3 years ago

I'm 100% in favor of this, I actually have a branch update-iavl-client to bring the client library up to speed with the current grpc api offered by iavl. I failed because of some dumb ci reasons involving docker that i never figured out, but also because the iavl client tests kind of suck and should probably be rewritten. I think the first step is to do this so we can even get the api we need to make the call.

A word about storage prefixes. As far as I can tell, we are doing the same thing that cosmos does (if they do anything consistent at all). This means that keys are simply prefixed several times. For example the an account in the auth module has two prefixes:

then the account itself. This means that for example the account 0xdeaf is actually stored as:

"auth" <> "accountsMap" <> 0xdeaf, where everything is converted to Bytestring in a consistent way, e.g. unicode, RawKey class, etc.

I'm assuming that for example, in order to get all of the accounts, you could do something like List from

"auth" <> "accountsMap" <> 0x00..00 to "auth" <> "accountsMap" <> 0xff..ff ?

UnitylChaos commented 3 years ago

That is also my understanding of how prefixing works. There is a little bit of a concern about how to specify the entire range though. In your example it works because it's a fixed length bytestring, so you can specify the last one, but for more general string keys I think we would have to somehow figure out / use the next prefix as the end point. Ie List from "auth" <> "accountsMap" to "bank" or, the next store within "auth" if one existed. This seems like a pretty bad way to do it...

I'm now realizing there may also be an issue with name conflicts between modules and their stores though. For example if you had a module named "authaccounts" with a store called "Map" it would end up sharing a prefix. I had sorta assumed actual prefixed keys were constructed with "/" or some sort of separator which wasn't allowed in the names. But I guess being bytestrings that isn't really possible...

To solve the first problem I had been trying to think through a way to construct an "endcap" which could be a key which came after any of the keys within the store. ie: "auth" <> "accountsMap" <> 0x00...00 "auth" <> "accountsMap" <> 0x00...01 ... "auth" <> "accountsMapENDCAP"

But there is no such value. Since a key of all 0xff is supposed to be valid, there's no ENDCAP value which would come after that (except a longer key of all 0xff, but that only solves the problem for fixed sized keys again)

If user created keys were actually strictly strings instead of bytestrings, then you could construct a separator (like 0x00) and also an endcap (0xff) which would solve both these problems...

I'm not really clear on how Cosmos-SDK currently handles iterating entire substores / prefix ranges, I'll try to look into that.

martyall commented 3 years ago

I'm now realizing there may also be an issue with name conflicts between modules and their stores though. For example if you had a module named "authaccounts" with a store called "Map" it would end up sharing a prefix.

I think for the moment we can stick to "don't do anything dumb", which is surely what the cosmos sdk is doing. I can imagine in the future a runtime check for conflicting prefixes, but I don't think it's something thats worth trying to control at the type level.

It's worth asking the IAVL people if you could just get all the keys that begin with a certain prefix, but I don't know enough myself to say if that's possible

UnitylChaos commented 3 years ago

Okay, so what the SDK does seems reasonable to me.

https://github.com/cosmos/cosmos-sdk/blob/7fc5e3e6ab71fc6643176773e301d87b729ce549/store/types/utils.go#L68-L91

It just increments the last character by 1, so "abcd" -> "abce". Unless it ends with 0xFF, in which case it repeatedly drops the last char until it isn't 0xFF, and increments whatever remains, so "abcdFFFF" -> "abcde". If it runs out of characters, ie the prefix is 0xFF..FF then the next key would just be an empty string.