warp-contracts / warp

An implementation of the Arweave SmartWeave smart contracts protocol.
MIT License
159 stars 44 forks source link

feat: Add functions to the Kv API in order to query ranges of keys and/or values #345

Closed noomly closed 1 year ago

noomly commented 1 year ago

Exposing the keys and the values functions that leveldb provide would already be very useful in Kv. I imagine it'd not be too hard to write an equivalent feature for other key-value stores (like lmdb for example) that could fit into the same API.

Exposing the iterator api would be extremely useful for kv databases that are organized in a hierarchical way. Take this key structure for example:

settings.contractName
settings.operators
tokens.{}.ticker
tokens.{}.balances.{}

{} being any kind of string, here ids or addresses. In this example, it might be useful to get every balances for a specific token in order to calculate its current supply. This would be achieved pretty easily with the values function, like so:

const tokenPath = `tokens.aRandomTokenId.balances.`;
const supply = (
    await db
        .values({
            gte: tokenPath,
            lte: `${tokenPath}\xff`,
        })
        .all()
).reduce((acc, value) => acc + parseInt(value), 0);
ppedziwiatr commented 1 year ago

I guess the lmdb-js has similar APIs (https://github.com/kriszyp/lmdb-js#dbgetvalueskey-options-rangeoptions-iterableany , https://github.com/kriszyp/lmdb-js#dbgetkeysoptions-rangeoptions-iterableany ) so it should be doable...

@Tadeuchi

Tadeuchi commented 1 year ago

Hey @noomly, I'm trying to implement the range queries for kv storage. I created some draft and I was wondering if you could take a look and let me know if this is something that you had in mind.

You can find my PR above, and also some simple interaction example that uses the range options in kvstorage https://github.com/warp-contracts/warp/pull/364/files#diff-b89ef4e5c5624ccac706296f669964e1b2f31b2a66dd49537ebffcd11780610f

noomly commented 1 year ago

Hey @Tadeuchi, the keys and entries method are exactly like what I had in mind, they should be enough to get started. Thank you!

I'm just a bit curious about the usage of the dbSeparator prepended with the sortKey, are different versions of the db stored inside a single KV for a contract?

ppedziwiatr commented 1 year ago

Hey @Tadeuchi, the keys and entries method are exactly like what I had in mind, they should be enough to get started. Thank you!

I'm just a bit curious about the usage of the dbSeparator prepended with the sortKey, are different versions of the db stored inside a single KV for a contract?

In case of a LevelDB - each contract has its own db file, each 'business' key in contract is modeled as a separate sublevel (https://github.com/Level/level#sublevel--dbsublevelname-options), each sublevel stores a map sortKey -> value.

Tadeuchi commented 1 year ago

Unfortunately including keys and values from undergoing interaction in range queries is not that simple… I’ll explain the whole issue.

Since AbstractLevel does not have transactions we created something of our own that simulates transaction behavior.

During interaction evaluation all the kv operations are not actually stored in the kvStorage, but rather they are aggregated inside a single batch. If the interaction is successful we execute the batch and update entries. If something goes wrong we discard all the operations in the batch.

Example: Let’s say in the first interaction we save keys A and B. This interaction is successful, the batch is executed and the kv storage contains keys: A and B. During the next interaction we save key C (it actuallys just goes to the batch), then we try to iterate through all the keys. Kv storage has only two keys so we have to combine the abstract level range functionality with our batch in order to return all the keys… This is very problematic, especially if we consider all the options from the range query like limit, reverse, gte etc.

So now we have two scenarios…

  1. Range functionality will only contain keys already stored in the kvStorage, i.e. for fully evaluated interactions. Range functionality will not contain keys stored during currently evaluated interaction.
  2. We tried a different approach. Auto committing all the keys and values and keeping some registry of operations performed by undergoing interaction. If something goes wrong we use the registry to delete the current interaction entries. This approach allows use of range based operations for all keys and values, including currently used. Unfortunately the rollback is less secure than in the previous case.

Obviously both scenarios have drawbacks. @noomly maybe you have some thoughts? Would range operation only for committed keys be enough?

noomly commented 1 year ago

To summarize, Warp has to handle transaction support by itself which it does by having an internal batch representation. If I remember well Warp currently allows to get non-committed keys/values via this internal batch representation via simple gets but the issue you're raising is it's going to be too difficult/inefficient to support it for the more complex keys and entries queries. Did I get this right?

I didn't foresee this issue unfortunately. For the first scenario, we could make the point that a running interaction could maintain a local state and not require that uncommitted keys/values has to be available through keys/entries queries. Although there would be many issues with this approach; after an inter-contract call, if the caller needs to fetch the updated state of the callee, it would necessarily have to use get. Also, get giving different results than keys and entries could cause a lot of confusion to newcomers.

As for the second scenario, I understand the security implications and it might not be the most efficient either. It's unfortunate that LevelDB doesn't support transactions out of the box... And there doesn't seem to be an obvious choice for a KV database available as a single API both on servers & web environments that provides this feature...

I feel like the tradeoffs coming from the two scenarios are too important and would probably cause too much issues in the future as people start relying on keys/entries. The best solutions would probably be to:

Both of these solutions would require a non-trivial amount of work and maintenance so they might not be conceivable short-term. It's probably best to postpone the implementation of these functions for now as, as I've I stated above, I'm not sure the tradeoffs are worth it in the current state...

ppedziwiatr commented 1 year ago

Turned out to be much more tricky than we've initally expected...anyway, implemented.