turbofish-org / merk

High-performance Merkle key/value store
Apache License 2.0
226 stars 36 forks source link

Range queries #40

Closed mappum closed 3 years ago

mappum commented 4 years ago

The proof format already supports range queries, but there isn't currently a way to use them. We will need the prover to include boundary keys to create a verifiable range proof, then verify the boundaries and ensure there are no missing nodes within the range on the client.

prove and verify_proof take in lists of keys, but we'll replace that with a Query type which contains a combination of unique keys and ranges to be included in the proof, or checked on verify.

One question will be whether verify_proof will still need to know about the list of queried keys (currently required to know which values to return and which keys to check for absence proofs). We could instead just verify the proof against the root hash, then have a type (e.g. Proof) which makes it easier to pull out data (by key or range), can check for arbitrary absence proofs, and can verify arbitrary ranges (either by keeping track of all contiguous KVNode ranges on the initial verify pass, or by going back into the proof data and checking them when a method is called).

mappum commented 3 years ago

This is nearing completion in the query branch.

All that remains is a type for consuming proofs, which is first compared against the root hash, then progressively verified for data as it is accessed (through methods like proof.get(key), proof.range(x..y), etc.), to make it possible to verify proofs without first knowing the exact set of keys/ranges.

ethanfrey commented 3 years ago

I'm curious if you have looked into ics23 compatibility for your proofs

mappum commented 3 years ago

I'm curious if you have looked into ics23 compatibility for your proofs

Just looked into it, is https://github.com/cosmos/ibc/tree/master/spec/core/ics-023-vector-commitments the up-to-date spec for ICS23? I don't see anything that matches the logic in your implementation around inner/leaf specs, operators, etc.

Just from looking at your implementation, I'm not sure it will be trivial to build specs that fit into that current pattern but I feel like I'm missing some context so I'm not sure. But as far as the ICS23 document I linked above, the interface seems easy to connect.

ethanfrey commented 3 years ago

Yeah, the ICS specs are so generic that it is impossible to implement, without making a lot of constraints on the actual functions and types. They seemed to leave all engineering to other teams and the actual formats are never specified anywhere. The only way to figure out the actual ics20 events was to run a chain and query the event logs. Yet these are essential for compatibility.

I would like some feedback on my ics23 implementation. I did attempt to make it generic enough to cover merk, but it was like a year ago that I last touched it (and 18 months ago I wrote most of this). Maybe we can do this later.

(There is no support for range proofs persay, but batches of any set of proofs and a generic compression algorithm. We do have helpers to determine if a batch is a range. But no fancy/custom range compression techniques)