turbofish-org / merk

High-performance Merkle key/value store
Apache License 2.0
226 stars 36 forks source link

Polish top-level API #19

Open mappum opened 5 years ago

mappum commented 5 years ago

Not a lot of work has been put into making an ideal top-level API yet, so for a first release I'm just targetting one that's good enough, expecting to break it in a future release.

mappum commented 5 years ago

Relevant comment from @davebryson in #15:

One concern I have is opening rocksdb within Merk. This limits the ability to share the db connection across threads. For example, you may want that db handle available separately in the abci commit callback to save the latest root hash and height to the db for later use in abci info.

davebryson commented 5 years ago

Why is apply_unchecked marked as unsafe? Is this just to alert the user that Batch should be sorted and contain unique keys? If so, why not replace Batch with BtreeMap and remove the additional checks in apply?

mappum commented 5 years ago

Why is apply_unchecked marked as unsafe? Is this just to alert the user that Batch should be sorted and contain unique keys?

Yep, that's correct. If they aren't sorted/unique then the operations will cause UB.

If so, why not replace Batch with BtreeMap and remove the additional checks in apply?

The tree algorithms need to be able to binary search, and efficiently split the batch into two. This is theoretically possible with a b-tree structure although I don't believe it's possible with the built in BTreeMap.

davebryson commented 5 years ago

First thoughts on the signatures for the Merk top-level API:

/// Create / Open
/// Where Database is a trait for storage
pub fn new(db: Arc<dyn Database>) -> Result<Merk> {}

/// Return the root hash
pub fn root_hash(&self) -> Option<Vec<u8>> {}

/// Get a value
pub fn get(&self, key: &[u8]) -> Result<Vec<u8>> {}

/// Visit key/values for a given range...
pub fn iter_range<F: FnMut(Node)>(&self, start: &[u8], end: &[u8], f: &mut F) -> Result<()> {}

/// Apply a batch of Put/Deletes, checking for sort and unique keys
/// I wonder if 'merge' may be better?
pub fn apply(&mut self, batch: &Batch) -> Result<()> {}

/// Apply a batch with no check
pub unsafe fn apply_unchecked(&mut self, batch: &Batch) -> Result<()> {}

/// Prove checking...
pub fn prove(&mut self, query: &[Vec<u8>]) -> Result<Vec<u8>> {}

/// Prove unchecked
pub unsafe fn prove_unchecked(&mut self, query: &[Vec<u8>]) -> Result<Vec<u8>> {}
mappum commented 5 years ago

why not replace Batch with BtreeMap and remove the additional checks in apply

We could possibly make an API for building batches with something like:

db.batch()
  .put(vec![1], vec![1])
  .put(vec![2], vec![2])
  .delete(vec![3], vec![3])
  .commit()?;

This could use a BTreeMap internally to ensure the resulting batch is sorted/unique, and call apply_unchecked.

davebryson commented 5 years ago

Yes. In fact that's what I did with an example app. It's basically a cache wrapper around Merk that uses BTreeMap.

/// On Commit 

/// BtreeMap => Batch
/// working_cache is a BTreeMap.   StoreOp is used by BTreeMap to track dirty values
let mut batch: Vec<(&[u8], TreeOp)> = self
.working_cache
.iter()
.filter_map(|(k, storeop)| match storeop {
   // Only update dirty 'puts'
   StoreOp::Put(true, value) => Some((&k[..], TreeOp::Put(value.to_vec()))),
   StoreOp::Delete => Some((&k[..], TreeOp::Delete)),
   _ => None,
}).collect();

self.state.apply_unchecked(&mut batch).expect("Merk Commit:");