turbofish-org / merk

High-performance Merkle key/value store
Apache License 2.0
226 stars 36 forks source link

merk.commit doesn't delete a value from storage #14

Closed davebryson closed 5 years ago

davebryson commented 5 years ago

This is based on the master branch.

It appears that a TreeOp::Delete will remove a node from the tree, but doesn't actually delete the node from storage on commit

For example, this test should pass, but fails on the last 2 asserts:

    let db = Arc::new(TemporaryDB::new());
    let mut tree = Merk::new(db.clone()).expect("Merk tree");
    let mut batch: Vec<TreeBatchEntry> = vec![
        (b"key", TreeOp::Put(b"value".to_vec())),
        (b"key2", TreeOp::Put(b"value2".to_vec())),
        (b"key3", TreeOp::Put(b"value3".to_vec())),
    ];

    tree.apply(&mut batch).unwrap();
    assert!(tree.get(b"key2").is_ok()); // Key2 is there as it should be

    // Delete key2
    let mut batch2: Vec<TreeBatchEntry> = vec![(b"key2", TreeOp::Delete)];
    tree.apply(&mut batch2).unwrap();

    // You'd expect 'key2' to be gone, but this fails - it's still there
    assert!(tree.get(b"key2").is_err());
    assert_eq!(None, db.get(b"key2"))

merk.commit doesn't db.delete() it only adds or overwrites (put) values in the DB from tree.modified().

A possible side effect of this (haven't tested) is that a series of deletes may not change the root hash

All the delete related unit tests seem to miss this case.

Maybe tree.modified should include the TreeOps vs values?

mappum commented 5 years ago

This shouldn't change the hash of the tree, but does use unnecessary disk space.

However, this won't be an issue after the rewrite happening in op-refactor, where we'll coalesce all the deleted keys into a Vec<Vec<u8>> then delete them when committing (or maybe just get them from the Batch, whichever seems to be faster).