citahub / cita_trie

Rust implementation of the Modified Patricia Tree (aka Trie).
Apache License 2.0
71 stars 28 forks source link

RocksDb implementation #22

Closed davebryson closed 5 years ago

davebryson commented 5 years ago

First thanks for the great work on this! An easier to understand, Rust based merkle tree is much needed!

I've created a simple RocksDb store implementation. You can find it here: cita-trie-db .

Couple thoughts on the current cita-trie implementation:

u2 commented 5 years ago

Nice suggestion!

First thanks for the great work on this! An easier to understand, Rust based merkle tree is much needed!

I've created a simple RocksDb store implementation. You can find it here: cita-trie-db .

Great work!

Couple thoughts on the current cita-trie implementation:

  • Consider renaming the DB Trait to something else (maybe KvDB). This might reduce the chance of it clashing with other crates. For example, I had to alias RocksDB::DB.

Yes. I think either KeyValueDB or KVDB is ok.

  • trie.commit() is private outside the crate. You can call commit via trie.root(), but it might be useful to be more explicit from the API.

Agree, it's better to add more comments to for it.

  • trie.commit() calls self.db.insert/remove. It'd be nice to be able to batch these calls with RockDb. Maybe the DB trait can expose a commit(...) that would allow this.

Agree. We have left some TODO in the code: https://github.com/cryptape/cita-trie/blob/master/src/trie.rs#L387, we will consider it soon.

cryptowen commented 5 years ago

First thanks for the great work on this! An easier to understand, Rust based merkle tree is much needed!

I've created a simple RocksDb store implementation. You can find it here: cita-trie-db .

Couple thoughts on the current cita-trie implementation:

  • Consider renaming the DB Trait to something else (maybe KvDB). This might reduce the chance of it clashing with other crates. For example, I had to alias RocksDB::DB.
  • trie.commit() is private outside the crate. You can call commit via trie.root(), but it might be useful to be more explicit from the API.
  • trie.commit() calls self.db.insert/remove. It'd be nice to be able to batch these calls with RockDb. Maybe the DB trait can expose a commit(...) that would allow this.

Thank you for your great work!

Just reviewed your code, and got one question: why you return an error instead of none in the get function?

https://github.com/davebryson/cita-trie-db/blob/master/src/lib.rs#L63-L69

yejiayu commented 5 years ago

Thank you so much for your suggestion!

Consider renaming the DB Trait to something else (maybe KvDB). This might reduce the chance of it clashing with other crates. For example, I had to alias RocksDB::DB.

I don't think it is necessary to change it. Isn't the alias a solution to this problem?

trie.commit() calls self.db.insert/remove. It'd be nice to be able to batch these calls with RockDb. Maybe the DB trait can expose a commit(...) that would allow this.

My idea about batching is to addbatch_insert andbatch_remove to the DB trait and set a default implementation.

pub trait DB: Send + Sync + Debug {
    type Error: Error;

    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, Self::Error>;
    fn insert(&mut self, key: &[u8], value: &[u8]) -> Result<(), Self::Error>;
    fn contains(&self, key: &[u8]) -> Result<bool, Self::Error>;
    fn remove(&mut self, key: &[u8]) -> Result<(), Self::Error>;

    fn batch_insert(&mut self, keys: &[&[u8]], values: &[&[u8]]) -> Result<(), Self::Error> {
        for i in 0..keys.len() {
            self.insert(keys[i], values[i])?;
        };
        Ok(())
    }

    fn batch_remove(&mut self, keys: &[&[u8]]) -> Result<(), Self::Error> {
        for key in keys {
            self.remove(key)?;
        };
        Ok(())
    }
}

Then I will use batch_xx to replace insert and remove from commit.

Such modifications do not introduce any version compatibility issues, if you want to achieve true batch processing, you can replace the default implementation.

u2 commented 5 years ago

Thank you so much for your suggestion!

Consider renaming the DB Trait to something else (maybe KvDB). This might reduce the chance of it clashing with other crates. For example, I had to alias RocksDB::DB.

I don't think it is necessary to change it. Isn't the alias a solution to this problem?

trie.commit() calls self.db.insert/remove. It'd be nice to be able to batch these calls with RockDb. Maybe the DB trait can expose a commit(...) that would allow this.

My idea about batching is to addbatch_insert andbatch_remove to the DB trait and set a default implementation.

pub trait DB: Send + Sync + Debug {
    type Error: Error;

    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, Self::Error>;
    fn insert(&mut self, key: &[u8], value: &[u8]) -> Result<(), Self::Error>;
    fn contains(&self, key: &[u8]) -> Result<bool, Self::Error>;
    fn remove(&mut self, key: &[u8]) -> Result<(), Self::Error>;

    fn batch_insert(&mut self, keys: &[&[u8]], values: &[&[u8]]) -> Result<(), Self::Error> {
        for i in 0..keys.len() {
            self.insert(keys[i], values[i])?;
        };
        Ok(())
    }

    fn batch_remove(&mut self, keys: &[&[u8]]) -> Result<(), Self::Error> {
        for key in keys {
            self.remove(key)?;
        };
        Ok(())
    }
}

Then I will use batch_xx to replace insert and remove from commit.

Such modifications do not introduce any version compatibility issues.

  1. The batch is not batch_insert or batch_delete, it's batch operations, combining with insert/delete.

https://github.com/paritytech/parity-common/blob/master/kvdb/src/lib.rs#L38 https://github.com/nervosnetwork/ckb/blob/develop/db/src/batch.rs

  1. For some KeyValueDB, there is only the batch write interface.

https://github.com/nervosnetwork/rust-rocksdb/blob/master/src/db.rs#L73

  1. We should also support col for the write.
davebryson commented 5 years ago

Yes that sounds like a good approach.

On a side note, do you have plans to add Proof support to the trie in the future?

Thanks again.

yejiayu commented 5 years ago

We should also support col for the write.

We don't just use it for rocksdb, I need to think about it.

yejiayu commented 5 years ago

On a side note, do you have plans to add Proof support to the trie in the future?

If our business needs it.

Before that, we also welcome others to help with this part of the work.

u2 commented 5 years ago

We should also support col for the write.

We don't just use it for rocksdb, I need to think about it.

col is Option type.

u2 commented 5 years ago
  • Consider renaming the DB Trait to something else (maybe KvDB). This might reduce the chance of it clashing with other crates. For example, I had to alias RocksDB::DB.

Yes. I think either KeyValueDB or KVDB is ok.

I reconsider it, and I think we should not change the DB to KeyValueDB or KVDB. The DB under the trie is something like HashMap, it's more abstract. Most of the time, this DB is a memory db. If we want to save it to some real KVDB like rocksdb, we should get all the keys from the DB, and then save it to KVDB. The trait for rocksdb or other kvdbs should be named KeyValueDB, but not for the trie.