Open Ben-PH opened 1 year ago
Here is an e.g. prototype I started
// TODO: Make this more in line with the rocks-db interface
// Eg, have a HashMap<Address, 3-tuple>, options, etc...
pub struct BatchGetter<'a> {
address: &'a Address,
balance: bool,
bytecode: bool,
datastore: Option<Vec<u8>>,
}
impl<'a> BatchGetter<'a> {
pub fn new(address: &'a Address, ty: LedgerSubEntry) -> Self {
let res = Self {
address,
balance: Default::default(),
bytecode: Default::default(),
datastore: Default::default(),
};
res.with_sub_entry(ty)
}
pub fn with_sub_entry(mut self, ty: LedgerSubEntry) -> Self {
match ty {
LedgerSubEntry::Balance => self.balance = true,
LedgerSubEntry::Bytecode => self.bytecode = true,
LedgerSubEntry::Datastore(hash) => self.datastore = Some(hash),
};
self
}
pub fn with_balance(mut self) -> Self {
self.balance = true;
self
}
pub fn with_bytecode(mut self) -> Self {
self.bytecode = true;
self
}
pub fn with_datastore(mut self, hash: Vec<u8>) -> Self {
self.datastore = Some(hash);
self
}
}
@Eitu33 What are your thoughts?
@Ben-PH Batching is nice but using RocksDB iterators would be better here. If we take get_datastore_keys
for example:
/// Get every key of the datastore for a given address.
///
/// # Returns
/// A `BTreeSet` of the datastore keys
pub fn get_datastore_keys(&self, addr: &Address) -> BTreeSet<Vec<u8>> {
let handle = self.db.cf_handle(LEDGER_CF).expect(CF_ERROR);
let mut opt = ReadOptions::default();
opt.set_iterate_upper_bound(end_prefix(data_prefix!(addr)).unwrap());
self.db
.iterator_cf_opt(
handle,
opt,
IteratorMode::From(data_prefix!(addr), Direction::Forward),
)
.flatten()
.map(|(key, _)| key.split_at(ADDRESS_SIZE_BYTES + 1).1.to_vec())
.collect()
}
This function creates an iterator over all the keys that start with the datastore prefix of the given address. A datastore key being addr + datastore_identifier_byte + key
the datastore prefix is addr + datastore_identifier_byte
. Balance and bytecode follow the same convention in a simpler way since they represent only one disk ledger entry, which results in their key being addr + balance_identifier_byte
and addr + bytecode_identifier_byte + key
.
What I would imagine is a function that retrieves all the sub entries of an address by creating an iterator with the range [addr..end_prefix(addr)]
.
Also something I noticed a while back is that RocksDB rust crate has the possibility to specify the range directly with set_iterate_range
, which is better than what was made at the time (see code snippet above). After using this option we would just have to set the IteratorMode
to Start
since the instantiated iterator will only cover the address entry.
Is your feature request related to a problem? Please describe. There are some parts of the code that do seperate read operations on the ledger DB where they could possibly batch the reads.
There could also be a benefit from pinning the data, reducing memory allocations.
Without having benchmarks, it's tough to say how valuable time spent doing this is.
Describe the solution you'd like One example is this code snippet:
could instead be written something like this:
Additional context Add any other context or screenshots about the feature request here.