FuelLabs / fuel-core

Rust full node implementation of the Fuel v2 protocol.
Other
58.02k stars 2.78k forks source link

Add support for multi-get operation in the Database #2344

Open xgreenx opened 1 week ago

xgreenx commented 1 week ago

Oveview

In several places of the codebase we have cases when we need to get multiple values at once. It is faster to do via multi-get operation.

Also, we want to use it for https://github.com/FuelLabs/fuel-core/issues/2023

Definition of done

use multijet instead of iteration and single value.

Implementation details

We want to add support for it into the fuel_storage::StorageInpect trait and use it inside of fuel-vm and fuel-tx.

netrome commented 2 days ago

I have been looking into the storage traits now and there is a conflict between how I'd ideally implement the multi get method and our use of trait objects.

The problem

I'd like to use iterators for the multi get operation, which will make the interface more flexible and efficient as opposed to taking a slice of key references. I'm imagining something like the following example:

    fn get_multi<'a>(
        &'a self,
        _keys: impl IntoIterator<Item = impl AsRef<Type::Key>>,
    ) -> impl Iterator<Item = Result<Option<Cow<'a, Type::OwnedValue>>, Self::Error>> + 'a
    where
        <Type as Mappable>::OwnedValue: 'a,
        <Self as StorageInspect<Type>>::Error: 'a,
    {
        None.into_iter() // TODO
    }

which I've right now placed in the StorageInspect trait as in the ticket description, but it could be it's own extension trait as well.

However, the use of generics in this method means that the trait is no longer object safe which conflicts with our current usage of it in a lot of places.

Possible solutions

1. Don't use generics. A simple way to work around this issue is to not use iterators but instead restrict the trait to take a slice reference as input and return a Vec as output. This can result in some unnecessary heap allocations and requires some more boilerplate to use the method in most cases.

2. Remove redundant trait objects. Another way forward is to separate the multi get operation to an extension trait, and only use multi get in contexts where we don't use trait objects. For example InterpreterStorage::contract_state_range is one function we might[^1] want to use multi-get where no trait objects currently block this implementation. However, from the graphql API when we want to use this function, we'll ultimately access it through the following types:

/// The on-chain view of the database used by the [`ReadView`] to fetch on-chain data.
pub type OnChainView = Arc<dyn OnChainDatabase>;
/// The off-chain view of the database used by the [`ReadView`] to fetch off-chain data.
pub type OffChainView = Arc<dyn OffChainDatabase>;

If we could make OnChainView, OffChainView into concrete types[^2] instead of trait objects we wouldn't need to require object safety for the multi get operation.

[^1]: It is probably more efficient to use a storage-native iterator here such as https://docs.rs/rocksdb/0.22.0/rocksdb/struct.DBCommon.html#method.iterator_cf_opt [^2]: I've used enums in the past for this. The enum_dispatch crate can be used to reduce some boilerplate when using this pattern, but the pattern is still usable without the crate.

How to proceed

I'm leaning on investigating the feasibility of the second solution. I'd love to receive input if there is any other angle we should consider approaching this from, or any problems with either approach.

netrome commented 2 days ago

Another place where we want to use multi get is in OnChainIterableKeyValueView::get_full_block. Here we face the same problem since OnChainIterableKeyValueView wraps an Arc<dyn IterableStore<Column = Column> + Sync + Send> trait object.

netrome commented 1 day ago

We've discussed this in the team now. Since we have pending refactors for the GraphQL API planned, which are lower priority than getting the multi-get in place, we'll stick with object safe implementations of this functionality. For the object safe implementation, we prefer boxed iterators over slices/vectors.