cosmos / ibc-rs

Rust implementation of the Inter-Blockchain Communication (IBC) protocol.
Apache License 2.0

Context management and message batching in ICS26 (routing) #27

Open thanethomson opened 3 years ago

thanethomson commented 3 years ago

This is not really a bug or a feature, but more of a set of observations and questions regarding the current ICS26 implementation.

Looking at the way the context is cloned in ICS26, modified, and then discarded in case of an error, and given my current (admittedly limited) understanding of how the code is meant to be used, it looks to me like:

  1. This approach may not work, or will be difficult to implement, for applications like Basecoin which use an Arc<RwLock<_>> under the hood to reference their in-memory state, because the clone just clones the pointer and not the underlying state (see the sketch after this list). The same is true for applications built on an underlying storage mechanism that doesn't actually clone the stored data when clone is called (e.g. it may just clone a handle to an open database in the filesystem).
  2. Even if the underlying state could be cloned, it's not clear whether it'd be an efficient process at scale. In Basecoin, which uses a simple HashMap, each clone is O(n) in both space and time (n = number of entries in the HashMap). With high transaction throughput of IBC packets, we may end up grinding the whole network to a halt while we do load after load of clones. This would naturally be much worse if the clone were cloning a filesystem-based database (not sure if that's relevant/planned?).
  3. This won't work if there could be multiple simultaneous calls to the ics26_routing::handler::deliver method from different threads for the same context.
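
To illustrate point 1, here's a minimal sketch (the HashMap state here is a stand-in for Basecoin's actual store):

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

fn main() {
    let state = Arc::new(RwLock::new(HashMap::<String, u64>::new()));

    let interim = state.clone(); // clones the pointer only, not the map
    interim.write().unwrap().insert("counter".into(), 1);
    drop(interim); // "discarding" the clone...

    // ...rolls nothing back: the write is still visible through the original.
    assert_eq!(state.read().unwrap().get("counter"), Some(&1));
}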

Do these observations seem accurate and/or relevant?

If so, it looks like we're trying to use a Clone impl to facilitate an operation that probably deserves its own semantics (i.e. message batch processing). Once we've established its operational constraints, we can come up with an appropriate message batch handling architecture to fulfill those constraints.



hu55a1n1 commented 3 years ago

These are some very relevant and interesting observations, thank you for writing this up! Here are my two cents on the topic:

I agree that using the ics26_routing::handler::deliver() function will be difficult, both because it enforces the context clone and because it expects a Vec of messages as input. I would personally prefer to use the lower-level dispatch() function directly instead.

One of the expectations of the Ics26Context trait is that implementing types have access to the underlying KV store via self. So I can imagine that implementations would look similar to this:

pub struct Ibc<S: Store> {
    pub store: S,           // A subset of the global provable KV store
    // fields for private store
}

impl<S: Store> Ics26Context for Ibc<S> {}

And a thread-safe Store implementation might look like this:

#[derive(Clone)]
struct SharedSubStore<S: Store, P: Path> {
    store: Arc<RwLock<S>>, 
    // ...
}

impl<S: Store, P: Path> Store for SharedSubStore<S, P> { /* ... */ }

As you rightly noted, cloning the shared pointer here would not be useful. One way to circumvent this problem would be to use a write-ahead logging technique: we add an operational log (say Vec<(Path, Vec<u8>)>) to the SharedSubStore struct. It is important that writes to this log are infallible and recorded in order. The Store::set() implementation for SharedSubStore must only append new writes to this log and shouldn't touch the original store, and Store::commit() can then apply the logged writes to the actual store. This way we get a crude implementation of batch transactions that preserves ACID properties like atomicity and durability; a more robust implementation would ideally write the log to stable storage. In a multi-threaded context, each clone of the Ibc module would have its own log, and the ModuleManager would be responsible for calling Store::commit(). Store implementations could also use DB libraries and defer to their batch-processing support; sled.rs, for example, has a nice API for batch transactions.
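
A rough sketch of this idea (the Store trait itself is elided; set()/get()/commit() are shown as inherent methods, and Path is a stand-in type):

use std::collections::HashMap;
use std::sync::{Arc, RwLock};

type Path = String; // stand-in for the real Path type

#[derive(Clone)]
struct SharedSubStore {
    store: Arc<RwLock<HashMap<Path, Vec<u8>>>>, // shared, committed state
    log: Vec<(Path, Vec<u8>)>,                  // per-clone operational log
}

impl SharedSubStore {
    // Infallible, ordered append; the underlying store is untouched.
    fn set(&mut self, path: Path, value: Vec<u8>) {
        self.log.push((path, value));
    }

    // Reads consult the log first, so a batch sees its own pending writes.
    fn get(&self, path: &Path) -> Option<Vec<u8>> {
        if let Some((_, v)) = self.log.iter().rev().find(|(p, _)| p == path) {
            return Some(v.clone());
        }
        self.store.read().unwrap().get(path).cloned()
    }

    // Apply the buffered writes to the actual store, in order.
    fn commit(&mut self) {
        let mut store = self.store.write().unwrap();
        for (path, value) in self.log.drain(..) {
            store.insert(path, value);
        }
    }
}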

Another way to approach this problem would be to use a copy-on-write approach (such as shadow-paging) which is similar to the clone pattern in our current ics26_routing::handler::deliver() implementation. But as you mentioned, the clone must be of a single page or a subset of the entire Ibc state. I am not sure how this can be implemented in our case (i.e. basecoin). libmdbx is a high-performance implementation of an embedded DB that relies on shadow-paging.
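
For illustration, a copy-on-write store could be approximated in memory with a persistent map that uses structural sharing, so that taking a working copy is cheap and discarding a failed copy is free (a sketch assuming the im crate; Path is again a stand-in):

use im::HashMap; // persistent map with structural sharing

type Path = String; // stand-in for the real Path type

// Cloning shares structure instead of deep-copying the whole map, so there
// is no O(n) clone cost, and a failed batch can simply drop its working
// copy while the original remains untouched.
#[derive(Clone, Default)]
struct CowStore {
    data: HashMap<Path, Vec<u8>>,
}

impl CowStore {
    fn set(&mut self, path: Path, value: Vec<u8>) {
        self.data.insert(path, value);
    }

    fn get(&self, path: &Path) -> Option<&Vec<u8>> {
        self.data.get(path)
    }
}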

I agree that we should formally establish the operational constraints and come up with a model that is flexible and has better support for batch processing and reverts. One thing to note is that the ibc-rs handlers must be no_std-compatible, so we should be careful not to use anything from std::sync there.

adizere commented 3 years ago

This is indeed a problem; all the points you raised are accurate, @thanethomson!

In addition to the suggestions from @hu55a1n1, I would add one more idea for a solution, namely transactional semantics. This is in the same spirit as the "message batch processing" semantics: from an abstraction point of view, this interface is developer-friendly and quite powerful, though I'm not sure whether it's overkill, nor whether we'd want to permit concurrency with it. The interface would replace the following

let mut ctx_interim = ctx.clone();

// Process ...

// No error has surfaced, so we now apply the changes permanently to the original context.
*ctx = ctx_interim;
Ok(res)

with:

ctx.tx_start();

// Process ...

// in case of error...
ctx.tx_abort(); // will not apply any side-effect of any operation since `tx_start` was called
return Err(...);

// No error has surfaced, so we now apply the changes permanently to the original context.
ctx.tx_commit();
Ok(res)
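
One way this interface might be expressed as a trait (a sketch only; these names are illustrative, not an existing ibc-rs API):

trait TxContext {
    /// Begin buffering writes.
    fn tx_start(&mut self);
    /// Discard everything buffered since `tx_start`.
    fn tx_abort(&mut self);
    /// Atomically apply everything buffered since `tx_start`.
    fn tx_commit(&mut self);
}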

romac commented 3 years ago

Interesting to see that we've pretty much all converged independently towards the same semantics.

Here's the solution I came up with the other day after this issue was first raised during the IBC meeting (I think):

// dummy type definitions
type Any = ();
type IbcEvent = ();
type Error = ();

fn deliver<Tx, Ctx>(tx: &mut Tx, messages: Vec<Any>) -> Result<Vec<IbcEvent>, Error>
where
    Tx: Transactional<Ctx>,
    Ctx: Context,
{
    tx.transaction(|ctx| {
        // A buffer for all the events, to be used as return value.
        let mut res: Vec<IbcEvent> = vec![];

        for any_msg in messages {
            ctx.increment_counter();
            res.push(());
        }

        // No error has surfaced, so we now apply the changes permanently to the original context.
        Ok(res)
    })
}

trait Transactional<T> {
    fn transaction<F, A, E>(&mut self, f: F) -> Result<A, E>
    where
        F: FnOnce(&mut T) -> Result<A, E>;
}

trait Context {
    fn increment_counter(&mut self);
}

struct MockTx {
    ctx: MockCtx,
}

#[derive(Clone)]
struct MockCtx {
    counter: u32,
}

impl Context for MockCtx {
    fn increment_counter(&mut self) {
        self.counter += 1;
    }
}

impl Transactional<MockCtx> for MockTx {
    fn transaction<F, A, E>(&mut self, f: F) -> Result<A, E>
    where
        F: FnOnce(&mut MockCtx) -> Result<A, E>,
    {
        let mut ctx = self.ctx.clone();

        match f(&mut ctx) {
            Ok(a) => {
                self.ctx = ctx;
                Ok(a)
            }
            Err(e) => Err(e),
        }
    }
}

struct LogTx;

impl LogTx {
    fn persist_counter(&mut self, _counter: u32) {
        // Perform a side-effectful operation here, which cannot be rolled back.
    }
}

// dummy operation type
type Op = ();

#[derive(Clone, Default)]
struct LogCtx {
    log: Vec<Op>,
}

impl Context for LogCtx {
    fn increment_counter(&mut self) {
        self.log.push(());
    }
}

impl Transactional<LogCtx> for LogTx {
    fn transaction<F, A, E>(&mut self, f: F) -> Result<A, E>
    where
        F: FnOnce(&mut LogCtx) -> Result<A, E>,
    {
        let mut ctx = LogCtx::default();

        match f(&mut ctx) {
            Ok(a) => {
                self.persist_counter(ctx.log.len() as u32);
                Ok(a)
            }
            Err(e) => Err(e),
        }
    }
}
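
A quick usage sketch of the above (illustrative only): a successful batch commits all of its changes at once, while a failing one leaves the context untouched.

fn main() {
    let mut tx = MockTx { ctx: MockCtx { counter: 0 } };

    // Both messages are processed without error, so both increments stick.
    let res: Result<Vec<IbcEvent>, Error> = deliver(&mut tx, vec![(), ()]);
    assert!(res.is_ok());
    assert_eq!(tx.ctx.counter, 2);

    // A failing batch is rolled back: the counter is unchanged.
    let failed: Result<(), ()> = tx.transaction(|ctx: &mut MockCtx| {
        ctx.increment_counter();
        Err(())
    });
    assert!(failed.is_err());
    assert_eq!(tx.ctx.counter, 2);
}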

thanethomson commented 3 years ago

@romac your solution looks great!

Will it play well with async? I'd imagine that some async ABCI frameworks will need to access the filesystem or a database and will probably do so via async libraries. I'm also not sure what'll happen when the call chain looks like:

  1. The ABCI deliver_tx method will be called by the ABCI server, which will be async in some cases
  2. deliver_tx will call the IBC deliver method in question
  3. deliver will, in turn, potentially need to facilitate an async call (see the sketch below).
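
For instance, an async variant of the trait might have to look something like the following (just a rough sketch, assuming async fn in traits is available; to sidestep the lifetime issues of FnOnce(&mut T) -> impl Future, the closure takes ownership of a working copy and hands it back on success):

use std::future::Future;

trait AsyncTransactional<T> {
    async fn transaction<F, Fut, A, E>(&mut self, f: F) -> Result<A, E>
    where
        F: FnOnce(T) -> Fut,
        Fut: Future<Output = Result<(T, A), E>>;
}

impl AsyncTransactional<MockCtx> for MockTx {
    async fn transaction<F, Fut, A, E>(&mut self, f: F) -> Result<A, E>
    where
        F: FnOnce(MockCtx) -> Fut,
        Fut: Future<Output = Result<(MockCtx, A), E>>,
    {
        // The closure may await on async I/O; the working copy is committed
        // only if it succeeds, otherwise the original context stays untouched.
        let (ctx, a) = f(self.ctx.clone()).await?;
        self.ctx = ctx;
        Ok(a)
    }
}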

hu55a1n1 commented 2 years ago

I think there is an additional problem with our ics26_routing::deliver() implementation beyond what @thanethomson describes above: the routing module is expected to maintain a lookup table of modules and provide callback functionality so that other modules can react to the receipt of IBC datagrams (this hasn't been implemented yet, IIUC). This means the routing module can trigger state changes in other modules via callbacks, and our current strategy of cloning the context and discarding it on error doesn't account for state changes in those other modules.

I propose that we defer the responsibility of context management to the application/SDK developer altogether, because the surrounding system is more likely to have better means (such as an immutable and snapshottable DB/KV-store) of performing a system-wide revert of a failed Tx execution with multiple messages.

Regarding async: AFAIK multi-threaded executors cannot guarantee execution order, so that might introduce non-determinism.
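
To sketch what that might look like on the application side (illustrative names only, not an existing API):

trait Snapshottable {
    type Snapshot;
    /// Capture the current state (cheap for structurally-shared stores).
    fn snapshot(&self) -> Self::Snapshot;
    /// Roll the whole store back to a previously captured snapshot,
    /// reverting every module's changes from the failed Tx at once.
    fn restore(&mut self, snapshot: Self::Snapshot);
}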