kamu-data / kamu-cli

Next-generation decentralized data lakehouse and a multi-party stream processing network
https://kamu.dev

Implement batch loading of event sourcing aggregates (Flows, Tasks) #850

Open zaychenko-sergei opened 1 month ago

zaychenko-sergei commented 1 month ago

Currently we do not support batch loading of flows & tasks, and instead load them in sequential loops like the one below:

            // TODO: implement batch loading
            for flow_id in relevant_flow_ids {
                let flow = Flow::load(flow_id, self.flow_event_store.as_ref())
                    .await
                    .int_err()?;
                yield flow.into();
            }

Given a set of keys (IDs in our case), it should be possible to load all related events in one SQL query, in chronological order. The event_sourcing crate could then post-process the loaded collection of events and reconstruct multiple aggregates in one call.
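A minimal sketch of the post-processing step, assuming the store already returned one chronologically ordered batch of events for many flow IDs (the `StoredEvent` and `Flow` types here are simplified stand-ins, not the real kamu types):

```rust
use std::collections::BTreeMap;

// Hypothetical stored-event shape. In kamu the batch would come from
// the flow event store via a single query along the lines of
// `... WHERE flow_id IN (...) ORDER BY event_id`.
#[derive(Debug, Clone)]
struct StoredEvent {
    flow_id: u64,
    payload: String,
}

// Simplified aggregate: replays events in the order they arrive.
#[derive(Debug, Default)]
struct Flow {
    history: Vec<String>,
}

impl Flow {
    fn apply(&mut self, event: &StoredEvent) {
        self.history.push(event.payload.clone());
    }
}

// Bucket the globally ordered batch by aggregate ID and fold each
// bucket into a Flow. Per-aggregate chronological order is preserved
// because the input batch is already globally ordered.
fn load_multi(events: &[StoredEvent]) -> BTreeMap<u64, Flow> {
    let mut flows: BTreeMap<u64, Flow> = BTreeMap::new();
    for event in events {
        flows.entry(event.flow_id).or_default().apply(event);
    }
    flows
}

fn main() {
    let batch = vec![
        StoredEvent { flow_id: 1, payload: "created".into() },
        StoredEvent { flow_id: 2, payload: "created".into() },
        StoredEvent { flow_id: 1, payload: "finished".into() },
    ];
    let flows = load_multi(&batch);
    assert_eq!(flows.len(), 2);
    assert_eq!(flows[&1].history, vec!["created", "finished"]);
}
```

This trades N roundtrips (one `Flow::load` per ID) for a single query plus an in-memory grouping pass.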

This should reduce database roundtrips, especially when listing flows:

[screenshot: trace showing sequential per-flow database queries]

s373r commented 1 month ago

Cool, tracing is the best visual way to see that an optimization is needed.

Yes, we knew this before, but being able to see the need right on the graph is superb!