jamsocket / aper

A Rust data structure library built on state machines.
https://aper.dev
MIT License
299 stars 12 forks source link

could StateMachine::apply work without cloning the entire state for each mutation? #40

Closed httnn closed 1 year ago

httnn commented 1 year ago

this is more of a question than an issue, so please let me know if you'd want this moved elsewhere.

i've been looking for a data structure in Rust that would allow for remote replication and easier undo/redo systems. aper seems like it might fit the bill pretty well! however, from glancing over the documentation and examples, i have just one concern: StateMachine::apply requires to clone the entire state for each mutation. this is probably fine for smaller states like in the guide and examples, but i could imagine this becoming a bottleneck in a bigger state struct. do you think it could be possible for StateMachine::apply to take mut self instead of &self? then it could just return Self back to the caller and cloning wouldn't be necessary. alternatively, perhaps could it take &mut self and simply mutate the state directly without returning Self?

paulgb commented 1 year ago

Hi Max, good question!

It would be possible, and it actually worked like that originally. One of the problems I had with that approach is that it's hard to handle conflicts -- ideally, if a conflict occurs, we want to guarantee that the data structure has not been modified at all, so that transitions are "atomic". But if we allow mutating the state directly in apply, there is no way to do that unless we have a copy to revert to. So either way, we would be making a copy.

One solution, if we assume that conflicts are rare, might be to only occasionally snapshot the current state, and keep a log of transitions between snapshots. In case of a conflict, we would replay the transitions to recreate the state just before the conflicting transition.

The approach I chose instead is to lean more heavily on immutable.rs types, which allows two copies of the same data structure to share the same data in memory for the parts that overlap. This means that clone is O(1). The tradeoff is that read/write operations are more expensive, but not worse than O(log n).

A nice thing about this approach is that we have to copy quite a bit anyway to support optimistic updates (since we need to keep an "optimistic" and "server-sent" state), so by making copy cheap, those necessary copies become cheaper too.

httnn commented 1 year ago

thanks for the quick and thorough reply! 🙌

using aper with im-rs makes a lot of sense indeed and i also overlooked that issue with conflicts. they are not really a reality in my use-case tho, and i also don't require the optimistic state as i can always wait for my "server" (another process communicating via IPC) to respond with possible changes. having said that, perhaps choosing aper would be over-engineering for my case and maybe i could just get by with a more vanilla Rust solution perhaps using im-rs.

paulgb commented 1 year ago

Yes, a vanilla Rust solution may be better in your case, especially if you don't need the websocket server component.

If you wanted to take a hybrid approach but still use some of the Aper data types as building blocks, that's also an option, because they are designed to work independent of the sync protocol. They're basically just regular data structures except with a serializable representation of mutation intents.