thlorenz / rid

Rust integrated Dart framework providing an easy way to build Flutter apps with Rust.

[exploration] Memory Access Synchronization in Rid #2

Closed thlorenz closed 3 years ago

thlorenz commented 3 years ago

@Yatekii rightfully raised some concerns regarding memory management across Dart/Rust, so I'll attempt to explain my thoughts and plans in this regard here.

Consider this somewhat of a rough draft mainly to serve discussions. It will probably be cleaned up into a technical documentation section once all is fleshed out.

Memory Access Synchronization in Rid

At this point rid assumes everything runs single threaded and none of the examples hold on to references of data across a tick of the Dart/Flutter event loop.

However this will not be true forever as we move into more realistic examples.

The main problem is that once we pass a pointer to a struct back to Dart, and Dart later passes it back to Rust in order to gain (in some cases) mutable access to it, we have totally escaped the memory access safety that the Rust compiler provides.

A solution that mitigates some of the problems would be to pass back a fully constructed struct instead of a pointer. This would be instantiated immediately on the Dart side and avoid later roundtrips trying to access data that now might be out of sync. I considered this but didn't like the performance implications of fetching all data of a given struct if not all is necessarily needed. I explained this in a bit more detail.

At any rate this still wouldn't solve all synchronization issues.

Assuming single threaded message processing in Rust

Let's start with the simpler case and point out some problems as well as possible solutions.

Model owned Data

Given the following model (I'm omitting all rid attributes as they aren't essential to explore this):

pub struct Model {
    todos: Vec<Todo>,
}

When I access model.todos from Dart I'm getting a *const Vec<Todo> which I can then pass back to Rust in order to instantiate it, access its length and return references to specific items.

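// `ptr` is the pointer Dart passed back; dereferencing it requires `unsafe`
let ptr: *mut Vec<Todo> = ptr as *mut Vec<Todo>;
unsafe { ptr.as_mut().unwrap() }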

However if the todos field is reassigned the original vec gets dropped and the pointer becomes invalid. The same goes for any pointers to vector items that I obtained.

It turns out that in most Flutter apps this is not as great a problem as one might think. The main reason is how widgets rebuild due to user interaction. Let's take our example and follow it through:

  1. Render current Todos obtained inside the build method via model.todos
  2. User adds Todo
     2.1. Rust pushes a new Todo onto todos vec synchronously
     2.2. After the Rust code completes setState or similar is called to trigger a rebuild of the widgets
     2.3. On widget rebuild the up to date todo vec is retrieved via model.todos and iterated over to render the updated view
  3. Flutter's event loop ticks and waits for user interaction entering at 2. whenever necessary

The key to making this work is that the Dart code never holds on to the todo vector or any of its items directly, but always gets it fresh going through the model which itself is 'static.

Users would need to learn to always access state updates that way, basically obtaining reference pointers to parts of the model each time a widget build is executed.
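
To make the "always get it fresh" rule concrete, here is a minimal sketch of what one widget build would do. The function names (model_todos, todos_len, todos_item_at, build_titles) are made up for illustration and are not rid's actual exports; build_titles stands in for the Dart build method.

```rust
pub struct Todo {
    pub title: String,
}

pub struct Model {
    pub todos: Vec<Todo>,
}

/// Hypothetical accessor: hands out a pointer to the model's `todos` field.
pub unsafe fn model_todos(model: *const Model) -> *const Vec<Todo> {
    let model = unsafe { model.as_ref() }.expect("model pointer was null");
    &model.todos
}

/// Hypothetical accessor: length of the vec behind the pointer.
pub unsafe fn todos_len(todos: *const Vec<Todo>) -> usize {
    unsafe { todos.as_ref() }.expect("todos pointer was null").len()
}

/// Hypothetical accessor: pointer to one item, or null when out of bounds.
pub unsafe fn todos_item_at(todos: *const Vec<Todo>, idx: usize) -> *const Todo {
    let todos = unsafe { todos.as_ref() }.expect("todos pointer was null");
    todos.get(idx).map_or(std::ptr::null(), |todo| todo as *const Todo)
}

/// Stand-in for one widget build: every pointer is obtained from the model
/// at the start of the build and none of them outlives this call.
pub unsafe fn build_titles(model: *const Model) -> Vec<String> {
    unsafe {
        let todos = model_todos(model);
        (0..todos_len(todos))
            .map(|idx| {
                let todo: &Todo = &*todos_item_at(todos, idx);
                todo.title.clone()
            })
            .collect()
    }
}
```

Since build_titles goes through the model on each call, a push or a reassignment of todos between two builds cannot leave it with a stale item pointer.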

Derived Data

What about data that's not owned by the model? It's a bit different there. As an example let's take a vector returned as a result of filtering todos.

impl Model {
  fn filtered_todos(&self) -> Vec<&Todo> { }
}

When I access model.filtered_todos() from Dart I'm getting a struct representing the fat pointer to the Rust vector including its length, capacity, and a pointer to the first item. Additionally, in order to avoid cloning, we're actually getting references to todo items held by the model.

Here the same applies as before WRT the todo items as they could be removed later and therefore become invalid. The Vec itself is actually no longer directly accessible from Rust and therefore will stay valid until an appropriate dispose method is invoked from Dart which runs Rust code that drops it.

As before, we won't run into any problems provided we don't hold on to the filtered todos, but just use them to rerender the widget after refreshing this vector during the build, and provided no other threads are mutating the model.
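
A rough sketch of how such a derived vector and its dispose method could look. The rid_* function names and the "not completed" filter are assumptions for illustration only. The vector is boxed and handed to Dart as a raw pointer, so it stays alive until Dart explicitly asks Rust to drop it.

```rust
pub struct Todo {
    pub id: u32,
    pub completed: bool,
}

pub struct Model {
    pub todos: Vec<Todo>,
}

impl Model {
    /// Assumed filter for this sketch: only todos that are not completed.
    pub fn filtered_todos(&self) -> Vec<&Todo> {
        self.todos.iter().filter(|todo| !todo.completed).collect()
    }
}

/// Hypothetical export: moves the derived vec to the heap and hands ownership
/// of it to the caller. The items still point into the model.
pub fn rid_filtered_todos(model: &Model) -> *mut Vec<*const Todo> {
    let items: Vec<*const Todo> = model
        .filtered_todos()
        .into_iter()
        .map(|todo| todo as *const Todo)
        .collect();
    Box::into_raw(Box::new(items))
}

/// Hypothetical export: length of the derived vec.
pub unsafe fn rid_filtered_todos_len(ptr: *const Vec<*const Todo>) -> usize {
    unsafe { &*ptr }.len()
}

/// Hypothetical export: the dispose method invoked from Dart. It drops the
/// vec itself, not the todos, which remain owned by the model.
pub unsafe fn rid_filtered_todos_dispose(ptr: *mut Vec<*const Todo>) {
    if !ptr.is_null() {
        drop(unsafe { Box::from_raw(ptr) });
    }
}
```

Note that only the vector's own allocation is protected this way. The item pointers inside it are still only valid for as long as the model keeps those todos around.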

However we could also clone each todo in case we'd want to hold on to them across widget builds (something that rarely will be useful) by changing the return signature and cloning each todo instead of referencing the one held by the model.

impl Model {
  fn filtered_todos(&self) -> Vec<Todo> { }
}
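
Filled in, the cloning variant could look roughly like this. The filter predicate (not completed) is an assumption made for this sketch; the point is only that the returned todos are independent copies.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Todo {
    pub title: String,
    pub completed: bool,
}

pub struct Model {
    pub todos: Vec<Todo>,
}

impl Model {
    /// Returns owned copies, so the result stays valid even if the model's
    /// todos are removed or reassigned afterwards.
    pub fn filtered_todos(&self) -> Vec<Todo> {
        self.todos
            .iter()
            .filter(|todo| !todo.completed)
            .cloned()
            .collect()
    }
}
```

The cost is one clone per returned todo on every call, which is why referencing remains the default choice when the result is only used during a single build.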

Synchronous processing is fine?

Well it kind of is as long as users do the reasonable thing to never hold on to any state across widget builds, but get it straight from the Rust model each time.

Adding threads to the Mix

Once we add threads things get a bit more complex. The main reason is that while the widgets are building the model could be mutated on another thread.

Another potential issue is that the model may be mutated due to a user action, handled via a message on the main thread, while another thread is reading it.

I believe the solution to both these cases can be rwlocks.

Firstly the message handling update method of the model will have to be changed from:

impl Model {
  fn update(&mut self, msg: Msg) { }
}

to something akin to:

struct ModelUpdate {
    // needed to post back with the reqId in order for Dart to relate async responses to the
    // sent message
    reqId: u32,
    write: ??
}

fn update(model: &ModelUpdate, msg: Msg) { }

The write method would be a thin wrapper around the write part of a 'static rwlock around the Model.

Alternatively an RwLock<Model> could be directly exposed instead (I like a more focused API for rid which also allows adding other related properties, like reqId).
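
One way this could look, as a sketch only: the static MODEL, the Msg variant, the read_model helper and the returned request id are all assumptions, and the field is spelled req_id here to follow Rust naming. The still open question of what type write should have is answered in this sketch by making it a method that returns the write guard.

```rust
use std::sync::{RwLock, RwLockReadGuard, RwLockWriteGuard};

pub struct Todo {
    pub title: String,
}

pub struct Model {
    pub todos: Vec<Todo>,
}

pub enum Msg {
    AddTodo(String),
}

// The single 'static model, guarded by a rwlock.
static MODEL: RwLock<Model> = RwLock::new(Model { todos: Vec::new() });

pub struct ModelUpdate {
    // Needed to post back so that Dart can relate async responses to the sent message.
    pub req_id: u32,
}

impl ModelUpdate {
    /// Thin wrapper around the write half of the 'static rwlock.
    pub fn write(&self) -> RwLockWriteGuard<'static, Model> {
        MODEL.write().expect("model lock poisoned")
    }
}

/// Hypothetical global helper that reader threads could use.
pub fn read_model() -> RwLockReadGuard<'static, Model> {
    MODEL.read().expect("model lock poisoned")
}

/// Handles a message and returns the request id as a stand-in for posting
/// the response back to Dart.
pub fn update(model: &ModelUpdate, msg: Msg) -> u32 {
    match msg {
        Msg::AddTodo(title) => {
            // Prepare the data first, then hold the write lock only briefly.
            let todo = Todo { title };
            model.write().todos.push(todo);
        }
    }
    model.req_id
}
```

The write guard is a temporary that is dropped at the end of the statement that pushes the todo, so readers are blocked only for the duration of the mutation itself.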

Secondly each thread that tries to read from the model would have to get the read lock first. Since the lock can be 'static, just like the Model, there could be a global rid method that threads could use to do so.

The same approach could be used whenever accessing a property directly or indirectly on the model from Dart. Put simply, whenever accessing anything from Dart we could ensure that the read lock of the Model is acquired first. This makes sense since arguably any state we want to share with Dart is either owned by or derived from the main app state, which is held by the main Model.

Thirdly whenever we build a widget we could acquire a read lock by invoking a Rust method from Dart that does so, in order to prevent mutations from other threads while the current state is being used to render. One might think that this affects performance a lot. However if threads processing/fetching resources only get a hold of the mutable model at the very moment when they are about to update it, i.e. after all processing is complete, then it won't affect performance too much. Additionally this approach favors rendering over background threads in order to keep the UI responsive.
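
A minimal sketch of that idea, assuming a static rwlock around the model. rid_build_begin, rid_build_end and fetch_and_store are hypothetical names. The read guard is boxed so that it can be handed to Dart as an opaque pointer for the duration of a build, and the background thread does all of its processing before it asks for the write lock.

```rust
use std::sync::{RwLock, RwLockReadGuard};
use std::thread;

pub struct Model {
    pub todos: Vec<String>,
}

static MODEL: RwLock<Model> = RwLock::new(Model { todos: Vec::new() });

/// Hypothetical export called at the start of a widget build: acquires the
/// read lock and returns the guard as an opaque pointer.
pub fn rid_build_begin() -> *mut RwLockReadGuard<'static, Model> {
    let guard = MODEL.read().expect("model lock poisoned");
    Box::into_raw(Box::new(guard))
}

/// Hypothetical export called at the end of the build: dropping the guard
/// releases the read lock. Must run on the thread that acquired it.
pub unsafe fn rid_build_end(guard: *mut RwLockReadGuard<'static, Model>) {
    if !guard.is_null() {
        drop(unsafe { Box::from_raw(guard) });
    }
}

/// Background work: process everything first, lock only for the final write.
pub fn fetch_and_store(raw: Vec<&'static str>) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        // The slow part runs without holding any lock.
        let prepared: Vec<String> = raw.into_iter().map(|s| s.trim().to_string()).collect();
        // The write lock is held only for this one statement.
        MODEL
            .write()
            .expect("model lock poisoned")
            .todos
            .extend(prepared);
    })
}
```

While a build holds the guard, writers have to wait, which is the "rendering first" behavior described above. One thing to verify before relying on this is that the begin and end calls always arrive on the same thread, since the standard library's read guard may not be sent across threads.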

Next Steps

In parallel to discussing the points made above in this issue I will keep working on the async branch in order to experiment with these different approaches.

I will also add more rid examples that expose some of the above mentioned issues and will show that the async implementation addresses them.

Lastly if some of the above seems incomplete or not fully thought through, that's because it most likely is. I'm merely dumping my thoughts and ideas so far in order to facilitate a discussion.