busstoptaktik / geodesy

Rust geodesy
Apache License 2.0
63 stars 6 forks source link

Grid provider for Plain #77

Closed busstoptaktik closed 8 months ago

busstoptaktik commented 8 months ago

The GridCollection struct provides access to Arc<dyn Grid>, such that any grid needed by any Operator in any instantiation of Plain, always refer to the same heap allocation.

Unfortunately, this requires one line of unsafe{...} code.

We could have achieved almost-the-same, by building the collection into Plain herself. But this would also require that the InnerOp::new() methods switch to take a &mut Context, which may have cascading effects all through the code. Hence I prefer this solution.

@Rennzie do you have any opinions about this?

Rennzie commented 8 months ago

No strong opinions. It looks sensible to me.

For my own benefit, could you explain why the unsafe code is required?

busstoptaktik commented 8 months ago

For my own benefit, could you explain why the unsafe code is required?

Not really sure why, but it does not compile without. This code:

unsafe { GRIDS.lock().unwrap().get_grid(name, self.paths.clone()) }

modifies (i.e. may modify) this static struct:

static mut GRIDS: Mutex<GridCollection> =
    Mutex::new(GridCollection(BTreeMap::<String, Arc<dyn Grid>>::new()));

which I believe should be (thread)-safe, due to the Mutex protection. But if I leave out the unsafe barrier, I get this message from the friendly compiler:

error[E0133]: use of mutable static is unsafe and requires unsafe function or block
   --> src\context\plain.rs:256:9
    |
256 |         GRIDS.lock().unwrap().get_grid(name, self.paths.clone())
    |         ^^^^^^^^^^^^ use of mutable static
    |
    = note: mutable statics can be mutated by multiple threads: aliasing violations or data races will cause undefined behavior

...which is formally correct, but which I assumed I had mitigated by putting the access behind a Mutex

Rennzie commented 8 months ago

Ah interesting, thanks for those details @busstoptaktik.

I posed the question to Github Copilot which said:

The unsafe block is required here because you're dealing with a mutable static variable, GRIDS. In Rust, mutable statics are inherently unsafe due to potential data races, even when they're protected by a Mutex.

The Mutex does ensure that only one thread can access the data at a time, which prevents data races, but it doesn't change the fact that you're dealing with a mutable static. The Rust compiler doesn't know that you're using a Mutex to protect the data, so it still requires an unsafe block.

A bit more reading suggests that it's because it breaks the borrow checking rules of having exactly on mutable borrow at any point. I take it to mean that by having a global mut variable it basically prevents any other mut variable from being allowed.

Deeper down the rabbit hole I came across the once_cell crate which provides a Lazy type for working around this issue. I tried it out on a branch and I could remove the unsafe code with very few changes. See try-once-cell. I'd be curious to know if you see any short comings with this approach? I'm still very new to threaded programming and Web Workers come with safe guards built in so I've not ever dealt with this type of thing before.

busstoptaktik commented 8 months ago

Huge thanks, for the analysis and solution, @Rennzie!

I have merged & pushed this on main