tizoc / ocaml-interop

OCaml<->Rust FFI with an emphasis on safety.
MIT License
135 stars 22 forks source link

should the runtime be a thread-local variable? #42

Open c-cube opened 2 years ago

c-cube commented 2 years ago

using static mut is very dangerous [^1] and almost impossible to get right. Should it be instead a form of either once_cell/lazy_static, or, better, a thread-local variable with Arc<Mutex<OCamlRuntime>>?

I'm asking because ocaml-rs apparently uses runtime::recover_handle to access the runtime from within function stubs, despite it being advertised as test-only. Which, accessorily: should it be #[cfg(test)], too?

[^1] https://github.com/rust-lang/rust/issues/53639

tizoc commented 2 years ago

@c-cube ocaml-rs uses runtime::recover_handle in the same way ocaml-interop does internally. Those stubs are called from OCaml, and when that happens the caller is assumed to have access to the runtime handle (because the caller is assumed to be an OCaml function). But because the OCaml runtime doesn't pass any value to represent such access to the runtime runtime::recover_handle (which I agree is an ugly hack) is used by stubs to cheat here, and obtain that virtual handle that the type system will use to validate operations that require access to the OCaml runtime.

Users of ocaml-interop should avoid using runtime::recover_handle (except maybe for tests) because it is very unsafe, ocaml-rs just happens to be a special case.

Lupus commented 1 year ago

One more user of runtime::recover_handle here. I'm trying to integrate async Rust with OCaml and here's the case when I need runtime handle out of thin air:

#[ocaml::func]
#[ocaml::sig("runtime -> (int -> promise) -> unit")]
pub fn runtime_test(rt: ocaml::Pointer<Runtime>, f: ocaml::Value) {
    let f = ensure_rooted_value(f);

    rt.as_ref().spawn(async move {
        let mut page_nb = 0;
        let gc = unsafe { ocaml::interop::OCamlRuntime::recover_handle() };
        let f_callable = ocaml::function!(f, (n: ocaml::Int) -> CamlPtr<Promise>);
        loop {
            f_callable(&gc, &page_nb).unwrap().await.unwrap();
            page_nb = page_nb + 1;
        }
    });
}

I have single-threaded Rust executor, that notifies Lwt via Lwt_unix notification when there are pending tasks, and Lwt side then calls the executor to move its ready tasks forward. This even somewhat works. The contract that I have is that OCaml is keeping the executor reference alive and will properly stop it before terminating the OCaml runtime (in my case - exit the application as it's an OCaml application).

This should be a safe thing to do, everything runs on the same thread as OCaml code, properly rooted OCaml values are safe to be referenced in Rust async taks running in that thread-local executor, and as I externally guarantee that OCaml part will outlive the executor, I should have the OCaml runtime legally available to my async Rust code. The only way to do that is probably using runtime::recover_handle right now, as no other API is exposed to cover this case.

tizoc commented 1 year ago

@Lupus these patterns cannot yet be safely expressed with ocaml-interop, I am trying to get this solved for the next release which will support OCaml 5 and threads, but progress is slow because I am short on available time and energy to work on it.

A safe version of recover_handle() would probably be one that obtains the OCaml runtime lock by calling caml_acquire_runtime_system(), and that releases it on drop by calling caml_release_runtime_system() (probably wrapped by some thread-local counter to avoid issues with nested calls). Also caml_c_thread_register() probably needs to be called every time it is obtained in case it the lock is obtained from a new Rust thread (I know this doesn't matter for your specific case but in the general case it will be required).

https://v2.ocaml.org/manual/intfc.html#s:C-multithreading

I should have the OCaml runtime legally available to my async Rust code. The only way to do that is probably using runtime::recover_handle right now, as no other API is exposed to cover this case.

Yes, this is true in your case because all your Rust functions are "entered" through OCaml calls, and the Rust code never runs in parallel to the OCaml code, only concurrently, right?

Lupus commented 1 year ago

I am trying to get this solved for the next release which will support OCaml 5 and threads, but progress is slow because I am short on available time and energy to work on it.

No worries, and thank your for your work on this project! It's more that sufficient for me to just know that the recover runtime handle part of the interface won't just disappear because of some new way of writing tests that allows to avoid unsafe static variable :)

A safe version of recover_handle() <....>

Sounds good, I could probably obtain it and save it along with my thread-local executor, as long as running the functions that you mentioned one more time on OCaml's main thread is okay.

all your Rust functions are "entered" through OCaml calls, and the Rust code never runs in parallel to the OCaml code, only concurrently, right?

Yes, this is correct for my use case. I'm extending OCaml with Rust and try to go further than just synchronous primitives.