davidcole1340 / ext-php-rs

Bindings for the Zend API to build PHP extensions natively in Rust.
Apache License 2.0
581 stars 62 forks source link

Impl send and sync for ZendCallable #191

Open Dennis-Zhang-SH opened 1 year ago

Dennis-Zhang-SH commented 1 year ago

If the params comes from php, then I think it's safe to impl send and sync for ZendCallable? because all the variables are managed by php runtime.

ju1ius commented 1 year ago

because all the variables are managed by php runtime.

As far as I understand that's precisely what makes it not safe. The PHP runtime is not thread safe, even when compiled with --enable-zts. Your callable could be a closure referring to refcounted objects that could be mutated or garbage-collected at any time from another thread and there's nothing Rust could do about that (to my knowledge at least).

Now you could also cheat and wrap your callable in a newtype and impl Send + Sync on that, but that would only be a lie to bypass the compiler checks...

Dennis-Zhang-SH commented 1 year ago

because all the variables are managed by php runtime.

As far as I understand that's precisely what makes it not safe. The PHP runtime is not thread safe, even when compiled with --enable-zts. Your callable could be a closure referring to refcounted objects that could be mutated or garbage-collected at any time from another thread and there's nothing Rust could do about that (to my knowledge at least).

Now you could also cheat and wrap your callable in a newtype and impl Send + Sync on that, but that would only be a lie to bypass the compiler checks...

I guess you are right, I am not really familiar with php runtime, maybe we could impl Send + Sync for our own Zvals(those managed in rust lib), let's say we have rust function export to php called spawn(f: ZendCallable, params: Vec[&dyn IntoZvaldyn]), in that function, we copy the closure and params into our owed memory, leave the original memory untouched, thus for Zval and &Zval<'a>, we should handle it differently, what do you think?

Dennis-Zhang-SH commented 1 year ago

The following code would result in segmentation fault:

#![cfg_attr(windows, feature(abi_vectorcall))]

use ext_php_rs::prelude::*;

use once_cell::sync::OnceCell;
use tokio::runtime::Runtime;

static INSTANCE: OnceCell<Runtime> = OnceCell::new();

#[php_function]
pub fn start_runtime() -> Option<String> {
    Runtime::new().map_or_else(
        |_| None,
        |rt| {
            INSTANCE
                .set(rt)
                .map_err(|e| format!("error: {:?}", e))
                .err()
        },
    )
}

// #[php_function]
pub fn spawn() -> PhpResult<()> {
    let rt: &Runtime = INSTANCE.get().unwrap();
    rt.spawn(async move {
        let f = ZendCallable::try_from_name("hello").unwrap();
        f.try_call(vec![]).unwrap();
    });
    Ok(())
}

#[php_module]
pub fn get_module(module: ModuleBuilder) -> ModuleBuilder {
    module
}
<?php
var_dump(start_runtime());
for ($i=0; $i < 100000; $i++) { 
    spawn();
}

function hello() {
    echo "Hello";
}
sleep(3);

But if you remove the tokio runtime spawn, and simply call try_call method, it would be fine, wonder why this happened.

ju1ius commented 1 year ago

If you're going to spawn threads, I would suggest you to follow a method similar to what the parallel extension does (CSP concurrency AKA message passing over channels).

You'll avoid a lot of footguns, because PHP is simply not designed to share memory accross thread boundaries. Here's a link to the parallel repo.

Dennis-Zhang-SH commented 1 year ago

If you're going to spawn threads, I would suggest you to follow a method similar to what the parallel extension does (CSP concurrency AKA message passing over channels).

You'll avoid a lot of footguns, because PHP is simply not designed to share memory accross thread boundaries. Here's a link to the parallel repo.

It is one way to share memory but not always, besides I don't really understand why it's in sequence, to me a spawn is like you never know when it's gonna happen but it will return a handle, you can wait for that handle.

danog commented 1 year ago

I've added async support to ext-php-rs in danog/php-tokio, it seems like this is what you were looking to do when you opened this issue :)

ju1ius commented 1 year ago

I've added async support to ext-php-rs in danog/php-tokio, it seems like this is what you were looking to do when you opened this issue :)

Good if it works for you, but for me that is a huge no. Your library has basically guaranteed undefined behaviour...

danog commented 1 year ago

It looks scary, but it's actually perfectly safe: rust simply does not understand the concept of fibers, so it doesn't understand that an awaited value is never dropped from the current scope until the result is ready, so that's just a neat workaround to make everything work cleanly without hacks and copies :)

ju1ius commented 1 year ago

it's actually perfectly safe

No, it is not. Rust references are not raw pointers. If you want an unbounded lifetime, you must use a raw pointer. Transmuting the lifetime of a reference is guaranteed undefined behaviour.

danog commented 1 year ago

No, it's actually completely safe, and here's why.

This is needed because of https://github.com/danog/php-tokio/blob/master/src/event_loop.rs#L72, where we're actually using C setjmp/longjmp to jump in and out of the suspend_on rust function.

Rust thinks we're Sending the Future to another thread (tokio's event loop), where it may be used even after its lifetime expires in the main (PHP) thread.

In reality, the Future is only used by Tokio until the result is ready.

Rust does not understand that when we suspend the current fiber in suspend_on, we basically keep alive the the entire stack, including the Rust stack and the Future on it, until the result of the future is ready.

Once the result of the Future is ready, tokio doesn't need it anymore, the suspend_on function is resumed, and we safely drop the Future upon exiting.

Here's the logic that's powers everything: https://github.com/danog/php-tokio/blob/master/src/event_loop.rs#L63

ju1ius commented 1 year ago

Sorry, but that was not my point. I have no doubt that you have carefully checked the semantics of this.

Still, transmuting the lifetime of a reference like this is always undefined behaviour, and the original author of this trick is well aware of that fact:

You can’t “turn off the borrow checker” in Rust, and you shouldn’t want to. Rust’s references aren’t pointers, and the compiler is free to decimate code that tries to use references as though they are. If you need raw pointer behaviour in Rust, don’t use this, use Rust’s actual raw pointers, which don’t make the same aliasing guarantees to the compiler. However, if you would like to pretend the borrow checker doesn’t exist for educational purposes and never in production code, this macro that will suppress many (though not all) borrow checker errors in the code it’s applied to.

danog commented 1 year ago

I'm well aware that this is undefined behavior actually no, this is not undefined behavior for the reasons described above, but the current implementation of tokio does not use futures in any way after resolution (and it should not, it makes no sense to poll an already resolved future, they are deregistered from the event loop as soon as they are resolved), which makes php-tokio possible. In case the behavior changes in future (and I see no logical reason why it should), I can just switch to copying passed arguments instead of erasing lifetimes.

If you're worried about undefined behavior in a library that interacts with a C programming language, I would be more worried about the large number of (unavoidable) unsafe blocks inside of ext-php-rs itself, they're just as likely to cause a segfault as the erasure of lifetimes in php-tokio.

ju1ius commented 1 year ago

I'm well aware that this is undefined behavior,

Then why not just using raw pointers instead? 😉

If you're worried about undefined behavior in a library that interacts with a C programming language,

Well, even with all it's flaws and despite being written in C, PHP is actively maintained, well-tested and widely used, so that doesn't worry me that much.

I would be more worried about the large number of (unavoidable) unsafe blocks inside of ext-php-rs itself, they're just as likely to cause a segfault as the erasure of lifetimes in php-tokio.

Indeed, this library is fundamentally flawed in lots of ways - which made me give-up on using it, so I'm not worried about that either.

Anyway, as I initially said, if you're aware that this is UB and you don't care because it works for you, then by all means use it! 👍🏻 Just do not recommend it to others.