Some experiments: support runtime via ZSTs

JakkuSakura commented 3 years ago

I have done some experiments on supporting runtimes via ZSTs (zero-sized types), and the results are quite satisfying. The idea is to pass runtimes as generic arguments instead of as variables, to avoid unnecessary dynamic dispatch. Some runtimes support this out of the box via TLS (thread-local storage) or global statics, and we can make use of that.

use std::fmt::Debug;

pub trait StaticRuntime: Debug + Send + Sync + Copy + Clone + Unpin + 'static {}
impl<T: Debug + Send + Sync + Copy + Clone + Unpin + 'static> StaticRuntime for T {}

/// Spawn a blocking task, maybe on a thread pool (tokio), or by spawning a new thread (async-std)
pub trait SpawnBlockingStatic: StaticRuntime {
    /// spawn a blocking function
    fn spawn_blocking<T: Send + 'static>(
        func: impl FnOnce() -> T + Send + 'static,
    ) -> Result<JoinHandle<T>, SpawnError>;
}

impl<Runtime: SpawnBlockingStatic> AsyncChannelBuilder for TcpChannelBuilder<Runtime> {
    type Channel = NonblockCompact<NonBlockingTcpStream>;

    fn connect(&self) -> BoxFuture<'static, std::io::Result<Self::Channel>> {
        let addr = self.addr.clone();
        let port = self.port;
        Box::pin(async move {
            let channel = Runtime::spawn_blocking(move || {
                std::net::TcpStream::connect((addr.as_str(), port))
            })
            .unwrap()
            .await?;
            Ok(NonblockCompact(NonBlockingTcpStream::new(channel)))
        })
    }
}

https://github.com/qiujiangkun/async_executors/blob/dev/src/core/static_runtime.rs
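To make the idea concrete without pulling in any executor crate, here is a minimal sketch that uses only the standard library. It is not the code from the linked branch: `StaticSpawnBlocking`, `ThreadRuntime` and `parse_in_background` are hypothetical names, `std::thread::JoinHandle` stands in for the crate's `JoinHandle`, and the error type is left out.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for `SpawnBlockingStatic`: the method has no `self`,
/// so the runtime is selected purely through the type parameter.
pub trait StaticSpawnBlocking: Copy + Send + Sync + 'static {
    fn spawn_blocking<T: Send + 'static>(
        func: impl FnOnce() -> T + Send + 'static,
    ) -> JoinHandle<T>;
}

/// A zero-sized "runtime" that runs each blocking task on a new OS thread.
#[derive(Debug, Clone, Copy)]
pub struct ThreadRuntime;

impl StaticSpawnBlocking for ThreadRuntime {
    fn spawn_blocking<T: Send + 'static>(
        func: impl FnOnce() -> T + Send + 'static,
    ) -> JoinHandle<T> {
        thread::spawn(func)
    }
}

/// Library code names the runtime as a generic argument; no value is passed.
pub fn parse_in_background<R: StaticSpawnBlocking>(input: String) -> Option<u16> {
    R::spawn_blocking(move || input.trim().parse::<u16>().ok())
        .join()
        .expect("blocking task panicked")
}
```

A caller picks the runtime with a turbofish, for example `parse_in_background::<ThreadRuntime>(text)`.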

najamelan commented 3 years ago

The idea is to pass runtimes as generic arguments, instead of variables, to avoid unnecessary dynamic dispatch.

Passing variables does not cause dynamic dispatch. So where the executor needs no fields, as in AsyncStd {}, it is a zero sized type.

The difference between generics and parameters is more one of ergonomics, but as generics are contagious, I think most people feel parameters are more ergonomic than generics. Moreover, parameters can be generic:

pub fn lib_function( exec: impl Spawn + Clone + Send )
{}

This is a generic function. For every different type it is called with, the compiler will generate a separate copy in the binary.
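A small sketch of that point, using only the standard library. `SpawnName`, `AsyncStdLike` and `TokioLike` are hypothetical stand-ins, not types from async_executors.

```rust
/// Hypothetical minimal trait, standing in for `Spawn`.
pub trait SpawnName {
    fn name(&self) -> &'static str;
}

/// Executors without fields are zero-sized types.
#[derive(Clone, Copy)]
pub struct AsyncStdLike;

#[derive(Clone, Copy)]
pub struct TokioLike;

impl SpawnName for AsyncStdLike {
    fn name(&self) -> &'static str {
        "async-std-like"
    }
}

impl SpawnName for TokioLike {
    fn name(&self) -> &'static str {
        "tokio-like"
    }
}

/// `impl Trait` in argument position is a generic parameter: the compiler
/// emits one copy per concrete executor type, so dispatch is static.
pub fn lib_function(exec: impl SpawnName + Clone + Send) -> &'static str {
    exec.name()
}

/// Dynamic dispatch only happens when a trait object is asked for explicitly.
pub fn lib_function_dyn(exec: &dyn SpawnName) -> &'static str {
    exec.name()
}
```

Passing `AsyncStdLike` by value moves zero bytes, so the parameter form costs nothing at run time compared with the generic-argument form.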

JakkuSakura commented 3 years ago

The problem is SpawnHandle. I have to write

fn foo<R, T: SpawnHandle<R>>(exec: T) {}

instead of

fn foo<Runtime: SpawnHandleStatic>() {}

Also, passing exec around is not fun

fn foo1<R, T: SpawnHandle<R>>(exec: T) {
    foo2(exec); // does not compile
    foo3(exec);
}
fn foo2<R, T: SpawnHandle<R>>(exec: T) {}
fn foo3<R, T: SpawnHandle<R>>(exec: T) {}

It does not compile, unless you require Copy or Clone on T

fn foo1<R, T: SpawnHandle<R> + Copy>(exec: T) {
    foo2(exec);
    foo3(exec.clone()); // or this
}
fn foo2<R, T: SpawnHandle<R> + Copy>(exec: T) {}
fn foo3<R, T: SpawnHandle<R> + Copy>(exec: T) {}

+ Copy is just syntax noise. Sometimes you need exec: Unpin, sometimes you need exec: Debug when you store runtimes in your struct. The list goes on and gets verbose over time.

If you require Copy, exec is probably a ZST, so why are you passing exec around? exec is still useful when your runtime still contains some primitive information, though.
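For reference, a sketch of the two usual ways around the move error, using only the standard library. `Exec` and `CountingExec` are hypothetical; the only assumption carried over is that the spawning method takes `&self`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a spawn trait; the method takes `&self`.
pub trait Exec {
    fn run(&self, job: Box<dyn FnOnce() + Send>);
}

/// An executor that is neither `Copy` nor zero-sized: it counts its jobs.
#[derive(Clone, Default)]
pub struct CountingExec {
    jobs: Arc<AtomicUsize>,
}

impl CountingExec {
    pub fn jobs_run(&self) -> usize {
        self.jobs.load(Ordering::SeqCst)
    }
}

impl Exec for CountingExec {
    fn run(&self, job: Box<dyn FnOnce() + Send>) {
        self.jobs.fetch_add(1, Ordering::SeqCst);
        job();
    }
}

/// Option 1: borrow. No `Copy` or `Clone` bound is needed.
pub fn foo1_by_ref<T: Exec>(exec: &T) {
    foo2(exec);
    foo3(exec);
}
fn foo2<T: Exec>(exec: &T) {
    exec.run(Box::new(|| ()));
}
fn foo3<T: Exec>(exec: &T) {
    exec.run(Box::new(|| ()));
}

/// Option 2: take ownership and clone where a second owner is needed.
pub fn foo1_by_value<T: Exec + Clone>(exec: T) {
    foo2_owned(exec.clone());
    foo3_owned(exec);
}
fn foo2_owned<T: Exec>(exec: T) {
    exec.run(Box::new(|| ()));
}
fn foo3_owned<T: Exec>(exec: T) {
    exec.run(Box::new(|| ()));
}
```

Borrowing keeps the bound list shortest; cloning is only needed when a callee must own the executor, for instance to move it into a spawned task.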

najamelan commented 3 years ago

Ok, I understand what you mean now. Let me hopefully help you forward a bit:

I have to write

fn foo<R, T: SpawnHandle<R>>(exec: T) {}

instead of

fn foo<Runtime: SpawnHandleStatic>() {}

Having to put the return type on the trait is unavoidable, however. The other option is a trait that isn't object safe. If you don't need to store the executor, you don't need object safety; in that case you can have a look at the agnostik crate, whose traits aren't object safe. If you don't need traits, another possible way to solve this is by creating a struct that holds an executor in an enum. This is what the agnostic_async_executor crate does.
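A rough sketch of the struct-holding-an-enum approach, using only the standard library. It illustrates the general shape only and is not taken from agnostic_async_executor; `Backend` and `AnyExecutor` are hypothetical names, and the two backends are toy stand-ins for real runtimes.

```rust
use std::thread;

/// Hypothetical backends; a real crate would wrap actual runtimes here.
enum Backend {
    /// Run the job on the calling thread.
    Inline,
    /// Run the job on a freshly spawned OS thread and wait for it.
    NewThread,
}

/// A concrete, non-generic executor type. It can be stored in structs without
/// generics or `dyn`; the backend is selected with a `match` at run time.
pub struct AnyExecutor {
    backend: Backend,
}

impl AnyExecutor {
    pub fn inline() -> Self {
        Self { backend: Backend::Inline }
    }

    pub fn new_thread() -> Self {
        Self { backend: Backend::NewThread }
    }

    /// The method can be generic because `AnyExecutor` is a plain struct.
    pub fn run<T, F>(&self, job: F) -> T
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        match self.backend {
            Backend::Inline => job(),
            Backend::NewThread => thread::spawn(job).join().expect("job panicked"),
        }
    }
}
```

The trade-off is that the set of supported backends is closed: adding one means changing the enum rather than implementing a trait.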

Also, passing exec around is not fun

That is true, but it's a design choice: every object and function is explicit about its needs, as opposed to global spawn functions. I once wrote a crate for executor-agnostic global spawning, but it was quite a hassle and nobody seemed much interested. I have also come to appreciate the model of passing executors around much more since then, so I stopped developing it. Especially when taking things further into structured concurrency, the global spawn function loses all its interest. async_nursery, which provides structured concurrency, builds on async_executors. The old and unmaintained crate is naja_async_runtime if you want to see how it was done.

fn foo1<R, T: SpawnHandle<R>>(exec: T) {
    foo2(exec); // does not compile
    foo3(exec);
}
fn foo2<R, T: SpawnHandle<R>>(exec: T) {}
fn foo3<R, T: SpawnHandle<R>>(exec: T) {}

It does not compile, unless you require Copy or Clone on T

Yes, this model implies you are exact about your needs. You can create a trait alias. Have a look at the examples in async_executors: spawn_handle_multi and trait_set both show ways to streamline this.
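The trait alias pattern can be sketched as a supertrait plus a blanket impl. This is a standalone illustration, not a copy of the crate's examples; `Spawn` here is a hypothetical local trait and `LibExec` and `InlineExec` are made-up names.

```rust
use std::fmt::Debug;

/// Hypothetical stand-in for a spawn trait.
pub trait Spawn {
    fn spawn(&self, job: Box<dyn FnOnce() + Send>);
}

/// The "alias": one name for the whole set of bounds this library needs.
pub trait LibExec: Spawn + Clone + Debug + Send + 'static {}

/// Blanket impl: every type meeting the bounds is a `LibExec` automatically.
impl<T> LibExec for T where T: Spawn + Clone + Debug + Send + 'static {}

#[derive(Debug, Clone)]
pub struct InlineExec;

impl Spawn for InlineExec {
    fn spawn(&self, job: Box<dyn FnOnce() + Send>) {
        job()
    }
}

/// Signatures stay short even though the bound list is long.
pub fn foo1(exec: impl LibExec) -> String {
    foo2(exec.clone());
    format!("{:?}", exec)
}

fn foo2(exec: impl LibExec) {
    exec.spawn(Box::new(|| ()));
}
```

The bound list is written once, on `LibExec`, and every function that needs an executor just says `impl LibExec`.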

If you require Copy, exec is probably a ZST, so why are you passing exec around? exec is still useful when your runtime still contains some primitive information, though.

Because if you want to be executor agnostic, you don't know if exec will be Copy. It depends on which executor the user wants to use. Take your example from above:

pub trait StaticRuntime: Debug + Send + Sync + Copy + Clone + Unpin + 'static {}

Most executors can't implement this. Remember glommio is !Send, !Sync, !Copy and maybe not even Unpin.

By using traits we allow the consumer to require exactly what they need and nothing more. That way as many executors as possible can work for them. That way the author of a library doesn't exclude part of the ecosystem from using their lib.

For convenience, all the spawn traits only ever require &self, not &mut self, and all executors implement Clone. So you can always count on those with async_executors.

JakkuSakura commented 3 years ago

I see. The current implementation is the most generic, but not the most elegant in some cases, where I can manually implement something like pub trait StaticRuntime: Debug + Send + Sync + Copy + Clone + Unpin + 'static {}. However, CustomSpawnHandle is terrible, because if I write a library with async_executors, then every addition of spawn_handle<T> is a breaking change.

pub trait CustomSpawnHandle : SpawnHandle<String> + SpawnHandle<u8> + Send + Sync {}

I can't write

pub trait<T> CustomSpawnHandle : SpawnHandle<T> + Send + Sync {}

It can be fixed by 1) having a SpawnHandleAny, 2) auto-implementing SpawnHandle<Output> for T: SpawnHandleAny

/// Let you spawn and get a [JoinHandle] to await the output of a future.
pub trait SpawnHandleAny {
    /// Spawn a future and return a [JoinHandle] that can be awaited for the output of the future.
    fn spawn_handle<Output, Fut>(&self, future: Fut) -> Result<JoinHandle<Output>, SpawnError>
    where
        Fut: Future<Output = Output> + Send + 'static,
        Output: 'static + Send;
}
impl<Output, T> SpawnHandle<Output> for T
where
    T: SpawnHandleAny,
    Output: 'static + Send,
{
    fn spawn_handle_obj(
        &self,
        future: FutureObj<'static, Output>,
    ) -> Result<JoinHandle<Output>, SpawnError> {
        <T as SpawnHandleAny>::spawn_handle(self, future)
    }
}
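To check that the shape of this blanket impl is accepted by the compiler, here is a standard-library analogue that uses closures and threads instead of futures. `SpawnAny`, `SpawnTyped` and `Threads` are hypothetical names; this says nothing about whether the same impl coexists with the existing impls inside async_executors.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical analogue of `SpawnHandleAny`: the type parameter sits on the
/// method, so one bound covers every output type.
pub trait SpawnAny {
    fn spawn_any<Out, F>(&self, job: F) -> JoinHandle<Out>
    where
        F: FnOnce() -> Out + Send + 'static,
        Out: Send + 'static;
}

/// Hypothetical analogue of `SpawnHandle<Out>`: the type parameter sits on
/// the trait and the job is boxed.
pub trait SpawnTyped<Out> {
    fn spawn_boxed(&self, job: Box<dyn FnOnce() -> Out + Send>) -> JoinHandle<Out>;
}

/// Blanket impl: anything that can spawn any type can spawn each given type.
impl<Out, T> SpawnTyped<Out> for T
where
    T: SpawnAny,
    Out: Send + 'static,
{
    fn spawn_boxed(&self, job: Box<dyn FnOnce() -> Out + Send>) -> JoinHandle<Out> {
        self.spawn_any(job)
    }
}

pub struct Threads;

impl SpawnAny for Threads {
    fn spawn_any<Out, F>(&self, job: F) -> JoinHandle<Out>
    where
        F: FnOnce() -> Out + Send + 'static,
        Out: Send + 'static,
    {
        thread::spawn(job)
    }
}

/// An API that lists a single output type keeps working unchanged.
pub fn greet(exec: &impl SpawnTyped<String>) -> String {
    exec.spawn_boxed(Box::new(|| "hello".to_string()))
        .join()
        .expect("job panicked")
}
```

`Threads` only implements `SpawnAny`, yet it satisfies `SpawnTyped<String>` (and every other `SpawnTyped<Out>`) through the blanket impl.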

Because if you want to be executor agnostic, you don't know if exec will be Copy. It depends on which executor the user wants to use.

Most executors can't implement this. Remember glommio is !Send, !Sync, !Copy and maybe not even Unpin.

Copy makes my executor a bit limited, but most executors rely on TLS and we can make use of it: tokio::task::spawn(), glommio_crate::Task::local(), async_std::task::spawn. They are all similar to std::thread::spawn. I think we can make a StaticRuntime out of them. Don't consider making TokioCt a StaticRuntime, just make a TokioStatic.

I want to be as executor agnostic as possible, but I want to require some features. If a runtime does not have these features, then the user can just choose another runtime. I would like to choose a set of functionalities, rather than the greatest common divisor of all executors included here.

In the end, StaticRuntime is a helper trait to indicate that a runtime is a ZST and utilizes global variables or TLS. It can be created in the user's code; it doesn't matter. However, I hope to implement static versions of existing runtimes where they support TLS spawning out of the box. Passing executors as arguments seems fine now, and traits like SpawnHandleStatic can be replaced by something like SpawnHandleAny.

najamelan commented 3 years ago

then every addition of spawn_handle<T> is a breaking change

Not really, as there isn't any known implementation of the trait that can spawn some types and not others. Maybe I need to document that but the intention here is that any executor implementing it can spawn all types, so you can never break a dependent crate by adding more types. And all the executors in async_executors can spawn all types.

It can be fixed by 1) having a SpawnHandleAny, 2) auto-implement SpawnHandle for T: SpawnHandleAny

This would be great. If you can make it compile, please link me a commit I can check out to see it pass the integration tests. If this works I would absolutely adopt it.

JakkuSakura commented 3 years ago

Not really, as there isn't any known implementation of the trait that can spawn some types and not others.

Indeed, but it lacks a single trait that can indicate so.

Say we have a foo in some crate

fn foo(exec: impl SpawnHandle<String>); // some external crate

pub trait CustomSpawnHandle : SpawnHandle<String> {}
fn main(exec: impl CustomSpawnHandle) {
    foo(exec);
}

Then the external crate wants to use SpawnHandle<u8>. Although any executor that worked previously would also work with SpawnHandle<u8>, it is a breaking change since the API changed

fn foo(exec: impl SpawnHandle<String> + SpawnHandle<u8>); // some external crate

pub trait CustomSpawnHandle : SpawnHandle<String> {}
fn main(exec: impl CustomSpawnHandle) {
    foo(exec); // it won't compile
}

najamelan commented 3 years ago

Yes, I see if you do it like that it can be a breaking change. I will clarify this issue in the documentation. However you can avoid it:

// lib

pub trait FooExec: SpawnHandle<String> {}
impl<T> FooExec for T where T: SpawnHandle<String> {}

pub fn foo( exec: impl FooExec ) {}

// in a next version:
pub trait FooExec: SpawnHandle<String> + SpawnHandle<u8> {}
impl<T> FooExec for T where T: SpawnHandle<String> + SpawnHandle<u8> {}

// client code

use 
{
   lib::{ foo, FooExec },
   async_executors::AsyncStd,
};

fn main() 
{ 
    foo( AsyncStd ); // it will compile
}

The client code never has to change. Don't get me wrong: I tried very hard to have SpawnHandleAny. It is absolutely the interface I want, but the type system won't allow it. If we can make it work, you'll make my day.

Oh, I just saw what you did: you put the generic on the function. The problem is that the trait is then no longer object safe. In a library you can never store the executor without putting generics everywhere it goes. If you want this, then the agnostik crate is what you want; this is exactly what they do. But you can never have Box< dyn SpawnHandleAny > and put it in a struct or local variable. You now have to do:

struct MyStruct<Exec>
{
   exec: Exec
}

impl<Exec: SpawnHandleAny> MyStruct<Exec>
{
   pub fn new( exec: Exec ) -> Self
   {
       Self{ exec }
   }
}

And everywhere MyStruct goes and gets stored and gets passed to functions, everything needs to be generic. That is what I mean when I say generics are contagious. What we want is SpawnHandleAny but it must be object safe.
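For contrast, a sketch of what object safety buys, using only the standard library. `SpawnTyped` and `Inline` are hypothetical stand-ins; the point is only that a trait without generic methods can be type-erased, so the struct needs no type parameter.

```rust
/// Object safe: the output type is fixed by the trait parameter and the job
/// is boxed, so the method itself has no generics.
pub trait SpawnTyped<Out> {
    fn run_boxed(&self, job: Box<dyn FnOnce() -> Out + Send>) -> Out;
}

/// Toy executor that runs the job on the calling thread.
pub struct Inline;

impl<Out> SpawnTyped<Out> for Inline {
    fn run_boxed(&self, job: Box<dyn FnOnce() -> Out + Send>) -> Out {
        job()
    }
}

/// No type parameter on the struct: the executor is type-erased.
pub struct MyStruct {
    exec: Box<dyn SpawnTyped<String>>,
}

impl MyStruct {
    pub fn new(exec: impl SpawnTyped<String> + 'static) -> Self {
        Self { exec: Box::new(exec) }
    }

    pub fn greet(&self, name: &str) -> String {
        let name = name.to_owned();
        self.exec.run_boxed(Box::new(move || format!("hello {}", name)))
    }
}

// By contrast, a trait whose method is generic, such as
//     trait SpawnAny { fn run<Out, F: FnOnce() -> Out>(&self, job: F) -> Out; }
// cannot be used as `Box<dyn SpawnAny>`; the compiler rejects it (E0038).
```

Code that holds a `MyStruct` does not need to know which executor is inside, which is exactly what is lost once the generic moves onto the method.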

JakkuSakura commented 3 years ago

And everywhere MyStruct goes and gets stored and gets passed to functions, everything needs to be generic. That is what I mean when I say generics are contagious. What we want is SpawnHandleAny but it must be object safe.

I see, but we are talking about impl SpawnHandleAny, which does not need to be object safe. In Rust there is no way to make an object-safe SpawnHandleAny. I can use impl SpawnHandleAny in place of an impl FooExec, and users can choose either way: if they want to avoid dynamic dispatch, or want to avoid manually listing every type they spawn_handle, they can use SpawnHandleAny; if they want an object-safe trait, they manually implement a FooExec.

What if I want to spawn_handle this?

async fn fut() -> impl Future<Output = ()> {
    async {

    }
}

I probably need this?

pub trait FooExec: SpawnHandle<String> + SpawnHandle<u8> + SpawnHandle<impl Future<Output = ()>>;
impl<T> FooExec for T where T: SpawnHandle<String> + SpawnHandle<u8>  + SpawnHandle<impl Future<Output = ()>> {}

najamelan commented 3 years ago

What if I want to spawn_handle this?

You need SpawnHandle< Pin<Box<dyn Future<Output=()>>> >
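The reason is that the output of `fut()` is an anonymous `impl Future`, which cannot be named in a trait bound, while a boxed trait object can. A minimal sketch with only the standard library follows. The alias `BoxedUnitFuture` and the busy-polling `block_on` are hypothetical helpers for the demonstration, and the `+ Send` on the alias is an added assumption, not part of the bound quoted above.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A nameable output type: the inner future is boxed and type-erased.
pub type BoxedUnitFuture = Pin<Box<dyn Future<Output = ()> + Send>>;

/// Same shape as `fut()` above, but the output type can now be written down.
pub async fn fut() -> BoxedUnitFuture {
    let inner: BoxedUnitFuture = Box::pin(async {});
    inner
}

/// Waker that does nothing; enough for futures that never really suspend.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor for the demo: polls in a loop until the future is ready.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(out) = future.as_mut().poll(&mut cx) {
            return out;
        }
        std::thread::yield_now();
    }
}
```

Awaiting `fut()` yields a `BoxedUnitFuture`, which is itself a future and can be awaited or spawned again.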

async_executors used to have the not object safe traits. They were removed in d9a5fe54a63fe03598c2e14674c143ff78253f82 because it creates 2 extra traits, their implementations, examples, tests, documentation + making sure it works correctly on Wasm just to avoid users having to write 1 line of code to list the types they need.

I do not consider it worth the maintenance burden to save users from one line of code. Seemingly it failed riker. I haven't looked at their code to see if it really wasn't possible, but I'm pretty sure they have to spawn user-supplied types. That being said, they also needed it to be object safe, so they made agnostic_async_executor using a struct...

That being said, agnostik already has this functionality. They also already have a trait for it, so ultimately what I would be open for is implementing the traits from agnostik as that improves interoperability, and we don't have to have more traits in here, just implementations + tests.

We would need to be able to create the agnostik JoinHandle however, so they would have to add From<AsyncStdJoinHandle> for agnostik::JoinHandle, and the same for tokio.

JakkuSakura commented 3 years ago

In performance-critical programs, dyn is strongly discouraged unless absolutely necessary. That's the reason I suggest we keep a non-object-safe version for this scenario. It wouldn't add much work and everything is compatible.

JakkuSakura commented 3 years ago

There is an object-safe approach. We can always use remote_handle and Spawn<()> together to spawn_handle anything, with a bit of performance loss.
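The same idea can be sketched with the standard library: an object-safe trait that can only run unit jobs, plus a free function that sends the result back over a channel. `SpawnUnit`, `Threads` and `spawn_with_result` are hypothetical names, and the channel only loosely plays the role that `remote_handle` plays for futures.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical analogue of a unit-only spawn trait: object safe, but it can
/// only run jobs that return `()`.
pub trait SpawnUnit {
    fn spawn_unit(&self, job: Box<dyn FnOnce() + Send>);
}

pub struct Threads;

impl SpawnUnit for Threads {
    fn spawn_unit(&self, job: Box<dyn FnOnce() + Send>) {
        thread::spawn(job);
    }
}

/// Wrap a job returning `T` into a unit job plus a receiving end for the
/// result. Works through `&dyn SpawnUnit`, so no generics on the executor.
pub fn spawn_with_result<T, F>(exec: &dyn SpawnUnit, job: F) -> Receiver<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    exec.spawn_unit(Box::new(move || {
        // Ignore the error if the caller already dropped the receiver.
        let _ = tx.send(job());
    }));
    rx
}
```

The performance loss mentioned above shows up here as one extra allocation for the boxed job and one channel per spawned task.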

najamelan commented 3 years ago

I think the solution here is to implement agnostik::Executor behind a feature flag. That way we bring a non-object-safe, non-generic spawn trait back without having to maintain it, whilst improving interop.

This is currently blocked on: https://github.com/bastion-rs/agnostik/issues/4

JakkuSakura commented 3 years ago

Can we have an object-safe SpawnBlocking that takes Box<dyn FnOnce + Send>?

najamelan commented 3 years ago

Box<dyn FnOnce + Send>

Could you elaborate a bit? This isn't valid Rust code AFAICT. What is the signature of the FnOnce? And what is it you are trying to achieve?

JakkuSakura commented 3 years ago

Something like SpawnBlockingExt::spawn_blocking(&self, Box<dyn FnOnce() + Send>). So that this SpawnBlocking can be used without generics

najamelan commented 3 years ago

@qiujiangkun sorry for the delay. I was very busy. I am looking into this now. I don't quite understand the need as the SpawnBlocking trait is not currently generic. The generic is on the method. How does it bother you? Normally the compiler will automatically infer the types. You never have to write them out.

The only thing is that now the trait is not object safe. We could make it object safe by adding the method sig you propose to it. No need for an extension trait for that. I think it's a good idea, so I'll go ahead and implement that.
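One way this can look, sketched with the standard library only. The trait below is a hypothetical illustration, not the signature that async_executors ships: the method name `spawn_blocking_dyn`, the `Threads` type and the use of `std::thread::JoinHandle` are all assumptions.

```rust
use std::thread::{self, JoinHandle};

pub trait SpawnBlocking {
    /// Generic convenience method. `where Self: Sized` keeps it out of the
    /// vtable, so the trait as a whole stays object safe.
    fn spawn_blocking<T, F>(&self, func: F) -> JoinHandle<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
        Self: Sized;

    /// Object-safe method taking a boxed closure; callable through `dyn`.
    fn spawn_blocking_dyn(&self, func: Box<dyn FnOnce() + Send>) -> JoinHandle<()>;
}

/// Toy executor: every blocking job gets its own OS thread.
pub struct Threads;

impl SpawnBlocking for Threads {
    fn spawn_blocking<T, F>(&self, func: F) -> JoinHandle<T>
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        thread::spawn(func)
    }

    fn spawn_blocking_dyn(&self, func: Box<dyn FnOnce() + Send>) -> JoinHandle<()> {
        thread::spawn(func)
    }
}
```

Code holding a concrete executor keeps the ergonomic generic method, while code holding a `Box<dyn SpawnBlocking>` uses the boxed variant.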

najamelan / async_executors

Some experiments: support runtime via ZSTs #8