Azure / azure-sdk-for-rust

This repository is for the active development of the Azure SDK for Rust. For consumers of the SDK we recommend visiting Docs.rs and looking up the docs for any of libraries in the SDK.
MIT License
709 stars 246 forks source link

Client methods traits should desugar async functions to return `impl Future` #1796

Open heaths opened 1 month ago

heaths commented 1 month ago

To avoid a valid lint, we should desugar FooClientMethods methods as async fn foo() -> Result<Response<T>> into fn foo() -> impl Future<Result<Response<T>>> instead.

See https://github.com/Azure/azure-sdk-for-rust/pull/1773#discussion_r1735372332 for context.

analogrelay commented 1 month ago

The main concern here is the Send bound (and to a lesser-degree, the Sync bound). In order to spawn a task in tokio and async-std, it must only capture Send data. This is required in order to support work-stealing, since that would require Sending the data for a task from one thread to another.

If you use async fn, you have a trait like this:

trait DoAsyncThing {
    async fn do_async_thing(&mut self) -> bool;
}

Which is desugared to

trait DoAsyncThing {
    fn do_async_thing(&mut self) -> impl Future<Output = bool>;
}

Looking only at the trait, it is not possible to verify that do_async_thing returns a Future that is also Send. Which means code like this fails (see Playground):

async fn spawn_async_thing(mut thing: impl DoAsyncThing + Send + 'static)
{
    tokio::task::spawn(async move {
        if !thing.do_async_thing().await {
            println!("nope!")
        }
    });
}

It fails because tokio::spawn requires that thing.do_async_thing() return a value that is Future + Send, but the trait does not guarantee that.

error: future cannot be sent between threads safely
   --> src/lib.rs:15:5
    |
15  | /     tokio::task::spawn(async move {
16  | |         if !thing.do_async_thing().await {
17  | |             println!("nope!")
18  | |         }
19  | |     });
    | |______^ future created by async block is not `Send`

However, I think we can move forward without being concerned about that for now, given a few mitigating circumstances:

  1. This restriction only applies when using the trait through a generic parameter. For example, if there was a type MyAsyncThing that implemented AsyncThing and it's do_async_thing() was Send, then this code would work fine (see Playground). The compiler can see the concrete type here and as long as <MyAsyncThing as DoAsyncThing>::do_async_thing()'s implementation only captures Send data, it will return a future that implements Send and all will be well:
async fn spawn_async_thing_2(mut thing: MyAsyncThing)
{
    tokio::task::spawn(async move {
        if !thing.do_async_thing().await {
            println!("nope!")
        }
    });
}
  1. Rust is working on stablizing a syntax to allow consumers of a trait to require additional bounds on method return types of that trait. This would allow someone to require impl AsyncThing<do_async_thing(): Send>, which would be usable in tokio::spawn. Once this feature lands in stable rust, users are no longer restricted by our lack of Send bound on the impl Future returned by an async fn. In fact, this code works today in Nightly Rust (see Playground):
#![feature(return_type_notation)]

async fn spawn_async_thing_3(mut thing: impl DoAsyncThing<do_async_thing(..): Send> + Send + 'static) {
    tokio::task::spawn(async move {
        if !thing.do_async_thing().await {
            println!("nope!")
        }
    });
}
  1. Finally, we have a third option, which is that if users have this concern, we can make an "extension" trait named something like SendDoAsyncThing that adds the additional bounds, optionally using a helper proc-macro like trait_variant. Anyone who needs to spawn a task that uses a generic impl type can refer to this trait. Concrete types that impl DoAsyncThing and return Futures that are Send would automatically also impl SendDoAsyncThing. This isn't a pretty solution, but as an as-needed stopgap until the return type notation feature arrives, it seems reasonable.

On top of all those mitigations, I think that requiring a Send bound on our async fns in traits would be overly restrictive. It imposes a requirement on implementations based on the needs of some consumers, which I generally prefer to avoid. Single-threaded or non-work-stealing async runtimes may not require Sendable futures and imposing this requirement would be unnecessary.

Having said all that, I also concede that the use cases for non-Sendable futures here is fairly minimal. We expect the vast majority of our users to be using runtimes like tokio and async_std. Having explained the trade-offs and mitigations from my own perspective, I'm also alright if we look at this data and decide that it's simpler to just impose the Send bound on all our async fn return types and be done with it.

So that's just my $0.02 on the topic. For now, in the Cosmos crate, I've gone ahead and used async fn and #[expect]ed the lint, to make clear that we understand the ramifications.

heaths commented 1 month ago

and #[expect]ed the lint

Won't that require us to use the 2024 edition instead of 2021 like we do now? We'd have to discuss if we want to jump that far ahead. We generally don't use the latest-and-greatest. Heck, even Go barely made the case for moving to 1.18 to use generics but has no plans as of yet to use newer (despite some interesting use cases for range-over-funcs).

heaths commented 1 month ago

And before we get too far into the weeds to make everything object-safe - which I'm not opposed to - we may want to reconsider how to support mocking. I'm open to other suggestions and, frankly, would prefer not to have the clients implement a trait full of client methods. It's fragile anyway and can lead to breaking changes for devs who implement the trait for mocking.

I considered making a MockTransport that is easier to use to return just the models (or Err(E)) but the setup for that isn't obvious.

analogrelay commented 1 month ago

Won't that require us to use the 2024 edition instead of 2021 like we do now?

No, it doesn't require edition 2024, but it does require rustc 1.81 and our current MSRV is 1.76. Updating to 1.81 would also enable the reason attribute, so we could write: #[expect(async_fn_in_trait, reason = "... probably a reference to this issue for context ...")]. For now, I'm using #[allow] and a // REASON: comment pattern so we can go back later and change them easily.

analogrelay commented 1 month ago

And before we get too far into the weeds to make everything object-safe - which I'm not opposed to - we may want to reconsider how to support mocking.

💯 I'd love to be able to get rid of the trait altogether. I'm not really worried about object-safety though, if we do need to keep using traits. The main question here is if we are using traits, do we need to constraint async fn's with a Send bound on their return type, and I think the answer there is no for now. We can always add a separate trait that adds Send, but we can't really take a Send bound off.

heaths commented 1 month ago

Won't that require us to use the 2024 edition instead of 2021 like we do now?

No, it doesn't require edition 2024, but it does require rustc 1.81 and our current MSRV is 1.76.

I don't think we're ready to make that jump just yet. It just came out, and even Go has an N-2 policy for language versions (~6 months) which move even slower than Rust's (~every month). At least, I don't think #[expect] is worth the bump. I chose 1.76 at the time because it was old enough and added support for async trait functions.

analogrelay commented 1 month ago

Absolutely agreed. It's good to know that we don't need to move editions though, so when we feel reasonable bumping MSRV to 1.81 we can use it fairly safely.