async and sync in the same program

sunfishcode commented 3 years ago

I maintain some libraries that have sync and async users. The async users wrap calls in spawn_blocking, which works, but isn't ergonomic and may be suboptimal. maybe_async looks like it could potentially be (part of) an alternative approach.

If I read the docs right, maybe_async can support both sync and async users, but by using a cargo feature, it can't support both in the same program. If the library gets used in multiple places in the same program, and one of them enables is_sync, it would break the other places that need async.

It seems like it would be useful to have a macro that expands to both sync and async, using different names, so that they can coexist. For example,

#[maybe_async::both]
async fn foo() -> bool {
    true
}

might expand to:

#[cfg(feature = "async")]
async fn async_foo() -> bool {
    true
}
#[cfg(feature = "sync")]
fn sync_foo() -> bool {
    true
}

so that users can call whichever form they need by name (though I'm not attached to the particular naming convention here).

fMeow commented 3 years ago

Ruyt 1.51 introduce a new resolver in cargo, which will compile crate for each set of different feature gates on dependencies.

With this v2 resolver, this issue is automatically solved. Just add the following lines in your Cargo.toml:

[package]
resolver = "2"
# Or if you're using a workspace
[workspace]
resolver = "2"

Plus, it doesn't make sense to compile both async and sync in the same program. Why use a blocking code when you can use async version?

scouten commented 3 years ago

Building on what @sunfishcode wrote, I have a trait in one of my crates (sorry, source is not public at this time) that is used in another of my programs in both sync (local file system) and async (contacting a remote service) contexts. For the local file system part of my program, I'd really rather not take on the complexity of async.

Please reopen and reconsider.

fMeow commented 3 years ago

I think the new resolver can help when one of dependencies use a different feature(blocking or async). I haven't tested it yet, but I think it should work.

And regarding the case of providing async and blocking at the same time, I don't really get it. When the program is able to run async code, it is natural to stick with async implementation. And when async executor is not available, it's ok to use blocking version as a fallback.

I have actually tried writing a [new procedural macro] (https://github.com/fMeow/amphi) to duplicate the code turned the copied ones to blocking code on another module. But my implementation make it really hard to maintain and debug. It deprives compiler from the ability to tell us which lines goes wrong.

marioortizmanero commented 3 years ago

Unfortunately no. The recommendation in the documentation is wrong:

RECOMMENDATION: Enable resolver ver2 in your crate, which is introduced in Rust 1.51. If not, two crates in dependency with conflict version (one async and another blocking) can fail complilation.

Because the resolver v2 only fixes the conflicts in very specific cases (https://doc.rust-lang.org/cargo/reference/resolver.html#feature-resolver-version-2), none of which match this one:

Features for target-specific dependencies are not enabled if the target is not currently being built.
Features enabled on build-dependencies or proc-macros will not be unified when those same dependencies are used as a normal dependency.
Features enabled on dev-dependencies will not be unified when those same dependencies are used as a normal dependency, unless those dev-dependencies are currently being built.

I've also made a repo to try it myself: https://github.com/marioortizmanero/resolver-v2-conflict

We're trying to fix this at rspotify (https://github.com/ramsayleung/rspotify/pull/222) and we've ended up at the same conclusion. maybe_async should work with both sync and async.

marioortizmanero commented 3 years ago

Also, I think it might be best to expand with modules:

#[maybe_async::both]
async fn foo() -> bool {
    true
}

to this:

#[cfg(feature = "async")]
mod impl_async {
    pub async fn foo() -> bool {
        true
    }
}

#[cfg(feature = "sync")]
mod impl_sync {
    pub fn foo() -> bool {
        true
    }
}

So that it's just easier to use and more flexible overall. Though the only way I can think of to make it work without duplicate module names is something like:

maybe_async! { // should only be called once per namespace
    async fn a() {}
    async fn b() {}
}

Which would generate:

mod impl_sync {
    fn a() {}
    fn b() {}
}
mod impl_async {
    async fn a() {}
    async fn b() {}
}

At the cost of slightly changing the semantics of this macro. Not sure if it's worth it.

marioortizmanero commented 3 years ago

It should also be the default behavior because the current one is just wrong; no need to use maybe_async::both. I will try to see if I can implement this myself in https://github.com/marioortizmanero/maybe-async-rs (but I don't promise anything, I'll be leaving in a couple days)

marioortizmanero commented 3 years ago

After attempting to implement this I've come to the conclusion that it's far from trivial (I kinda saw it coming).

maybe_async should work for structs as well. Sometimes you can't just add _sync/_async to every single method in your struct for usability reasons. It might be wiser to just have ThingSync::method over Thing::method_sync at some point.

Two traits would be generated now, like TraitSync and TraitAsync, which means errors about duplicate trait methods available, because both of them expose the same function names. Try to compile this example yourself: https://github.com/marioortizmanero/maybe-async-rs/blob/main/examples/service_client.rs

error[E0034]: multiple applicable items in scope
--> examples/service_client.rs:55:15
 |
55 |         Self::post("http://correct_url4create", String::from("my_bucket")).await
 |               ^^^^ multiple `post` found
 |
note: candidate #1 is defined in an impl of the trait `InnerClientSync` for the type `ServiceClient`
--> examples/service_client.rs:16:11
 |
16 |     async fn post(url: Url, data: String) -> Response {
 |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: candidate #2 is defined in an impl of the trait `InnerClientAsync` for the type `ServiceClient`
--> examples/service_client.rs:16:5
 |
16 |     async fn post(url: Url, data: String) -> Response {
 |     ^^^^^^^^^^^^^
help: disambiguate the associated function for candidate #1
 |
55 |         InnerClientSync::post("http://correct_url4create", String::from("my_bucket")).await
 |         ^^^^^^^^^^^^^^^^^^^^^
help: disambiguate the associated function for candidate #2
 |
55 |         InnerClientAsync::post("http://correct_url4create", String::from("my_bucket")).await
 |         ^^^^^^^^^^^^^^^^^^^^^^

The only solution I can come up with is to have the macro replace Self, StructName and TraitName for TraitNameSync/TraitNameAsync. Same for self, but it would have to be treated differently and converted to TraitNameSync::method(self, ...), for example.

Thing is, even this isn't easy because self/Self doesn't necessarily mean it's calling a method from the trait. So the user would be forced to explicitly use TraitName::method(self, ...) instead of self.method(...). That way the macro can easily replace it with its sync/async variant. Problem: that looks terrible.

TLDR: processing functions in the macro works well because it's as easy as adding the suffix _async/_sync, but when it comes to traits it gets complicated because of conflicts.

So yeah, I'm stuck. The only way to do this may be appending _sync or _async.

marioortizmanero commented 3 years ago

@fMeow, can you add a warning about this library to the docs until this is worked on?

fMeow commented 3 years ago

Sure, I will update the doc in both readme and documentation in crate.io in the next minor release.

marioortizmanero commented 3 years ago

Update: we can actually do the duplicate traits thing if we also duplicate the struct that implements it. So instead of having a single struct that implements both the async and the sync traits, which is ambiguous, we have two structs, one for each trait.

I've got it working more or less in my fork, but I only modified the both macro (previously maybe_async) and it's not complete. It's enough to get these examples to compile:

This is how maybe_async would be used: https://github.com/marioortizmanero/maybe-async-rs/blob/main/examples/service_client.rs. It compiles with any combination of is_sync and is_async.
What it would expand to when the user enables both sync and async: https://github.com/marioortizmanero/maybe-async-rs/blob/main/examples/service_client_both.rs (You can see what's actually automatically generated by running cargo expand --example service_client --all-features > expanded.rs)

So basically, maybe_async::both will now work for:

Functions: appending _sync/_async
Structs and enums: appending Sync/Async
Traits: appending Sync/Async
Structs and enums impl blocks: appending Sync/Async
Trait impl blocks: appending Sync/Async to the trait name, to the self type (the implementor), or to both. This could be configured with parameters.

We could later add parameters to configure the suffix (suffix = "_Async"), or to configure the entire name (async_name = "Stream", sync_name = "Iter").

What do you think? Do you like it? Is it enough for everyone's use cases? I know it would for rspotify myself.

jorgecarleitao commented 2 years ago

Thanks a lot for this library. Quite cool idea!

Plus, it doesn't make sense to compile both async and sync in the same program. Why use a blocking code when you can use async version?

imo that is too broad: there are valid use-cases to offer both versions.

E.g. 1: a user of a library may need to consume data from two databases on which one offers a native async API while the other offers a sync version. E.g. the async is a remote db (e.g. postgres) while the sync is a local one (e.g. sqllite). The fs may be sufficiently fast for a Read to be ok.

E.g. 2: when loading data from large files that still fit in memory, it is almost always faster to download the file async to Vec<u8> and then parse it sync, than to read and parse it async. This is because many encoding that read byte by byte (such as zigzag) are faster in their sync versions.

Imo these decisions could be left to the users of this library.

jorgecarleitao commented 2 years ago

I think it would be fine to append _async and _sync: we can always keep them private and wrap the functions on public versions with the namings we want. The hard part is still valuable because of the DRY of the code inside the body; the API is just one-liners :P

marioortizmanero commented 2 years ago

This is super interesting and might actually fix our issue! https://blog.rust-lang.org/inside-rust/2022/07/27/keyword-generics.html

fMeow commented 2 years ago

I have a experimental inplementation to enable both async and sync, where I try to add a proc macro on a rust module to compile the underlying module to two modules, one for synchronous code and another for async. I have got an working implementation a long time ago, but I never publish it because error messages from rustc is quite messy. I wonder there is some new feature in rust that can tackle the messy error message issue. Might look into it this week.

Boscop commented 1 year ago

Is there an update on this? :) It would be really useful to have both versions in one crate..

janosimas commented 1 year ago

Going back to the other suggestion, why not use the is_sync/is_async module?

I think that would be more in the direction of the common pattern of having a blocking module. E.g. reqwest.

scouten-adobe commented 1 year ago

FWIW I wrote a crate that addresses exactly (and only) this problem: https://crates.io/crates/async-generic

janosimas commented 1 year ago

@scouten-adobe That looks really cool!

But it does not solve the problem of having a single implementation, right?

I mean, I think the ideal solution would be implement once and generate a sync and an async version. Also, your crate seems to be incompatible with maybe_async, is that right?

scouten commented 1 year ago

@janosimas the intent of async-generic is exactly what you stated: You write one function definition and the macro expands that into two functions: one sync and one async. I'm a little surprised to see that this intent wasn't communicated clearly and I'd welcome suggestions or edits to the readme that would make that intent more obvious.

There's no intentional incompatibility with maybe_async, but I'm attempting to solve a different problem.

scouten commented 1 year ago

(Note: @scouten-adobe and I are the same person, just writing from work or personal computers.)

janosimas commented 1 year ago

@scouten maybe I was not clear with what I meant.

Seems to me that to use async-generic, I have to "tag" async or sync parts of the code. From what I understood, you still have to write what should be sync or async.

The incompatibility is that if I mark a function with maybe_async, your crate is still restricted by the maybe_async version that was compiled. It cannot use maybe_async to generate the a/sync functions with async-generic.

I can open a discussion in async-generic if you prefer to continue the discussion there.

scouten commented 1 year ago

@janosimas I'd welcome any suggestions on how to address/resolve if you have any. (Maybe those should be filed as an issue over in async-generic so we don't spam this issue.)

janosimas commented 9 months ago

@scouten Sorry, I got busy and didn't follow up.

A colleague came with a solution using the duplicate and the maybe_async crate:

#[duplicate_item(
  module_type           maybe_async_attr   some_other_type;
  [ non_blocking ]     [ must_be_async ]    [ AsyncType ];
  [ blocking ]         [ must_be_sync ]     [ Blockingtype ];
)]
mod module_type {
    #[maybe_async_attr]
    pub async fn foo(input: some_other_type) -> ... {
    ...
    }
}

fMeow / maybe-async-rs

async and sync in the same program #6