Async fn doubles argument size

4teap commented 5 years ago

Generator optimization in #60187 by @tmandry reused generator locals, but arguments are still duplicated whenever used across yield points.

For example (playground):

#![feature(async_await)]

async fn wait() {}

async fn test(arg: [u8; 8192]) {
    wait().await;
    drop(arg);
}

fn main() {
    println!("{}", std::mem::size_of_val(&test([0; 8192])));
}

Expected: 8200 Actual: 16392

When passing in futures, the future size can grow exponentially (playground):

#![feature(async_await)]

async fn test(_arg: [u8; 8192]) {}

async fn use_future(fut: impl std::future::Future<Output = ()>) {
    fut.await
}

fn main() {
    println!(
        "{}",
        std::mem::size_of_val(&use_future(use_future(use_future(use_future(use_future(
            use_future(use_future(use_future(use_future(use_future(test(
                [0; 8192]
            ))))))
        ))))))
    );
}

Expected: 8236 Actual: 8396796

I didn't find any note on this. But given how common arguments are used, I think it might be useful if they are included in the optimization.

tmandry commented 5 years ago

Implementation notes: In the compiler, these arguments get converted to generator upvars, which are then immediately moved into locals at the top of the generator MIR. The values are afterward used from these local vars.

We don't support overlapping upvar storage. @cramertj had a suggestion which makes sense to me, which is to make upvars part of the generator Unresumed variant. Then when we move them into locals in the first resume, we'll be able to overlap those locals with the upvar storage. As long as we hold invariant that every upvar is moved into a local at the top of the generator resume function, this should always work.

I know there's some code which makes assumptions about where upvars live in closures and generators. I'm not sure how much will have to change if we do this.

cc @eddyb @mw @Zoxc

eddyb commented 5 years ago

@tmandry I think we can just add them as regular locals, and whether they will only be in Unresumed would depend on whether they are moved or not. Keep in mind you can do e.g. async fn async_drop<T>(_: T) {}.

nikomatsakis commented 5 years ago

Check-in from async-await wg meeting: marking as "deferred" as this does not block stabilization (though it'd still be great to fix!).

steveklabnik commented 4 years ago

Triage: first playpen now reports 16386, second reports 8390655

tmandry commented 4 years ago

In theory I think we can get away with not treating upvars specially, and remapping them to locals immediately, inside the generator transform.

We should remove prefix_tys and audit any uses of upvar_tys to make sure they don't make an assumption of where upvars are stored in the generator layout. In particular, the generator layout code will need to change so it doesn't handle upvars specially anymore (the upvar types should be included as field types in GeneratorLayout now).

There are probably some details I'm forgetting, but hopefully that's enough to get started.

pnkfelix commented 1 year ago

Okay, I have now prototyped a pair of simple MIR optimizations that, when applied prior to the generator layout pass in MIR, will deal with the examples shown on this issue. They are over in this branch here: https://github.com/pnkfelix/rust/tree/copy-prop-upvar-to-elim-local-for-62958

I will report back more tomorrow or Friday after I get a chance to run the optimizations through their paces against the Rust test suite. (And I will enlist @bryangarza 's help in coming up pathological tests.)

cjhopman commented 1 year ago

What surprised me about this is that these two produce differently sized futures:

async fn a<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> () {
    f.await;
    g.await;
}

fn b<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> impl Future<Output = ()> {
    async move {
        f.await;
        g.await;
    }
}

See: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=e2f68fb25d5bf33511a35b9c343efb28

pnkfelix commented 1 year ago

@cjhopman indeed, the example you give there shows one fundamental piece of the problem.

Its because this:

async fn a<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> () {
    f.await;
    g.await;
}

desugars to:

fn a_desugared<F: Future<Output=()>, G: Future<Output=()>>(f: F, g: G) -> impl Future<Output = ()> {
    async move {
        let f = f;
        let g = g;
        f.await;
        g.await;
    }
}

(and then those let <local> = <upvar>; get compiled in a very naive way that causes the layouts to blow up.)

Its not the only problem here, but its an easy one to address in the compiler.

rust-lang / rust

Async fn doubles argument size #62958