rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.34k stars 12.59k forks source link

async lambda fails to copy a value that is Copy #127019

Open ifsheldon opened 3 months ago

ifsheldon commented 3 months ago

I tried this code:


async fn sum(idx: usize, num: u8) {
    let sum = idx + num as usize;
    println!("Sum = {}", sum);
}

async fn test() {
    let results = futures::future::join_all(
        (0..10).into_iter()
            .enumerate()
            .map(|(idx, num)| {
                async {
                    sum(idx, num).await;
                }
            })
    ).await;
    dbg!(results);
}

I expected to see this get compiled, but it gave me

error[E0373]: async block may outlive the current function, but it borrows `num`, which is owned by the current function
   --> src/main.rs:146:17
    |
146 | /                 async {
147 | |                     sum(idx, num).await;
    | |                              --- `num` is borrowed here
148 | |                 }
    | |_________________^ may outlive borrowed value `num`
    |
note: async block is returned here
   --> src/main.rs:146:17
    |
146 | /                 async {
147 | |                     sum(idx, num).await;
148 | |                 }
    | |_________________^
help: to force the async block to take ownership of `num` (and any other referenced variables), use the `move` keyword
    |
146 |                 async move {
    |                       ++++

The suggestion is plausible here, but in my actual code, I cannot easily add move because it also moves other values that I can only borrow and thus breaks my code.

The error makes sense in someway but I think rustc can easily copy num for me since it's just plain old data that is Copy.

I also tried other simple numeric types but they didn't work either.

Interestingly, rustc didn't raise an error for idx even when num is also usize, so I think the error is not self-consistent.

When searching issues, #127012 seems very similar to the code here, but I don't know whether they correlate.

Meta

rustc --version --verbose:

rustc 1.79.0 (129f3b996 2024-06-10)
binary: rustc
commit-hash: 129f3b9964af4d4a709d1383930ade12dfe7c081
commit-date: 2024-06-10
host: aarch64-apple-darwin
release: 1.79.0
LLVM version: 18.1.7

Updates

Update 0629: A more complete version that is closer to my actual code and cannot simply be resolved by adding move is

use std::time::Duration;
use tokio::time::sleep;

async fn sum(idx: usize, num: u8) -> usize {
    // the code logic is a mock for actual computation and
    // extracted from the closure to inform rustc
    // that I need idx and num as values so that it should just copy them for me
    idx + num as usize
}

async fn logging_mock(result: usize, id: usize) {
    println!("Result={}, id = {}", result, id);
}

async fn mock_request(result: usize, request_string: String, request_configs: &str) -> Result<(), String> {
    // request fn consumes `request_string` and `result` and references `request_configs`
    dbg!("Requesting {} with {} and {}", result, request_string, request_configs);
    Ok(())
}

#[derive(Debug)]
struct Configs {
    request_template: String,
    super_large_config: String,
    some_other_data: String,
}

#[tokio::main]
async fn main() {
    let mut configs = Configs {
        request_template: "hello".to_string(),
        super_large_config: "world".to_string(),
        some_other_data: String::new(),
    };
    let results = futures::future::join_all(
        (0..10).into_iter()
            .enumerate()
            .map(|(idx, num)| {
                async {
                    let s = sum(idx, num).await;
                    // comment out the above line and simple mocks below make the code compiled
                    // let s = 1;
                    // let idx = 1;
                    let template = configs.request_template.clone();
                    mock_request(s, template, &configs.super_large_config).await.unwrap();
                    // non-emergent logging, prevents accidental DoS attack
                    sleep(Duration::from_millis(idx as u64)).await;
                    logging_mock(s, idx).await;
                }
            })
    ).await;
    configs.some_other_data.push_str("do something to configs");
    dbg!(configs);
    dbg!(results);
}
theemathas commented 3 months ago

You may move some things and reference some things by creating a variable that's a reference and then moving that. For example:

async fn sum(idx: usize, num: u8) {
    let sum = idx + num as usize;
    println!("Sum = {}", sum);
}

async fn test() {
    let other_value = 1;
    let other_value_ref = &other_value;
    let results = futures::future::join_all(
        (0..10).into_iter()
            .enumerate()
            .map(|(idx, num)| {
                async move {
                    sum(idx, num + other_value_ref).await;
                }
            })
    ).await;
    dbg!(results);
}
ifsheldon commented 3 months ago

@theemathas thanks for the reply. (No offense) I honestly think that reference in your code is a bit awkward, because we need to do extra work that is conceptually a no-op.

I actually just tested my example with nightly rustc. It gave a more self-consistent error by saying that both idx and num need to be moved.

error[E0373]: async block may outlive the current function, but it borrows `num`, which is owned by the current function
  --> src/main.rs:12:17
   |
12 | /                 async {
13 | |                     sum(idx, num).await;
   | |                              --- `num` is borrowed here
14 | |                 }
   | |_________________^ may outlive borrowed value `num`
   |
note: async block is returned here
  --> src/main.rs:12:17
   |
12 | /                 async {
13 | |                     sum(idx, num).await;
14 | |                 }
   | |_________________^
help: to force the async block to take ownership of `num` (and any other referenced variables), use the `move` keyword
   |
12 |                 async move {
   |                       ++++

error[E0373]: async block may outlive the current function, but it borrows `idx`, which is owned by the current function
  --> src/main.rs:12:17
   |
12 | /                 async {
13 | |                     sum(idx, num).await;
   | |                         --- `idx` is borrowed here
14 | |                 }
   | |_________________^ may outlive borrowed value `idx`
   |
note: async block is returned here
  --> src/main.rs:12:17
   |
12 | /                 async {
13 | |                     sum(idx, num).await;
14 | |                 }
   | |_________________^
help: to force the async block to take ownership of `idx` (and any other referenced variables), use the `move` keyword
   |
12 |                 async move {
   |                       ++++

For more information about this error, try `rustc --explain E0373`.

My rustc is

rustc --version --verbose
rustc 1.81.0-nightly (4bc39f028 2024-06-26)
binary: rustc
commit-hash: 4bc39f028d14c24b04dd17dc425432c6ec354536
commit-date: 2024-06-26
host: aarch64-apple-darwin
release: 1.81.0-nightly
LLVM version: 18.1.7

I also tried your code with this nightly rustc. It turns out the same compile error.

fmease commented 3 months ago

Compiles if changed to

|…| async { … }async |…| … (https://hackmd.io/@compiler-errors/async-closures).

ifsheldon commented 3 months ago

@fmease thanks!

.map(|(idx, num)| sum(idx, num))

This works for this example which I intentionally reduced a lot. My actual code is a bit complicated. And I think it does not generalize to all use cases of closures.

.map(async |(idx, num)| sum(idx, num).await) under experimental feature async_closure

|…| async { … } ≠ async |…| … (https://hackmd.io/@compiler-errors/async-closures).

Yes, the blog post covers a more generalized and complicated scenarios in which the data may not be a plain old number. But I still don't know why rustc can't just copy that plain old data for me. I think my intent expressed in the code should be clear to the compiler and I think it's totally safe and sound to do that.

fmease commented 3 months ago

@ifsheldon Right, makes sense. I can't answer your more general question right now about capturing copyables by move by default (I'm short on time). Have you tried |…| async move { … } in your actual project? That fixes the code you posted at least and it's stable.

ifsheldon commented 3 months ago

@fmease

Have you tried |…| async move { … }

OK, I should not have reduced my code too much as I thought other details were not relevant. A more complete yet reduced code look like this.

use std::time::Duration;
use tokio::time::sleep;

async fn sum(idx: usize, num: u8) -> usize {
    // the code logic is a mock for actual computation and
    // extracted from the closure to inform rustc
    // that I need idx and num as values so that it should just copy them for me
    idx + num as usize
}

async fn logging_mock(result: usize, id: usize) {
    println!("Result={}, id = {}", result, id);
}

async fn mock_request(result: usize, request_string: String, request_configs: &str) -> Result<(), String> {
    // request fn consumes `request_string` and `result` and references `request_configs` that is super large and I would rather not clone
    dbg!("Requesting {} with {} and {}", result, request_string, request_configs);
    Ok(())
}

#[derive(Debug)]
struct Configs {
    request_template: String,
    super_large_config: String,
    some_other_data: String,
}

#[tokio::main]
async fn main() {
    let mut configs = Configs {
        request_template: "hello".to_string(),
        super_large_config: "world".to_string(),
        some_other_data: String::new(),
    };
    let results = futures::future::join_all(
        (0..10).into_iter()
            .enumerate()
            .map(|(idx, num)| {
                async {
                    let s = sum(idx, num).await;
                    // comment out the above line and simple mocks below make the code compiled
                    // let s = 1;
                    // let idx = 1;
                    let template = configs.request_template.clone();
                    mock_request(s, template, &configs.super_large_config).await.unwrap();
                    // non-emergent logging, prevents accidental DoS attack
                    sleep(Duration::from_millis(idx as u64)).await;
                    logging_mock(s, idx).await;
                }
            })
    ).await;
    configs.some_other_data.push_str("do something to configs");
    dbg!(configs);
    dbg!(results);
}

Yes I can workaround this with more code refactoring (and I did), but I think focusing solving this case in this code is a bit too restricted and shortsighted. This issue is not about how to workaround a problem but why this problem (rustc can't copy a Copyable for me when using async closures) happens.

compiler-errors commented 3 months ago

You should probably use async move {} like @fmease suggested, but you need to take a reference of &config manually to move it into the async block:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=a13f05679fdcdb977efd4ca8c630ed5f

ctrl+f the line // !!!

compiler-errors commented 3 months ago

Also ironically the code ICEs with async closures (async || { ... }). I'll fix that 🫡

ifsheldon commented 3 months ago

@compiler-errors Respect 🫡

but I've updated nightly just now. My code seems still not to pass rustc check.

My nightly is

rustc --version --verbose
rustc 1.81.0-nightly (6292b2af6 2024-07-02)
binary: rustc
commit-hash: 6292b2af620dbd771ebb687c3a93c69ba8f97268
commit-date: 2024-07-02
host: aarch64-apple-darwin
release: 1.81.0-nightly
LLVM version: 18.1.7

which should include #127244.

Yet the problem remains

error[E0373]: async block may outlive the current function, but it borrows `num`, which is owned by the current function
  --> src/main.rs:39:17
   |
39 |                 async {
   |                 ^^^^^ may outlive borrowed value `num`
40 |                     let s = sum(idx, num).await;
   |                                      --- `num` is borrowed here
   |
note: async block is returned here
  --> src/main.rs:39:17
   |
39 | /                 async {
40 | |                     let s = sum(idx, num).await;
41 | |                     // comment out the above line and simple mocks below make the code compiled
42 | |                     // let s = 1;
...  |
48 | |                     logging_mock(s, idx).await;
49 | |                 }
   | |_________________^
help: to force the async block to take ownership of `num` (and any other referenced variables), use the `move` keyword
   |
39 |                 async move {
   |                       ++++

error[E0373]: async block may outlive the current function, but it borrows `idx`, which is owned by the current function
  --> src/main.rs:39:17
   |
39 |                 async {
   |                 ^^^^^ may outlive borrowed value `idx`
40 |                     let s = sum(idx, num).await;
   |                                 --- `idx` is borrowed here
   |
note: async block is returned here
  --> src/main.rs:39:17
   |
39 | /                 async {
40 | |                     let s = sum(idx, num).await;
41 | |                     // comment out the above line and simple mocks below make the code compiled
42 | |                     // let s = 1;
...  |
48 | |                     logging_mock(s, idx).await;
49 | |                 }
   | |_________________^
help: to force the async block to take ownership of `idx` (and any other referenced variables), use the `move` keyword
   |
39 |                 async move {
   |                       ++++

For more information about this error, try `rustc --explain E0373`.
error: could not compile `rustc_bug` (bin "rustc_bug") due to 2 previous errors
compiler-errors commented 3 months ago

Oh yeah, sorry,I should've probably not marked this issue as "fixed" since it's only fixed by using async closures: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=418d9b8f3162a865b57d725c43458ad8 (which still ICEs since it's on yesterday's nightly lol)

Notice that I've turned || async {} into async || {}.

The only thing I fixed with that PR was async closures ICEing.

compiler-errors commented 3 months ago

But I still don't know why rustc can't just copy that plain old data for me. I think my intent expressed in the code should be clear to the compiler and I think it's totally safe and sound to do that.

The compiler is necessarily conservative in what it chooses to capture by-ref or by-move. It turns out that unless you write move, the compiler will always capture a reference to any data that is Copy. This is a quirk, but I believe it was chosen to avoid moving large Copyable data into closures by value when they can be copied at the point in the closure that would've moved the data.

This is unfortunately an observable detail in the closure capture analysis algorithm, so I don't think it's possible to change. So you're just gonna have to write async move to force the compiler to do what you want it to (like I had demonstrated above). Sorry! Hopefully this will become easier to write when async closures land.

ifsheldon commented 3 months ago

I believe it was chosen to avoid moving large Copyable data into closures by value when they can be copied at the point in the closure that would've moved the data.

OK, fair enough, but does that means we need to wait for something like Claim to clear the concern of moving large Copyable data before actually fixing this?

Or, can we make a little exception for builtin numeric types like i*, u*, f*?