tokio-rs / tokio

A runtime for writing reliable asynchronous applications with Rust. Provides I/O, networking, scheduling, timers, ...
https://tokio.rs
MIT License
27.23k stars 2.5k forks source link

external executor block_on and yield defer do not work together #6996

Closed quininer closed 18 hours ago

quininer commented 4 days ago

Version

└── tokio v1.41.1
    └── tokio-macros v2.4.0 (proc-macro)

Platform

Linux hay 6.11.8-arch1-2 #1 SMP PREEMPT_DYNAMIC Fri, 15 Nov 2024 15:35:07 +0000 x86_64 GNU/Linux

Description

when yield_now().await is called in an external executor block_on, the wakeup maybe lost due to enter defer logic.

I tried this code:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c12f5f957e5a1d33c4184d52b07e40b5

use std::sync::LazyLock;
use tokio::runtime;

const GLOBAL: LazyLock<runtime::Runtime> = LazyLock::new(|| {
    runtime::Builder::new_multi_thread().build().unwrap()
});

async fn task() {
    println!("start");
    tokio::task::yield_now().await;
    println!("done");
}

fn yield_no_back() {
    let j = GLOBAL.spawn(async {
        futures_executor::block_on(task());
    });
    GLOBAL.block_on(j).unwrap();
}

fn main() {
    yield_no_back();
}

I also tried block_on with block_in_place, but this still results in lost wakeups because block_in_place only exits runtime but not scheduler.

https://github.com/tokio-rs/tokio/blob/tokio-1.41.1/tokio/src/runtime/context/runtime_mt.rs#L29 https://github.com/tokio-rs/tokio/blob/tokio-1.41.1/tokio/src/runtime/context.rs#L184

I expected to see this happen: print start and done.

Instead, this happened: prints start and then hangs.

nurmohammed840 commented 4 days ago

That's unfortunate, tokio::task::yield_now() registers the waker to the defer list. I wonder if there is a way to clear the Tokio thread context for a specific scope.

This isn't a solution, but it might be a good idea to avoid using tokio::task::yield_now() within another executor (in scenarios where Tokio context exists) for now.

quininer commented 4 days ago

avoid using tokio::task::yield_now() within another executor (in scenarios where Tokio context exists) for now.

This is almost impossible and unreasonable because yield_now may appear in third-party dependencies.

make block_in_place clear the scheduler context is the simplest solution I can think of.

The real problem here is that context passed using thread_local does not reflect current executor. maybe access the tokio context through waker vtable and data is a better approach.

    #[track_caller]
    pub(crate) fn defer(waker: &Waker) {
        if ptr::eq(waker.vtable(), WAKER_VTABLE_REF) {
            unsafe {
                let ctx = waker.data().cast::<TokioWaker>().as_ref();
                ctx.scheduler.defer(waker);
            }
        } else {
            // Called from outside of the runtime, immediately wake the
            // task.
            waker.wake_by_ref();
        }
    }
nurmohammed840 commented 4 days ago

That might be a good solution, but it might take some time as .vtable() added recently (1.83)

Another approach could involve having Tokio remove the defer list from the caller's side. This might require some internal code changes, so I don’t think it’s something that would happen anytime soon.

Darksonn commented 3 days ago

Using an external executor's block_on within Tokio requires use of block_in_place.

That said, you say that it doesn't work even with block_in_place? If so, then that sounds like a bug.