Closed: SabrinaJewson closed this 3 weeks ago
I apologize for not reading through #7 – I understand that spawning a task isn’t always desirable, because it can cause the quitting function not to finish.
Edit: I was mistaken – because dropping a runtime immediately shuts down I/O resources, the future you’re blocking on will just cancel.
My remaining suggestion is to use `block_in_place` if the runtime is multi-threaded (which can be checked via `runtime::Handle::runtime_flavor`), and simply leak otherwise. I think that in general, a leak is preferable to a deadlock.
no, I do not in fact use any workarounds; I wait in place, and I block in place if possible. I understand the confusion you may have had when seeing `block_on`, but it's an internal API and I also use it for tests, which is why I named it that 🙃. Also, the ONLY way this can cause deadlocks is if the HTTP client relies on a task spawned on the main runtime, which the default doesn't. But I should mention that in the docs; you do make a point.
from `/src/support.rs`:

```rust
use std::convert::Infallible;
use std::future::Future;
use std::panic::AssertUnwindSafe;
use std::sync::LazyLock;
use std::thread;

/// Helper to run the specified future and block the current thread waiting for the result.
/// Works even while in a tokio runtime.
pub fn block_on<F>(future: F) -> F::Output
where
    F: Future + Send,
    F::Output: Send,
{
    fn no_unwind<T>(f: impl FnOnce() -> T) -> T {
        let res = std::panic::catch_unwind(AssertUnwindSafe(f));
        res.unwrap_or_else(|_| {
            struct Abort;
            impl Drop for Abort {
                fn drop(&mut self) {
                    eprintln!("unrecoverable error reached, aborting...");
                    std::process::abort()
                }
            }
            let _abort_on_unwind = Abort;
            unreachable!("thirtyfour global runtime panicked")
        })
    }

    static GLOBAL_RT: LazyLock<tokio::runtime::Handle> = LazyLock::new(|| {
        no_unwind(|| {
            let rt = tokio::runtime::Builder::new_current_thread().enable_all().build().unwrap();
            let handle = rt.handle().clone();
            // drive the runtime
            // we do this so that all calls to GLOBAL_RT.block_on() work
            thread::spawn(move || -> ! {
                async fn forever() -> ! {
                    match std::future::pending::<Infallible>().await {}
                }
                no_unwind(move || rt.block_on(forever()))
            });
            handle
        })
    });

    macro_rules! block_global {
        ($future:expr) => {
            thread::scope(|scope| match scope.spawn(|| GLOBAL_RT.block_on($future)).join() {
                Ok(res) => res,
                Err(panic) => std::panic::resume_unwind(panic),
            })
        };
    }

    cfg_if::cfg_if! {
        if #[cfg(feature = "tokio-multi-threaded")] {
            use tokio::runtime::RuntimeFlavor;
            match tokio::runtime::Handle::try_current() {
                Ok(handle) if handle.runtime_flavor() == RuntimeFlavor::MultiThread => {
                    tokio::task::block_in_place(|| handle.block_on(future))
                }
                _ => block_global!(future),
            }
        } else {
            block_global!(future)
        }
    }
}
```
I see, thanks for the clarification. But that is still blocking in `Drop` – if you're goïng to be running a separate runtime anyway, maybe you could have a `spawn_blocking` task that runs the code in the global Tokio runtime? That way, the main runtime will be unable to exit until that blocking task completes, but since the global runtime won't have shut down yet, I/O resources are still able to be used.

Though on a side note, I'm rather surprised if that actually works – I don't believe that in general I/O resources from one `tokio` runtime can be used from another one like that. Doesn't it panic?
> maybe you could have a `spawn_blocking` task

so `block_on` has the signature

```rust
pub fn block_on<F: Future>(&self, future: F) -> F::Output
```

while `spawn_blocking` has the signature

```rust
pub fn spawn_blocking<F, R>(f: F) -> JoinHandle<R>
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
```

note that `JoinHandle` has no way of blocking to wait for it synchronously; you can only await it. That being said, you can still work around this fact using a oneshot std channel, but the biggest deal is that it requires `F` to be `'static`.
> Though on a side note, I'm rather surprised if that actually works – I don't believe that in general I/O resources from one tokio runtime can be used from another one like that. Doesn't it panic?

this is kinda complicated and gets into internals, but the TL;DR is that it works: when you poll the future (after the transfer), since you didn't poll it before on a different runtime, the `Waker` that gets registered is the global runtime's, and the future just uses the global runtime's resources. no panic here
> note that JoinHandle has no way of blocking to wait for it synchronously; you can only await it

The point is to throw away the join handle, because blocking in `Drop` in a runtime without `block_in_place` is a bug. The use of the `spawn_blocking` task ensures the runtime does not actually exit until the closure is complete.
> the ONLY way this can cause deadlocks is if the HTTP client relies on a spawned task on the main thread which the default doesnt

Is that true? The I/O resources used by the HTTP cliënt are still bound to the runtime, and if the runtime is blocked it can't process events given by epoll, hence the HTTP connection cannot be driven forward. I was able to produce deadlocks with this code:

```rust
use thirtyfour::FirefoxCapabilities;
use thirtyfour::WebDriver;

#[tokio::main(flavor = "current_thread")]
async fn main() {
    WebDriver::new("http://localhost:4444", FirefoxCapabilities::new())
        .await
        .unwrap();
}
```
> this is kinda complicated and gets into internals but the TLDR is it works, when you poll the future (after transfer) and since you didn't poll it before using a different runtime the `Waker` that gets registered is the global runtime's and the future just uses the global runtimes resources no panic here
That’s not what I was asking. But I did some testing, and the way it works is that while Tokio I/O resources cannot dynamically change the runtime they use internally, it's perfectly okay to use the I/O resources bound to runtime A inside runtime B, so long as runtime A is not shut down.

Which is essentially to say: if this strategy works, you're blocking the thread in an async context, which is wrong. That said, I've learnt something new about the situation, so let me update my suggestions:

First of all, in the multi-threaded case, `block_in_place` is fine.

In the single-threaded case, blocking at all in `Drop` is a bug. What can be done instead is to start a `spawn_blocking` task in the `Drop` implementation (and not wait for it to complete). This ensures that the outer runtime will not quit until the closure passed into `spawn_blocking` completes. However, the I/O resources may have shut down in this time; so what we need to do is creäte a new runtime inside the `spawn_blocking` call and creäte a new `reqwest::Client` in this code path.

The purpose of making a wholly new `reqwest::Client` is to ensure its I/O resources are bound to the new runtime, not the old runtime (that might be shutting down).
> That's not what I was asking. But I did some testing and the way it works is that while tokio I/O resources cannot dynamically change the runtime they use internally, it's perfectly okay to use the I/O resources bound to runtime A inside runtime B, so long as runtime A is not shut down.

Yeah, and again, if nothing is bound yet it works out; as I said, it's complicated and really gets down to how the runtimes work.
> Which is essentially to say: if this strategy works, you're blocking the thread in an async context, which is wrong.

> In the single-threaded case, blocking at all in Drop is a bug.

this is just wrong, you can block any async executor any time any day in any case, because you do that when you do arithmetic, wait on a synchronous mutex, or do any operation that isn't the future returning `Poll::Pending`. it's just that in most cases you shouldn't block for a long time, like when you do fs access or network calls, because it impacts performance a lot. but this is just a web driver automation crate; most users wouldn't care if they blocked a little bit when using a web driver
> What can be done instead is to start a spawn_blocking task in the Drop implementation (and not wait for it to complete).

this code runs fine and finishes, it doesn't stop forever. why? because the blocking task has not been scheduled yet:

```rust
#[tokio::main]
async fn main() {
    tokio::task::spawn_blocking(move || {
        std::thread::park();
    });
}
```

and it's really common to be dropping the WebDriver at the end of your program, so it's hard to guarantee it got scheduled.

and guess what, doing something like this still blocks the runtime for an unknown amount of time, as we don't know when the blocking pool is going to be free:

```rust
#[tokio::main]
async fn main() {
    let (tx, rx) = std::sync::mpsc::sync_channel(0);
    tokio::task::spawn_blocking(move || {
        let _ = tx.send(());
        /* rest of the code */
    });
    rx.recv().expect("could not spawn task");
}
```
And this doesn't even address the fact that the future has to be `'static`.

in short, `AsyncDrop` is complicated, really, and just blocking on works perfectly fine. I don't expect people to be making LOTS of these, and I really don't expect a lot of people to not be closing them if they are making lots of web drivers. so as long as the HttpClient doesn't await something running on the main runtime (things spawned using `LocalSet`, or `join!` in a multi-threaded runtime, which is a very uncommon thing to do in an http client, and isn't done by the default one), it works.

the only other option is to duplicate a lot of code and use the ureq crate for http for a truly sync drop, or use a janky solution that requires 2 Arc indirections and a lot of refactoring (no, you cannot use the `Weak` returned by `Arc::new_cyclic`; while the object is being dropped, `upgrade` will return `None`), that I guess usually blocks for a tiny bit less. but for what sake?? we change a lot of code and it's impacted negatively
> Is that true? The I/O resources used by the HTTP cliënt are still bound to the runtime, and if the runtime is blocked it can't process events given by epoll, hence the HTTP connection cannot be driven forward. I was able to produce deadlocks with this code:
>
> ```rust
> use thirtyfour::FirefoxCapabilities;
> use thirtyfour::WebDriver;
>
> #[tokio::main(flavor = "current_thread")]
> async fn main() {
>     WebDriver::new("http://localhost:4444", FirefoxCapabilities::new())
>         .await
>         .unwrap();
> }
> ```

okay sheesh, this is embarrassing, my bad bro 🫠 I didn't know it would block; this doesn't work, yeah, never mind. I have genuinely no idea how this wasn't caught in CI; tests don't use quit most of the time
Related to #19. Currently, this library runs `block_on` in `Drop`. This is incorrect and a bug, as it can cause deadlocks.

Instead, the library should attempt to spawn a task so that `quit()` can run in the background. At least for `WebDriver` itself, the `SessionHandle` is in an `Arc` so this is possible.

But even better than that would be to refactor `SessionHandle` to always contain an internal `Arc`: store a `SessionHandle` instead of an `Arc<SessionHandle>` in `WebDriver`. In `impl Drop for SessionHandle`, it can spawn a task and not `block_on`.

I see you're using `futures::executor::block_on` to block in `Drop`, and this is presumably to work around the panic you get in Tokio. But the panic exists for a reason, because it's an indicator of behaviour that is always incorrect. Working around it with `futures::executor::block_on` is like using `unsafe` to get around the limitation that borrowck won't allow you to make two `&mut`s to the same place. Granted, here it's not insta-UB, but it's always incorrect.