Dropping the Client from within an async function panics

milgner commented 2 years ago

Using a pre-existing Tokio Runtime is already part of the README as a potential improvement - so it's probably not a surprise, but I nonetheless wanted to report this problem for completeness:

In our application we're dropping the OPC UA Client from within an async function. But since this will also drop the Tokio Runtime of the associated TCP Transport, it panics with the following error:

thread 'tokio-runtime-worker' panicked at 'Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.', tokio-1.12.0/src/runtime/blocking/shutdown.rs:51:21

For now this is not a blocker for us because the panic is handled gracefully but it would probably be a good idea to discuss alternatives. I haven't spent any time on concrete ideas but off-the-cuff, I see the following two approaches:

refactor all interfaces to async: it would probably be a lot of work (and require the async-trait crate, probably, too) and I'd understand if you'd like to avoid the associated complexity
allow library users to provide an async runner: it would basically just need to implement a function pub fn block_on<F: Future>(&self, future: F) -> F::Output just like the Tokio Runtime. This might even make it possible for users to use async_std::task::block_on if they don't want to use Tokio but async-std instead. Could be made backwards-compatible with a default constructor that creates a Tokio Runtime.

svanharmelen commented 2 years ago

Just FYI I'm actually working on a full async client. The first step for that is #146 but that still needs to be discussed to see if @locka99 is open to such a change.

If so, the rest of the work is quite easy and I suggest to (initially at least) add the async client next to the existing one and then over time try to converge them to one client that can serve all use cases.

locka99 commented 2 years ago

Is it not possible to wait for the client to be finished before dropping it?

I think the issue of async/sync needs to be looked at though. I've taken the view so far that a synchronous client API is more convenient for client side users who just want to connect and do stuff without the potential grief of asynchronous programming on top. But I'm open to ways of providing an async API and possibly a legacy compatible sync API that sits on top and does more or less what the existing API does right now.

I expect that would involve the sync thread and the async dispatcher sharing a channel and there being client api request / response structs being sent back and forth with some marshaling.

milgner commented 2 years ago

The problem is not that the client is still doing work but the fact that it holds its own Tokio runtime at all. Our application uses a global Tokio runtime to run all our code from within async contexts. Now, when stopping OPC UA client functionality, the Client will be Dropped, causing its Tokio Runtime to Drop, too, leading to the error in question.

As such, it would be great if there was a way to prevent additional instantiation of Tokio Runtimes as either part of the client or the TCP transport but instead have them either inject a reference to an existing runtime or have a fully async API with some sync layer around it it.

svanharmelen commented 2 years ago

The problem also exists without dropping the client when session.reconnect_and_activate() is called. That will drop and recreate the currently used TCP transport which also contains it's own runtime...

kate-goldenring commented 2 years ago

We ran into this issue with akri and were able to resolve it by ensuring the same runtime is used by wrapping the client creation in a spawn_blocking call rather than nesting the runtimes by accident. More details in case this helps others: https://github.com/project-akri/akri/pull/484

locka99 / opcua

Dropping the Client from within an async function panics #148