locka99 / opcua

A client and server implementation of the OPC UA specification written in Rust
Mozilla Public License 2.0
501 stars 131 forks source link

Dropping the Client from within an async function panics #148

Open milgner opened 2 years ago

milgner commented 2 years ago

Using a pre-existing Tokio Runtime is already part of the README as a potential improvement - so it's probably not a surprise, but I nonetheless wanted to report this problem for completeness:

In our application we're dropping the OPC UA Client from within an async function. But since this will also drop the Tokio Runtime of the associated TCP Transport, it panics with the following error:

thread 'tokio-runtime-worker' panicked at 'Cannot drop a runtime in a context where blocking is not allowed. This happens when a runtime is dropped from within an asynchronous context.', tokio-1.12.0/src/runtime/blocking/shutdown.rs:51:21

For now this is not a blocker for us because the panic is handled gracefully but it would probably be a good idea to discuss alternatives. I haven't spent any time on concrete ideas but off-the-cuff, I see the following two approaches:

svanharmelen commented 2 years ago

Just FYI I'm actually working on a full async client. The first step for that is #146 but that still needs to be discussed to see if @locka99 is open to such a change.

If so, the rest of the work is quite easy and I suggest to (initially at least) add the async client next to the existing one and then over time try to converge them to one client that can serve all use cases.

locka99 commented 2 years ago

Is it not possible to wait for the client to be finished before dropping it?

I think the issue of async/sync needs to be looked at though. I've taken the view so far that a synchronous client API is more convenient for client side users who just want to connect and do stuff without the potential grief of asynchronous programming on top. But I'm open to ways of providing an async API and possibly a legacy compatible sync API that sits on top and does more or less what the existing API does right now.

I expect that would involve the sync thread and the async dispatcher sharing a channel and there being client api request / response structs being sent back and forth with some marshaling.

milgner commented 2 years ago

The problem is not that the client is still doing work but the fact that it holds its own Tokio runtime at all. Our application uses a global Tokio runtime to run all our code from within async contexts. Now, when stopping OPC UA client functionality, the Client will be Dropped, causing its Tokio Runtime to Drop, too, leading to the error in question.

As such, it would be great if there was a way to prevent additional instantiation of Tokio Runtimes as either part of the client or the TCP transport but instead have them either inject a reference to an existing runtime or have a fully async API with some sync layer around it it.

svanharmelen commented 2 years ago

The problem also exists without dropping the client when session.reconnect_and_activate() is called. That will drop and recreate the currently used TCP transport which also contains it's own runtime...

kate-goldenring commented 2 years ago

We ran into this issue with akri and were able to resolve it by ensuring the same runtime is used by wrapping the client creation in a spawn_blocking call rather than nesting the runtimes by accident. More details in case this helps others: https://github.com/project-akri/akri/pull/484