locka99 / opcua

A client and server implementation of the OPC UA specification written in Rust
Mozilla Public License 2.0

When a connection times out, it stays up, in a zombie state #199

Closed lovasoa closed 2 years ago

lovasoa commented 2 years ago

On the server, simply opening a TCP connection and writing nothing to it produces the following log:

[2022-05-03T09:34:06.677Z INFO  opcua::server::server] Handling new connection PollEvented { io: Some(TcpStream { addr: 127.0.0.1:4840, peer: 127.0.0.1:47544, fd: 14 }) }
[2022-05-03T09:34:06.678Z INFO  opcua::server::comms::tcp_transport] Socket info:
      Linger - No linger,
      TTL - 64
[2022-05-03T09:34:06.678Z INFO  opcua::server::comms::tcp_transport] Session started 2022-05-03 09:34:06.678170550 UTC
[2022-05-03T09:34:12.179Z INFO  opcua::server::comms::tcp_transport] Session has been waiting for a hello for more than the timeout period and will now close
[2022-05-03T09:34:12.179Z INFO  opcua::server::comms::tcp_transport] Hello timeout is finished
[2022-05-03T09:34:12.180Z ERROR opcua::server::comms::tcp_transport] Write bytes task is in error
[2022-05-03T09:34:12.280Z INFO  opcua::server::comms::tcp_transport] subscriptions_task loop connection finished
[2022-05-03T09:34:12.280Z INFO  opcua::server::comms::tcp_transport] Subscription monitor is finished
[2022-05-03T09:34:12.280Z INFO  opcua::server::comms::tcp_transport] subscriptions_task loop connection finished
[2022-05-03T09:34:12.280Z INFO  opcua::server::comms::tcp_transport] Subscription receiver is finished
[2022-05-03T09:34:12.679Z INFO  opcua::server::comms::tcp_transport] Finished monitor task is finished

The async read task keeps running, and the TCP connection stays open, even though it is in a non-recoverable broken state on the server.

Looking at the server code, the tasks tied to a connection appear to rely on a complex combination of message passing and boolean flags to know when to stop, and that does not seem to work in this case. Would you be open to a PR that simplifies this and guarantees that the tasks linked to a connection are always either all running or all stopped?
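
To illustrate the "all running or all stopped" guarantee, here is a minimal sketch. It is not the crate's code: it uses plain `std` threads instead of the server's async tasks so that it stays self-contained, and the names `Shutdown`, `StopOnExit`, `spawn_task` and `run_connection` are hypothetical. Every task of a connection shares one stop signal, and a drop guard triggers that signal when any task exits, whether it returns normally or panics.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stop signal shared by every task of one connection.
#[derive(Clone)]
struct Shutdown(Arc<AtomicBool>);

impl Shutdown {
    fn new() -> Self {
        Shutdown(Arc::new(AtomicBool::new(false)))
    }
    fn trigger(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    fn is_triggered(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Guard that triggers the signal when dropped, so a task that
/// returns or panics always tells its sibling tasks to stop.
struct StopOnExit(Shutdown);

impl Drop for StopOnExit {
    fn drop(&mut self) {
        self.0.trigger();
    }
}

/// Spawns one connection task whose exit stops all the others.
fn spawn_task<F>(shutdown: &Shutdown, body: F) -> JoinHandle<()>
where
    F: FnOnce(&Shutdown) + Send + 'static,
{
    let shutdown = shutdown.clone();
    thread::spawn(move || {
        let _guard = StopOnExit(shutdown.clone());
        body(&shutdown);
    })
}

/// Runs two stand-in tasks and returns true once both have stopped.
fn run_connection(hello_timeout: Duration) -> bool {
    let shutdown = Shutdown::new();

    // Stand-in for the hello timeout: it simply ends after the timeout.
    let timeout_task = spawn_task(&shutdown, move |_| thread::sleep(hello_timeout));

    // Stand-in for the read task: it polls the signal instead of
    // waiting forever on a socket that will never receive data.
    let read_task = spawn_task(&shutdown, |s| {
        while !s.is_triggered() {
            thread::sleep(Duration::from_millis(5));
        }
    });

    let timeout_ok = timeout_task.join().is_ok();
    let read_ok = read_task.join().is_ok();
    timeout_ok && read_ok && shutdown.is_triggered()
}

fn main() {
    let all_stopped = run_connection(Duration::from_millis(50));
    println!("all tasks stopped: {all_stopped}");
}
```

With this shape, the end of the hello timeout is enough to stop the read task, and no task has to know which of its siblings failed. In an async server, the same idea would presumably be expressed with the runtime's own cancellation primitives rather than a polled flag.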