Closed einarmo closed 1 month ago
It doesn't look to me like there is any mechanism to recover after a loss of connection at all? Presumably this happening hits here:
but there is no mechanism to communicate back to the client that the channel was reopened, from what I can tell, so the client would only be reactivated when using the SessionReconnectHandler
later on. Should there be a mechanism to let the channel inform the client that it must reactivate the session? The current reconnect mechanism in the channel implementation seems to be somewhere between useless and actively harmful.
I also added some Logs for the Client in the related #2544 that I created which is a duplicate of this one.
Hi @mregen, thank you for getting back to us on this! How do you see the way forward here – and would you have any estimates or updates?
If you could maybe point out some things where I could do some pre-work for you I'd like to try to help @mregen - currently I am on vacation but afterwards I'd happily try.
Hi @einarmo, @KircMax, @audunmidttunsystad, I was able to reproduce the issue with the sysinternals tcpview tool, close the connection and I see it happening. Thanks for reporting this special case!
I think currently of multiple ways to fix this:
i) when we go into reconnect and there is no valid service call in the beginning (e.g. CreateSession, ActivateSession, GetEndpoints etc.) fail the service calls which are currently queued with BadConnectionClosed
etc. Proceed only to TCP BeginConnect once a compliant service call is made.
ii) when we go into reconnect and there is no valid service call in the beginning (e.g. CreateSession, ActivateSession, GetEndpoints etc.) queue the service calls. If there is a valid service call, queue it in the beginning and start with TCP BeginConnect. If the connect attempt times out, all queued servcie calls time out.
I think the latter option might be smoother if there is really just a small network interruption. Problem in both cases is that the reconnect handler must trigger a reconnect to get the Activate message sent and queued before the TCP BeginConnect starts.
So better were the reconnect handler is immediately triggered when the socket closes, it may also need some additional wiring. Looking into the fix now...
Hi @mregen Thanks a lot for tackling this issue - it will be really helpful for short network interruptions Or - to give you another direct use case - when e.g. a failover of a redundant system takes place ;).
Both ways seem like valid fixes to me - From my side I'd leave this up to you to decide, you clearly have more experience and knowledge of the codebase. Thanks again.
Type of issue
Current Behavior
If the TCP connection is unexpectedly killed, the client will open a new secure channel with
SecureChannelId
0, but it will not follow that up withActivateSession
. From what I can tell this is in violation of the standard.What will eventually happen is that the new conection will fail, and the client will open another new secure channel where it will attempt to reactivate the session, this will only happen after keep alive timed out on the half-opened session.
Expected Behavior
The client should always reactivate sessions when creating a new secure channel before sending normal OPC UA requests on the new channel.
Steps To Reproduce
This is reproducible with a local server and a local client, including the sample client/server.
tcpkill
or a similar tool to sendRST
packets on the open connection, using the outgoing port for the client as the target, so that you only kill one connection, and not any subsequent connections the client creates.Environment
Anything else?
Here's a screenshot from wireshark:
Note how it takes several seconds after opening a new channel before the client discovers that the channel is bad and recreates it.