stepfunc / rodbus

Rust implementation of Modbus with idiomatic bindings for C, C++, .NET, and Java
https://stepfunc.io/products/libraries/modbus/

disabling ClientChannel without waiting for pending requests #144

Open xlukem opened 2 days ago

xlukem commented 2 days ago

when communicating with one of our modbus server devices via a rodbus client, we noticed that this specific server device is unable to handle multiple modbus connections at once. on its own, this should not be an issue.

however, during an active connection via rodbus, rodbus does not seem to register a connection fault when the connection is interrupted due to a third device starting to communicate over modbus with the modbus server.

rodbus only reports response_timeouts, because modbus requests are no longer answered while the TCP connection is still active and the server still acknowledges each request with a TCP ACK.

we wanted to work around this by reconnecting the rodbus client, i.e. disabling and re-enabling the rodbus ClientChannel. however, this does not work: requests are still piled up in the rodbus queue, and it seems they all need to time out before the client channel actually gets disabled.

is there a way to either disable the ClientChannel directly or let all queued up requests fail at once?

also, unfortunately we are unable to change the behaviour of this specific modbus server device
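
To make the "let all queued up requests fail at once" request concrete, here is a minimal sketch of the idea. It does not use rodbus types: `RequestQueue`, `PendingRequest`, `RequestError::ChannelDisabled`, and `fail_all` are all hypothetical names invented for illustration, and the real channel task may be structured quite differently.

```rust
use std::collections::VecDeque;

/// Hypothetical error type for this sketch; not rodbus's actual error enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RequestError {
    ResponseTimeout,
    ChannelDisabled,
}

/// A queued request that carries its own completion callback.
pub struct PendingRequest {
    pub id: u32,
    pub on_complete: Box<dyn FnOnce(Result<(), RequestError>)>,
}

/// Hypothetical queue of requests that have not been sent yet.
#[derive(Default)]
pub struct RequestQueue {
    pending: VecDeque<PendingRequest>,
}

impl RequestQueue {
    /// Append a request to the back of the queue.
    pub fn push(&mut self, request: PendingRequest) {
        self.pending.push_back(request);
    }

    /// Number of requests still waiting in the queue.
    pub fn len(&self) -> usize {
        self.pending.len()
    }

    /// Complete every queued request with `error` immediately instead of
    /// letting each one run into its own response timeout.
    /// Returns how many requests were failed.
    pub fn fail_all(&mut self, error: RequestError) -> usize {
        let mut failed = 0;
        for request in self.pending.drain(..) {
            (request.on_complete)(Err(error));
            failed += 1;
        }
        failed
    }
}
```

With something like this in place, a disable request could call `fail_all(RequestError::ChannelDisabled)` before closing the socket, so the caller gets one prompt failure per request rather than a long series of timeouts.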

jadamcrain commented 2 days ago

@xlukem Does the server not send a TCP FIN or RST? It just stops answering requests but keeps the connection open?

I agree that Rodbus should be able to enable/disable in this situation. I have a good idea of how this should be implemented on the main task loop.

That said, I wish there was a good way to detect this condition and gracefully handle the poor behavior from this device without the user (you) having to monitor for this condition and initiate an enable / disable. One potential solution would be for the main task loop to implement this logic, i.e. have a "maximum number of request timeouts" parameter after which the current connection is dropped and a re-connection happens.
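
The "maximum number of request timeouts" idea could look roughly like the sketch below. This is only an illustration, not the planned rodbus implementation: `TimeoutPolicy`, `Outcome`, and `Action` are hypothetical names, and the sketch assumes that the limit applies to consecutive timeouts and that any successful response resets the count.

```rust
/// Result of a single request, reduced to what the policy needs to know.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Response,
    Timeout,
}

/// What the task loop should do after a request completes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    KeepConnection,
    Reconnect,
}

/// Hypothetical policy: reconnect after N consecutive timeouts.
pub struct TimeoutPolicy {
    max_consecutive_timeouts: u32,
    consecutive_timeouts: u32,
}

impl TimeoutPolicy {
    pub fn new(max_consecutive_timeouts: u32) -> Self {
        Self {
            max_consecutive_timeouts,
            consecutive_timeouts: 0,
        }
    }

    /// Record the outcome of one request and decide whether the
    /// current connection should be dropped and re-established.
    pub fn record(&mut self, outcome: Outcome) -> Action {
        match outcome {
            Outcome::Response => {
                // Any answer proves the peer is still serving this connection.
                self.consecutive_timeouts = 0;
                Action::KeepConnection
            }
            Outcome::Timeout => {
                self.consecutive_timeouts += 1;
                if self.consecutive_timeouts >= self.max_consecutive_timeouts {
                    // Start counting from zero on the new connection.
                    self.consecutive_timeouts = 0;
                    Action::Reconnect
                } else {
                    Action::KeepConnection
                }
            }
        }
    }
}
```

Counting consecutive rather than total timeouts keeps an occasional slow response from triggering a reconnect on an otherwise healthy connection.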

xlukem commented 2 days ago

there actually does seem to be a TCP RST frame. however, the destination port seems weird; i can't find this port anywhere else in the trace. but yes, this seems to be an issue with the modbus server we are dealing with; it is generally very poorly designed.

image

IP .204: modbus server
IP .1: rodbus
IP .44: third device interrupting the connection

> I agree that Rodbus should be able to enable/disable in this situation. I have a good idea of how this should be implemented on the main task loop.

awesome!

> One potential solution would be for the main task loop to implement this logic, i.e. have a "maximum number of request timeouts" parameter after which the current connection is dropped and a re-connection happens.

yes, that's what we currently try to do by keeping track of failed requests. having this implemented by the library would be a nice addition.

xlukem commented 2 days ago

another thing i have noticed is that rodbus does not report a connectivity problem when the ethernet connection is interrupted

image

here rodbus only reports an individual timeout for each message sent, until the request queue is empty or the modbus server is connected again (and sends an RST). (also, this is a different modbus server, one that works more reliably than the one at IP .204.)

would it be possible to implement a ClientChannel specific channel timeout?
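
One way to read "channel timeout" is a watchdog that declares the channel down once no response has arrived for a configured period, regardless of how many individual requests timed out. The sketch below shows only that bookkeeping. `ChannelWatchdog` and its methods are hypothetical and not part of rodbus; the current time is passed in explicitly so the logic can be checked without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical watchdog: the channel counts as dead once no response
/// has been seen for at least `timeout`.
pub struct ChannelWatchdog {
    timeout: Duration,
    last_response: Instant,
}

impl ChannelWatchdog {
    /// Start the watchdog; `now` is treated as the time of the last response.
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self {
            timeout,
            last_response: now,
        }
    }

    /// Call whenever any valid response arrives on the channel.
    pub fn on_response(&mut self, now: Instant) {
        self.last_response = now;
    }

    /// True if the silence since the last response has reached `timeout`.
    pub fn is_expired(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_response) >= self.timeout
    }
}
```

A task loop could check `is_expired` after each request timeout and, when it returns true, report a connection failure to the state listener and reconnect. This would also cover the pulled-cable case, where the operating system may not report the broken TCP connection for a long time.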

jadamcrain commented 1 day ago

Not sure I quite understand what you mean by "when the ethernet connection is interrupted". Are you pulling the ethernet cable in this scenario or is there a network failure of sorts? Does Rodbus detect that the connection is down via the channel state callbacks or does it think that the connection is there, but the remote device just isn't responding?

xlukem commented 23 hours ago

yes, i interrupted the connection by pulling the ethernet cable. in this scenario rodbus does not report any change of the ClientState via the PrintingClientStateListener, and the request callbacks only report a response_timeout.