SICKAG / sick_safetyscanners

ROS driver for SICK safety laser scanners
https://www.sick.com/de/en/opto-electronic-protective-devices/safety-laser-scanners/c/g187225
Apache License 2.0
61 stars 59 forks source link

Handling of socket disconnection and failure to establish connection #122

Open smalik007 opened 1 year ago

smalik007 commented 1 year ago

Hi, The package does not support the socket connection failures, it simply just logs it and silently do nothing about it. It would be nice to have

  1. On initialization of the socket connection if it fails throw an exception from the package (with may be some retries) instead of just ROS_ERROR log.
  2. A way to monitor the connection is maintained (e.g if a TCP connection is disconnected), the package can go to indefinitely re-establishing the connection.
reinzor commented 1 year ago

IMO this is desired behavior. You can use the published diagnostics for monitoring the health of the driver.

smalik007 commented 1 year ago

@reinzor, Maybe I was not able to describe the problem correctly. The issue is if Scanners gets disconnected and reconnects back, the node becomes silent about it. It does not auto recover own its own. I get your point that we can use diagnostics for checking the health status, but my argument for it is that the node is not doing anything if the Lidar disconnect, It would be rather good if the node exit rather than being just idle. I can see there's been some discussion about this in one of the closed issue : https://github.com/SICKAG/sick_safetyscanners/pull/78, but it looks like it never made to the release because the implementation was a quick fix rather than proper handling as @puck-fzi mentions it can be handled properly with some error exceptions and multiple retries. Likewise, I agree with this approach and could spend some time on fixing that.

puck-fzi commented 1 year ago

Hi thanks for raising, as stated this is currently desired behavior. The branch with the quick fix is nothing which will be integrated into the main branch. This was just a quick way of exiting the node if the connection failed. If you are willing to spend time on this, the best scenario I reckon would be configurable amount of retries before the node shuts down with an error message.

smalik007 commented 1 year ago

Hi I want to start working on the issue and would like to discuss the approach before implementing and adding a PR here. The problem actually has two parts in it, One is TCP connection and second is the UDP connection. Currently, if on start of the node if Lidar is let say disconnected then the current behavior of the driver is to print TCP error as in the image below and wait indefinitely in sendTelegramAndListenForAnswer API.

Screenshot from 2023-05-24 11-18-16

We can fix this by simply changing the doConnect API and other send and receive APIs for TCP to returns booleans when they fail/print error of unsuccessful TCP connection, and using boolean return to execute further code snippet, thus making the node non-blocking on the above API and retry thedoConnect again instead.

For the second part of the problem where if we are using only UDP connection for data and no more TCP request after initialization we can put a watchdog monitoring on received packets on UDP handler, then If for a configurable amount of time no UDP packet has been received we change the node state to reconnection and publish reconnection state to a topic or existing driver health, where we try a TCP request with multiple or infinite retries, once we get a response after the Lidar is connected back, we re-establish the UDP connection part to start receiving the UDP data again. This will allow the node to self recover own its own instead of shutting down and restarting again.

Please feel free for the feedback in above approach so that I can start working on it.

smalik007 commented 1 year ago

PR submitted : https://github.com/SICKAG/sick_safetyscanners/pull/131

puck-fzi commented 1 year ago

Thanks for submitting the PR. I will check it and if everything seems to be working will merge it into the driver