bluerobotics / BlueOS

The open source platform for ROV, USV, robotic system operation, development, and expansion.
https://blueos.cloud/docs/
Other
159 stars 77 forks source link

bug: ping service: serial devices can be registered without a valid baudrate #1190

Open ES-Alexander opened 2 years ago

ES-Alexander commented 2 years ago

detect_highest_baud initialises the "last_valid_baud" to 115200, even though it hasn't yet been tested. That means if all bauds below that fail, then even if it fails to connect at 115200 baud it will still log as valid, and be used to register a new (non-functioning) device. Probably it should be initialised to None, and raise an exception if the last_valid_baud is still None after all intended baudrates have been tested.

Issue was discovered thanks to the log in this forum comment, whereby it starts out connecting with

2022-10-17 17:37:07.869 | INFO     | portwatcher:probe_port:50 - Probing port: fe201600.serial
2022-10-17 17:37:07.870 | INFO     | pingprober:probe:17 - Probing /dev/ttyAMA1 - ttyAMA1
2022-10-17 17:37:08.669 | INFO     | pingprober:legacy_detect_ping1d:48 - Identified ping device:
2022-10-17 17:37:08.669 | INFO     | pingprober:legacy_detect_ping1d:49 - PING1D
ID: 0
FW: v3.26.0
port: /sys/devices/platform/soc/fe201600.serial
2022-10-17 17:37:08.670 | INFO     | pingmanager:launch_driver_instance:46 - Launching ping1d driver
2022-10-17 17:37:08.685 | DEBUG    | commonwealth.settings.manager:__init__:33 - Starting ping settings with SettingsV1, configuration path: None
2022-10-17 17:37:08.687 | DEBUG    | commonwealth.settings.manager:load:115 - Found possible candidates for settings source: [PosixPath('/root/.config/ping/settings-1.json')]
2022-10-17 17:37:08.688 | DEBUG    | commonwealth.settings.manager:load:117 - Checking /root/.config/ping/settings-1.json for settings
2022-10-17 17:37:08.689 | DEBUG    | commonwealth.settings.settings:load:53 - Loading settings from file: /root/.config/ping/settings-1.json
2022-10-17 17:37:08.690 | DEBUG    | commonwealth.settings.manager:load:120 - Using /root/.config/ping/settings-1.json as settings source

Side note: I'm not sure if v3.26.0 firmware is too old to work properly or something, but if that is the case then that should be accounted for in our connection logic and logging.

then tries to establish a baud rate (at which point it is seemingly no longer working)

2022-10-17 17:37:09.911 | DEBUG    | pingdriver:detect_highest_baud:39 - Trying baud 9600...
2022-10-17 17:37:10.234 | DEBUG    | pingdriver:detect_highest_baud:51 - Baudrate 9600 is invalid
...
2022-10-17 17:37:10.876 | DEBUG    | pingdriver:detect_highest_baud:39 - Trying baud 57600...
2022-10-17 17:37:11.204 | DEBUG    | pingdriver:detect_highest_baud:51 - Baudrate 57600 is invalid
2022-10-17 17:37:11.206 | DEBUG    | pingdriver:detect_highest_baud:39 - Trying baud 115200...
2022-10-17 17:37:11.534 | DEBUG    | pingdriver:detect_highest_baud:51 - Baudrate 115200 is valid
2022-10-17 17:37:11.536 | DEBUG    | pingdriver:detect_highest_baud:39 - Trying baud 230400...
2022-10-17 17:37:11.896 | DEBUG    | pingdriver:detect_highest_baud:51 - Baudrate 230400 is invalid
...
2022-10-17 17:37:14.803 | DEBUG    | pingdriver:detect_highest_baud:39 - Trying baud 3000000...
2022-10-17 17:37:15.126 | DEBUG    | pingdriver:detect_highest_baud:51 - Baudrate 3000000 is invalid
2022-10-17 17:37:15.128 | INFO     | pingdriver:detect_highest_baud:52 - Highest baudrate detected: 115200

then proceeds to repeatedly

2022-10-17 17:37:16.478 | DEBUG    | ping1d_mavlink:send_ping1d_request:74 - requesting new data
2022-10-17 17:37:16.480 | INFO     | ping1d_mavlink:drive:121 - attempting reconnection...
2022-10-17 17:37:16.734 | DEBUG    | ping1d_mavlink:send_ping1d_request:74 - requesting new data
2022-10-17 17:37:16.841 | DEBUG    | ping1d_mavlink:send_ping1d_request:74 - requesting new data
2022-10-17 17:37:16.948 | DEBUG    | ping1d_mavlink:send_ping1d_request:74 - requesting new data
2022-10-17 17:37:17.055 | DEBUG    | ping1d_mavlink:send_ping1d_request:74 - requesting new data

with no send_distance_data logs in between.

For robustness, we should probably also keep track of repeated reconnect attempts with no data received in between, and possibly kick the device after a few attempts (or at least have some kind of logging + a frontend indication that the device is "there" but not responding as expected). Likely the simplest tracking approach would be to have a repeated_reconnect_counter = 0 before the loop, increment it every time a reconnect is attempted (and check for the threshold), and reset it to 0 every time a distance message is sent.

ES-Alexander commented 2 years ago

I've just remembered that prior to v3.28.0 firmwares were hardcoded to a particular baudrate, in which case legacy_ping1d_detect should probably also check 9600 baud, not just 115200. It's also possible the attempts to auto-detect maximum baudrate could be messing up the ability for the device to communicate properly when on older firmwares - they probably shouldn't be checked with other options.

Williangalvani commented 2 years ago

This is weird. the ping seemingly responds with the device ID, ~then responds to the 115200 baud. then we can't talk to it anymore...~ Nevermind, I see your point now. I agree we should start it with None.

Another possible point of failure would be that ardupilot may be trying to use SERIAL3 for something. SERIAL3_PROTOCOL needs to be set to 0