Ryanf55 closed this issue 12 months ago
I'm currently facing an issue running the micro-ROS agent with SITL (UDP): the agent terminates with "bad_array_new_length", so no topics are published on ROS 2.
Restarting the micro-ROS agent does not fix this either, because reconnection is then not possible. What could be the reason for the following error?
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: Frame: QUAD/PLUS
[micro_ros_agent-1] [1694444766.334808] info | Root.cpp | create_client | create | client_key: 0xAAAABBBB, session_id: 0x81
[micro_ros_agent-1] [1694444766.335212] info | SessionManager.hpp | establish_session | session established | client_key: 0xAAAABBBB, address: 127.0.0.1:36817
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3]
[micro_ros_agent-1] terminate called after throwing an instance of 'std::bad_array_new_length'
[micro_ros_agent-1] what(): std::bad_array_new_length
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: ArduPilot Ready
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: AHRS: DCM active
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: DDS Client: Init Complete
[ERROR] [micro_ros_agent-1]: process has died [pid 90931, exit code -6, cmd '/home/vibsin/workspace/DroneSim/ros2_ardup_ws/install/micro_ros_agent/lib/micro_ros_agent/micro_ros_agent udp4 --middleware dds --port 2019 --refs /home/vibsin/workspace/DroneSim/ros2_ardup_ws/install/ardupilot_sitl/share/ardupilot_sitl/config/dds_xrce_profile.xml --ros-args -r node:=micro_ros_agent -r ns:=/'].
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: XRCE Client: Participant session request failure
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: DDS Client: Creation Requests failed
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: RC7: SaveWaypoint LOW
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] paramftp: bad count 1327 should be 1325
[mavproxy.py --out 127.0.0.1:14550 --out 127.0.0.1:14551 --master tcp:127.0.0.1:5760 --sitl 127.0.0.1:5501 --non-interactive -3] AP: ArduCopter V4.5.0-dev (768e2409)
@vibgyor-s I can't tell what might be causing this from the log you posted. Could you post all the steps required to replicate it (terminal commands and full log) and some details about the system you're running?
The PX4 uxrce_dds_client has some support for reconnecting to the micro-ROS agent if the connection is dropped. It makes use of the uxr ping functions declared in uxr/client/util/ping.h to monitor the connection status:
- uxr_ping_agent
- uxr_ping_agent_attempts
To implement similar behaviour in ArduPilot AP_DDS we need the following:
- Add status_ok and assign this to the result of the uxr_run_session_time in update.
- Add connected for the result of a ping test.
- Split the init function into init_transport and init_session.
- Update main_loop.
- Call uxr_delete_session_retries if the connection is dropped.
- Add an AP_DDS_Client destructor.
- Move uxr_init_session out of ddsSerialInit and ddsUdpInit as it must be called on reconnect.
Tracking in: https://github.com/ArduPilot/ardupilot/pull/25228
- uxr client library segfaults.
- uxr_init_session must be called when reconnecting, so move it out of AP_DDS_Serial and AP_DDS_UDP.
Figure: reconnection after micro-ROS agent is repeatedly restarted.
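For anyone following along, here is a rough sketch of how the items above could fit together as a loop. It is not the code from the PR: AgentLink, LoopStats and reconnect_loop are hypothetical names, and the callbacks only stand in for the uxr calls mentioned above (init_transport, a ping test, init_session, uxr_run_session_time, uxr_delete_session_retries). The max_updates bound exists only so the sketch terminates.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical stand-ins for the transport and session calls listed above.
struct AgentLink {
    std::function<bool()> init_transport;  // open serial or UDP transport once
    std::function<bool()> ping_agent;      // ping test against the agent
    std::function<bool()> init_session;    // must be repeated on every reconnect
    std::function<bool()> run_session;     // one update step, result is status_ok
    std::function<void()> delete_session;  // clean up after a dropped connection
};

struct LoopStats {
    uint32_t sessions_started{0};
    uint32_t sessions_deleted{0};
    bool gave_up{false};
};

// Outer loop: (re)create the session. Inner loop: update while connected.
LoopStats reconnect_loop(const AgentLink& link, uint32_t max_updates)
{
    LoopStats stats;
    if (!link.init_transport()) {
        stats.gave_up = true;
        return stats;
    }
    uint32_t updates = 0;
    while (updates < max_updates) {
        // No ping reply or no session means there is nothing to recover.
        if (!link.ping_agent() || !link.init_session()) {
            stats.gave_up = true;
            return stats;
        }
        stats.sessions_started++;

        bool connected = true;
        while (connected && updates < max_updates) {
            updates++;
            const bool status_ok = link.run_session();
            if (!status_ok) {
                // A failed update alone is not conclusive, so confirm with a ping.
                connected = link.ping_agent();
            }
        }
        if (!connected) {
            link.delete_session();
            stats.sessions_deleted++;
        }
    }
    return stats;
}
```

The point of the split is that init_transport runs once, while init_session sits inside the outer loop so it is called again after every drop.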
Hi @srmainwaring, I've merged your "pr_dds_reconnect" branch. I'm getting a weird issue where once I disconnect the DDS client, it will register the "disconnecting", but after a couple of seconds it will then "exit". After this exit I can't reconnect to the client without doing a power reset. Any help would be awesome, cheers.
Hi @KyleJewiss, thanks for testing the PR. The timeout after 10 seconds is intentional.
If a connection cannot be re-established within 10 s (10 ping attempts of 1000 ms each), the loop exits.
// check ping
const uint64_t ping_timeout_ms{1000};
const uint8_t ping_max_attempts{10};
if (!uxr_ping_agent_attempts(comm, ping_timeout_ms, ping_max_attempts)) {
    GCS_SEND_TEXT(MAV_SEVERITY_ERROR, "DDS Client: No ping response, exiting");
    return;
}
We need to implement fall-back behaviour in a future PR.
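To make the 10 s budget in the ping check above concrete, here is a small stand-alone model of a bounded ping. It does not call the uxr library: ping_with_attempts and PingOutcome are hypothetical names, and the model assumes a failed attempt costs the full per-attempt timeout while a successful one replies immediately.

```cpp
#include <cstdint>
#include <functional>

// Whether the agent answered, and the time spent on attempts that timed out.
struct PingOutcome {
    bool ok;
    uint64_t waited_ms;
};

// Hypothetical model of a ping with a bounded number of attempts: ping_once
// returns true on a reply and false when a single attempt times out.
PingOutcome ping_with_attempts(const std::function<bool()>& ping_once,
                               uint64_t timeout_ms, uint8_t max_attempts)
{
    PingOutcome outcome{false, 0};
    for (uint8_t attempt = 0; attempt < max_attempts; ++attempt) {
        if (ping_once()) {
            outcome.ok = true;
            return outcome;
        }
        // Assumption: a failed attempt uses up the whole per-attempt timeout.
        outcome.waited_ms += timeout_ms;
    }
    return outcome;
}
```

With ping_timeout_ms = 1000 and ping_max_attempts = 10, an agent that never answers makes this model wait 10 x 1000 ms = 10000 ms before reporting failure, which matches the window described above.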
Good to know. Thanks for the code and the reply @srmainwaring. Have a good one.
Btw - were you testing in SITL or hardware?
At the moment we can manage a reconnect of the client if the micro-ROS agent dies and is respawned (within 10s).
Unplugging and reconnecting a serial-to-USB adapter connecting a flight controller to a PC is not working. I have not tested a connection between an FCU and GPIO pins on a companion computer such as an RPi4.
We were testing on hardware, that makes sense. We can close the agent and reconnect in those 10 seconds but if we take longer, we need to unplug and plug back in.
The std::bad_array_new_length crash is related to this unresolved issue: https://github.com/micro-ROS/micro-ROS-Agent/issues/205
If the connection between the autopilot and companion computer is flaky or severed, or the micro-ROS agent restarts at runtime, the connection is not recovered.
The scope of this issue is to perform the following