PX4 / px4_ros_com

ROS2/ROS interface with PX4 through a Fast-RTPS bridge
http://px4.io
BSD 3-Clause "New" or "Revised" License

[Bug][fast-CDR] Deserialization of an (unidentified) uORB msg with a bool field is throwing an exception #12

Closed AuterionWrikeBot closed 5 years ago

AuterionWrikeBot commented 6 years ago

With the micro-RTPS agent running, starting the micrortps_client daemon on SITL makes the agent immediately throw an exception:

terminate called after throwing an instance of 'eprosima::fastcdr::exception::BadParamException'
  what():  Unexpected byte value in Cdr::deserialize(bool), expected 0 or 1
Aborted (core dumped)

This means some uORB topic has a boolean field whose serialized value is neither 0 nor 1, which triggers this error.
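To illustrate the failure mode, here is a minimal, self-contained C++ sketch of a strict boolean decoder. It is not Fast-CDR's implementation: `decode_strict_bool` and `would_throw` are hypothetical helpers that only mirror the behaviour described by the error message, where any byte other than 0 or 1 is rejected.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper (not part of Fast-CDR): decodes one wire byte as a bool
// and rejects anything other than 0 or 1, mirroring the error message above.
bool decode_strict_bool(std::uint8_t byte)
{
    if (byte == 0) { return false; }
    if (byte == 1) { return true; }
    throw std::invalid_argument("unexpected byte value for bool, expected 0 or 1");
}

// Returns true if decoding the given byte would throw.
bool would_throw(std::uint8_t byte)
{
    try {
        decode_strict_bool(byte);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}
```

Under this assumption, a single stale or misaligned byte in the buffer is enough to abort the whole process if the exception is not caught.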

AuterionWrikeBot commented 6 years ago

➤ Nuno Marques commented:

So as not to block the open-source release, I am going to reduce the list of messages to send/receive in the yaml file and, for now, exclude the status messages that include boolean fields. Then I need to work out, one by one, which specific message is causing this.

AuterionWrikeBot commented 5 years ago

➤ Nuno Marques commented:

After some debugging, this seems to be fixed (or at least not reproducible). I am adding some msgs back to the send/rcv list of uorb_rtps_message_ids.yaml.

AuterionWrikeBot commented 5 years ago

➤ Nuno Marques commented:

New PR: https://github.com/PX4/Firmware/pull/11120. While debugging, I was able to isolate some of the messages that result in the above error. I now have to understand its source.

AuterionWrikeBot commented 5 years ago

➤ Nuno Marques commented:

STATUS: Running the micrortps_agent with gdb lets me find which specific fields are failing to deserialize. Example of a stack trace:

#6  0x00007ffff752b255 in eprosima::fastcdr::Cdr::deserialize(bool&) () from /opt/ros/crystal/lib/libfastcdr.so.1
#7  0x000055555555ec0f in eprosima::fastcdr::Cdr::operator>> (this=0x7fffffffde10, bool_t=@0x7fffffffddf0: false) at /opt/ros/crystal/include/fastcdr/Cdr.h:545
545         inline Cdr& operator>>(bool &bool_t){return deserialize(bool_t);}
#8  0x000055555555e8e0 in px4_roscom::msg::dds::VehicleStatus_::deserialize (this=0x7fffffffdde0, dcdr=...)
    at /home/nuno/PX4/px4_ros_com_ros2/src/px4_ros_com/src/micrortps_agent/VehicleStatus_.cpp:447
447         dcdr >> m_isvtol;
#9  0x000055555555d6f2 in RtpsTopics::publish (this=0x55555576b340 , topic_ID=102 'f', data_buffer=0x7fffffffdef0 "`\366%\265", len=1024)
    at /home/nuno/PX4/px4_ros_com_ros2/src/px4_ros_com/src/micrortps_agent/RtpsTopics.cpp:57
57          st.deserialize(cdr_des);
#10 0x000055555555af0b in main (argc=3, argv=0x7fffffffe3f8) at /home/nuno/PX4/px4_ros_com_ros2/src/px4_ros_com/src/micrortps_agent/microRTPS_agent.cpp:221
221         topics.publish(topic_ID, data_buffer, sizeof(data_buffer));
(gdb)
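The trace shows the exception propagating from the generated deserialize() through RtpsTopics::publish up to main, which is why the agent aborts. While debugging, one option is to catch the exception at the call site and log the topic ID, so the failing topics can be listed without gdb. The sketch below is self-contained and uses hypothetical names (`deserialize_message`, `publish_guarded`); it does not reproduce the real agent code or the Fast-CDR API, and it assumes for illustration that the boolean sits at a known byte offset.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <iostream>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a generated deserialize(): treats the byte at
// bool_offset as a boolean field and rejects values other than 0 or 1.
bool deserialize_message(const std::vector<std::uint8_t>& buffer, std::size_t bool_offset)
{
    const std::uint8_t byte = buffer.at(bool_offset);  // throws std::out_of_range if the buffer is too short
    if (byte > 1) {
        throw std::invalid_argument("unexpected byte value for bool, expected 0 or 1");
    }
    return byte == 1;
}

// Hypothetical guard: reports which topic failed instead of letting the
// exception reach main and terminate the process.
bool publish_guarded(std::uint8_t topic_id, const std::vector<std::uint8_t>& buffer, std::size_t bool_offset)
{
    try {
        deserialize_message(buffer, bool_offset);
        return true;
    } catch (const std::exception& e) {
        std::cerr << "topic " << static_cast<int>(topic_id) << ": " << e.what() << '\n';
        return false;
    }
}
```

A guard like this only helps locate the offending topics; it does not address why the byte is invalid in the first place.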

AuterionWrikeBot commented 5 years ago

➤ Julian Kent commented:

Nuno Marques, using a RelWithDebInfo build will give you exact line numbers too.

ArkadiuszNiemiec commented 5 years ago

I can confirm that the problem occurs with the vehicle_status message.

TSC21 commented 5 years ago

> I can confirm that the problem occurs with the vehicle_status message.

Not only with that one, though that one is affected as well. Still under investigation.

AuterionWrikeBot commented 5 years ago

➤ Nuno Marques commented:

Follow-up on this issue continues in https://github.com/eProsima/Fast-CDR/issues/41#issuecomment-492836725

AuterionWrikeBot commented 5 years ago

➤ Nuno Marques commented:

A simple example of the vehicle_gps_position msg being received and parsed in a ROS2 node:

RECEIVED DATA ON VEHICLE GPS POSITION

ts: 208692000
lat: 0
lon: 0
alt: 473977505
alt_ellipsoid: 85456065
s_variance_m_s: 6.95467e-40
c_variance_rad: 0
fix_type:
eph: 0
epv: 1
hdop: 1
vdop: 0
noise_per_ms: 0
vel_m_s: 0
vel_n_m_s: 0.02
vel_e_m_s: 0.02
vel_d_m_s: -0.01
cog_rad: 0.73
timestamp_time_relative: 1085801896
time_utc_usec: 9205357640488583168
satellites_used:
heading: 0
heading_offset: 0

Both time_utc_usec and satellites_used seem to be improperly initialized. This may be a consequence or a cause of the problem with the boolean field vel_ned_valid, or a side effect of some other issue.
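One way a boolean byte can end up outside {0, 1} is when a field is never written and keeps whatever bytes were already in the buffer. The sketch below illustrates that hypothesis only; it is not the confirmed root cause, and the three-field layout, the offsets, and the helper names are invented for the example rather than taken from the real vehicle_gps_position message.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical wire layout for illustration only (not the real uORB message):
// bytes 0..7 time_utc_usec, byte 8 satellites_used, byte 9 vel_ned_valid.
constexpr std::size_t kMessageSize = 10;
constexpr std::size_t kVelNedValidOffset = 9;
using Message = std::array<std::uint8_t, kMessageSize>;

// Writes only the timestamp, as a writer that forgets the other fields would.
void write_timestamp_only(Message& msg, std::uint64_t time_utc_usec)
{
    std::memcpy(msg.data(), &time_utc_usec, sizeof(time_utc_usec));
}

// A byte is a valid serialized bool only if it is 0 or 1.
bool is_valid_bool_byte(std::uint8_t byte)
{
    return byte <= 1;
}

// Simulates a reused buffer: fill with a stale pattern, then write partially.
Message build_without_clearing()
{
    Message msg;
    msg.fill(0xCD);  // stands in for leftover data from earlier use
    write_timestamp_only(msg, 1557000000000000ULL);
    return msg;
}

// Same writer, but the buffer is zeroed first so unwritten fields decode cleanly.
Message build_with_clearing()
{
    Message msg{};  // value-initialization zeroes every byte
    write_timestamp_only(msg, 1557000000000000ULL);
    return msg;
}
```

In this toy layout, the uncleared buffer leaves 0xCD in the boolean slot, which a strict decoder would reject, while the zeroed buffer decodes as false.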

AuterionWrikeBot commented 5 years ago

➤ Nuno Marques commented:

Solution here: https://github.com/PX4/Firmware/pull/12025