eclipse-zenoh / zenoh-plugin-dds

A zenoh plug-in that transparently routes DDS data. This plugin can be used by DDS applications to leverage zenoh for geographical routing or for better-scaling discovery. For ROS2 robotic applications, use https://github.com/eclipse-zenoh/zenoh-plugin-ros2dds

[Bug] Crash with Cyclone 0.10.5 ping pong examples #346

Closed gabrik closed 2 weeks ago

gabrik commented 1 month ago

Describe the bug

The plugin crashes on discovery of the ping/pong participants.

zenoh-1  | 2024-09-18T12:29:00.478812Z  INFO main ThreadId(01) zenohd: zenohd v9bc441f-modified built with rustc 1.81.0 .....
zenoh-1  | 2024-09-18T12:29:00.538520Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds::route_zenoh_dds: Route Zenoh->DDS (ping/RoundTrip -> RoundTrip): creation with topic_type=RoundTripModule::DataType querying_subscriber=false
zenoh-1  | 2024-09-18T12:29:00.538541Z DEBUG                 main ThreadId(01) zenoh::net::runtime::orchestrator: Joined multicast group 224.0.0.224 on interface 172.18.0.2
zenoh-1  | 2024-09-18T12:29:00.538551Z  INFO                 main ThreadId(01) zenoh::net::runtime::orchestrator: zenohd listening scout messages on 224.0.0.224:7446
zenoh-1  | 2024-09-18T12:29:00.538565Z DEBUG tokio-runtime-worker ThreadId(59) zenoh::net::routing::dispatcher::pubsub: Face{1, dd5} Declare subscriber 13 (ping/RoundTrip)
zenoh-1  | thread '<unnamed>' panicked at /usr/local/cargo/git/checkouts/zenoh-plugin-dds-89876188164765d1/5cc50d6/zenoh-plugin-dds/src/dds_mgt.rs:135:28:
zenoh-1  | misaligned pointer dereference: address must be a multiple of 0x8 but is 0x800000004
zenoh-1  | stack backtrace:
zenoh-1  | 2024-09-18T12:29:00.538608Z DEBUG tokio-runtime-worker ThreadId(59) zenoh::net::routing::dispatcher::resource: Register resource ping/RoundTrip
zenoh-1  | 2024-09-18T12:29:00.538692Z DEBUG                 main ThreadId(01) zenoh::net::runtime::orchestrator: UDP port bound to 172.18.0.2:56918
zenoh-1  | 2024-09-18T12:29:00.538781Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds::route_zenoh_dds: Route Zenoh->DDS (ping/RoundTrip -> RoundTrip): create DDS Writer
zenoh-1  | 2024-09-18T12:29:00.538820Z DEBUG                net-0 ThreadId(67) zenoh::net::runtime::orchestrator: Waiting for UDP datagram...
zenoh-1  | 2024-09-18T12:29:00.538975Z  INFO tokio-runtime-worker ThreadId(59) zenoh_plugin_dds: Route Zenoh->DDS (ping/RoundTrip -> RoundTrip): created with topic_type=RoundTripModule::DataType
zenoh-1  | 2024-09-18T12:29:00.539135Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds: Discovered DDS Writer 0110356d96aa96de8aca75ee00000203 on RoundTrip with type 'RoundTripModule::DataType' and QoS: Qos { user_data: None, topic_data: None, group_data: None, durability: None, durability_service: None, presentation: None, deadline: None, latency_budget: None, ownership: None, ownership_strength: None, liveliness: None, time_based_filter: None, partition: Some(["ping"]), reliability: Some(Reliability { kind: RELIABLE, max_blocking_time: 10000000000 }), transport_priority: None, lifespan: None, destination_order: None, history: None, resource_limits: None, writer_data_lifecycle: Some(WriterDataLifecycle { autodispose_unregistered_instances: false }), reader_data_lifecycle: None, writer_batching: None, type_consistency: None, entity_name: None, properties: None, ignore_local: None, data_representation: Some([0, 2]) }
zenoh-1  | 2024-09-18T12:29:00.539226Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds::route_dds_zenoh: Route DDS->Zenoh (RoundTrip -> ping/RoundTrip): creation with topic_type=RoundTripModule::DataType
zenoh-1  | 2024-09-18T12:29:00.539432Z DEBUG tokio-runtime-worker ThreadId(59) zenoh::net::routing::dispatcher::interests: Face{1, dd5} Declare interest 14 (ping/RoundTrip)
zenoh-1  | 2024-09-18T12:29:00.539501Z DEBUG tokio-runtime-worker ThreadId(59) zenoh::net::routing::dispatcher::interests: Face{1, dd5} Undeclare interest 14
zenoh-1  | 2024-09-18T12:29:00.539783Z  INFO tokio-runtime-worker ThreadId(59) zenoh_plugin_dds: Route DDS->Zenoh (RoundTrip -> ping/RoundTrip): created with topic_type=RoundTripModule::DataType
zenoh-1  | 2024-09-18T12:29:00.539853Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds: Discovered DDS Reader 0110356d96aa96de8aca75ee00000304 on RoundTrip with type 'RoundTripModule::DataType' and QoS: Qos { user_data: None, topic_data: None, group_data: None, durability: None, durability_service: None, presentation: None, deadline: None, latency_budget: None, ownership: None, ownership_strength: None, liveliness: None, time_based_filter: None, partition: Some(["pong"]), reliability: Some(Reliability { kind: RELIABLE, max_blocking_time: 10000000000 }), transport_priority: None, lifespan: None, destination_order: None, history: None, resource_limits: None, writer_data_lifecycle: None, reader_data_lifecycle: None, writer_batching: None, type_consistency: Some(TypeConsistency { kind: ALLOW_TYPE_COERCION, ignore_sequence_bounds: true, ignore_string_bounds: true, ignore_member_names: false, prevent_type_widening: false, force_type_validation: false }), entity_name: None, properties: None, ignore_local: None, data_representation: Some([0, 2]) }
zenoh-1  | 2024-09-18T12:29:00.539909Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds::route_zenoh_dds: Route Zenoh->DDS (pong/RoundTrip -> RoundTrip): creation with topic_type=RoundTripModule::DataType querying_subscriber=false
zenoh-1  | 2024-09-18T12:29:00.539951Z DEBUG tokio-runtime-worker ThreadId(59) zenoh::net::routing::dispatcher::pubsub: Face{1, dd5} Declare subscriber 15 (pong/RoundTrip)
zenoh-1  | 2024-09-18T12:29:00.540068Z DEBUG tokio-runtime-worker ThreadId(59) zenoh_plugin_dds::route_zenoh_dds: Route Zenoh->DDS (pong/RoundTrip -> RoundTrip): create DDS Writer
zenoh-1  | 2024-09-18T12:29:00.540162Z  INFO tokio-runtime-worker ThreadId(59) zenoh_plugin_dds: Route Zenoh->DDS (pong/RoundTrip -> RoundTrip): created with topic_type=RoundTripModule::DataType
zenoh-1  |    0:     0x558ceada50b5 - std::backtrace_rs::backtrace::libunwind::trace::h1b4b7c200f0b1134
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/../../backtrace/src/backtrace/libunwind.rs:116:5
zenoh-1  |    1:     0x558ceada50b5 - std::backtrace_rs::backtrace::trace_unsynchronized::h40ba1b1a720cbf98
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
zenoh-1  |    2:     0x558ceada50b5 - std::sys::backtrace::_print_fmt::h8992bf724ef0f4bd
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/sys/backtrace.rs:65:5
zenoh-1  |    3:     0x558ceada50b5 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h358afad87e02ca76
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/sys/backtrace.rs:40:26
zenoh-1  |    4:     0x558ceadd26bb - core::fmt::rt::Argument::fmt::h414c419d4b9c8d82
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/fmt/rt.rs:173:76
zenoh-1  |    5:     0x558ceadd26bb - core::fmt::write::hb19b5b269a2fe458
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/fmt/mod.rs:1182:21
zenoh-1  |    6:     0x558ceada0aff - std::io::Write::write_fmt::he5a92676a45ef09d
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/io/mod.rs:1827:15
zenoh-1  |    7:     0x558ceada63a1 - std::sys::backtrace::BacktraceLock::print::h6d30d1c1b5775240
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/sys/backtrace.rs:43:9
zenoh-1  |    8:     0x558ceada63a1 - std::panicking::default_hook::{{closure}}::h3bff550b24d93725
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:269:22
zenoh-1  |    9:     0x558ceada607c - std::panicking::default_hook::hd53b1b06d2b99687
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:296:9
zenoh-1  |   10:     0x558ceada6a01 - std::panicking::rust_panic_with_hook::h9fdd87cddb2763da
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:800:13
zenoh-1  |   11:     0x558ceada6867 - std::panicking::begin_panic_handler::{{closure}}::h089783ab6b5cba45
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:674:13
zenoh-1  |   12:     0x558ceada5579 - std::sys::backtrace::__rust_end_short_backtrace::hed34776d77ef7922
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/sys/backtrace.rs:168:18
zenoh-1  |   13:     0x558ceada64f4 - rust_begin_unwind
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/std/src/panicking.rs:665:5
zenoh-1  |   14:     0x558ce52b1eb5 - core::panicking::panic_nounwind_fmt::runtime::h532214566a18e21a
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:112:18
zenoh-1  |   15:     0x558ce52b1eb5 - core::panicking::panic_nounwind_fmt::hb4a9d2ef6221fb83
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:122:5
zenoh-1  |   16:     0x558ce52b2143 - core::panicking::panic_misaligned_pointer_dereference::h12fb9a4d9a9133a3
zenoh-1  |                                at /rustc/eeb90cda1969383f56a2637cbd3037bdf598841c/library/core/src/panicking.rs:289:5
zenoh-1  |   17:     0x558ce78dfb1e - zenoh_plugin_dds::dds_mgt::DDSRawSample::create::h416329d7ae440f17
zenoh-1  |                                at /usr/local/cargo/git/checkouts/zenoh-plugin-dds-89876188164765d1/5cc50d6/zenoh-plugin-dds/src/dds_mgt.rs:135:28
zenoh-1  |   18:     0x558ce78e7982 - zenoh_plugin_dds::dds_mgt::data_forwarder_listener::hf49137b96f82b986
zenoh-1  |                                at /usr/local/cargo/git/checkouts/zenoh-plugin-dds-89876188164765d1/5cc50d6/zenoh-plugin-dds/src/dds_mgt.rs:428:30
zenoh-1  |   19:     0x558ce76c4217 - da_or_dor_cb_invoke
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsc/src/dds_reader.c:240:5
zenoh-1  |   20:     0x558ce76c42c3 - dds_reader_data_available_cb
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsc/src/dds_reader.c:264:14
zenoh-1  |   21:     0x558ce76cbb28 - dds_rhc_default_store
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsc/src/dds_rhc_default.c:1764:7
zenoh-1  |   22:     0x558ce7787649 - ddsi_rhc_store
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/include/dds/ddsi/ddsi_rhc.h:64:10
zenoh-1  |   23:     0x558ce770bf3d - ddsi_deliver_locally_one
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_deliver_locally.c:149:13
zenoh-1  |   24:     0x558ce77582c8 - deliver_user_data
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:2246:12
zenoh-1  |   25:     0x558ce77583fd - deliver_user_data_synchronously
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:2276:7
zenoh-1  |   26:     0x558ce7754bc9 - handle_Heartbeat
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:1402:17
zenoh-1  |   27:     0x558ce775b095 - handle_submsg_sequence
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:3042:11
zenoh-1  |   28:     0x558ce775bdf3 - handle_rtps_message
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:3223:7
zenoh-1  |   29:     0x558ce775c084 - do_packet
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:3321:5
zenoh-1  |   30:     0x558ce775cb61 - ddsi_recv_thread
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_receive.c:3529:14
zenoh-1  |   31:     0x558ce775e81d - create_thread_wrapper
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/core/ddsi/src/ddsi_thread.c:254:24
zenoh-1  |   32:     0x558ce76eb185 - os_startRoutineWrapper
zenoh-1  |                                at /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/cyclors-0.2.1/cyclonedds/src/ddsrt/src/threads/posix/threads.c:190:17
zenoh-1  |   33:     0x7fc10c3fabc2 - start
zenoh-1  |                                at /home/buildozer/aports/main/musl/src/1.2.4/src/thread/pthread_create.c:207:2
zenoh-1  | thread caused non-unwinding panic. aborting.

The cause seems to be:

zenoh-1  | thread '<unnamed>' panicked at /usr/local/cargo/git/checkouts/zenoh-plugin-dds-89876188164765d1/5cc50d6/zenoh-plugin-dds/src/dds_mgt.rs:135:28:
zenoh-1  | misaligned pointer dereference: address must be a multiple of 0x8 but is 0x800000004
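For context, this message comes from the alignment check that Rust inserts before a raw pointer dereference when debug assertions are enabled. The sketch below only reproduces the arithmetic behind that check for the address in the panic message; `Aligned8` and `is_aligned_for` are hypothetical names for illustration, not types or functions from the plugin or from cyclors.

```rust
use std::mem::align_of;

/// Hypothetical stand-in for a type that requires 8-byte alignment
/// (an assumption for illustration, not a type from the plugin).
#[repr(C, align(8))]
struct Aligned8 {
    _value: u64,
}

/// Returns true if `addr` is a multiple of the alignment required by `T`.
fn is_aligned_for<T>(addr: u64) -> bool {
    addr % (align_of::<T>() as u64) == 0
}

fn main() {
    // Address taken from the panic message above.
    let reported: u64 = 0x8_0000_0004;
    println!("required alignment: {:#x}", align_of::<Aligned8>());
    println!(
        "address {:#x} aligned: {}",
        reported,
        is_aligned_for::<Aligned8>(reported)
    );
}
```

The reported address leaves a remainder of 4 when divided by 8, so it fails the check. A value like `0x800000004` also does not look like a typical heap or stack address, which suggests the field being read as a pointer may not actually hold one.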

To reproduce

1. Start the bridge.
2. Start the ping pong examples.

System info

All

gmartin82 commented 1 month ago

@gabrik was the binary built with support for the 'zenoh-plugin-ros2dds' included?

The ROS2 and DDS plugins use different versions of the cyclors library and it looks like when a binary is built with both of these plugins included there is a clash between the versions. In the stack trace it can be seen that cyclors 0.2.1 is being invoked whereas the DDS plugin should be using cyclors 0.3.1.

gabrik commented 1 month ago

Yes, that's the case. I wonder how this happens, as Rust builds both versions of cyclors and should be able to link both of them.

gmartin82 commented 1 month ago

In this case, Cyclone DDS is calling back to Rust when a sample is received. The issue is possibly caused by combining multiple versions of the Cyclone C library into a single binary.

gabrik commented 1 month ago

Yep, but:

$ cargo tree -i cyclors --features plugins
error: There are multiple `cyclors` packages in your project, and the specification `cyclors` is ambiguous.
Please re-run this command with one of the following specifications:
  cyclors@0.2.1
  cyclors@0.3.1

So Rust does know that there are two different versions, and knows which plugin belongs to which version:

$ cargo tree -i cyclors@0.3.1 --features plugins
cyclors v0.3.1
└── zenoh-plugin-dds v1.0.0-dev (https://github.com/eclipse-zenoh/zenoh-plugin-dds?branch=main#6a12ef01)
    └── <redacted> (/home/gabrik/Workspace/<redacted>)
$ cargo tree -i cyclors@0.2.1 --features plugins
cyclors v0.2.1
└── zenoh-plugin-ros2dds v1.0.0-dev (https://github.com/eclipse-zenoh/zenoh-plugin-ros2dds?branch=main#1e53a50b)
    └── <redacted> (/home/gabrik/Workspace/<redacted>)

The only way the DDS plugin can call the "wrong" cyclors is if the underlying C functions are "overwritten" at compile time by the old ones.

~~Can't cyclors "mangle" the names to include the version? This would prevent this sort of issue. Or maybe something needs to be set up in the Cargo configuration.~~

Never mind: the underlying CycloneDDS C functions will always have the same names, so this would not solve it.

gmartin82 commented 1 month ago

Yes, I think C functions are likely being "overwritten" during linking.

It might be possible to "mangle" the Cyclone symbols (possibly by adding a prefix) to make them unique for each version of Cyclors. This is something that might be worth investigating if we do need to use multiple versions of Cyclone at once.
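As a rough sketch of the prefixing idea, the helper below builds a version-specific prefix and emits `old new` pairs, one per line, which is the input format accepted by GNU `objcopy --redefine-syms`. Everything here is an assumption for illustration: `version_prefix` and `redefine_syms` are hypothetical names, the prefix scheme is made up, and cyclors does not necessarily do anything like this. The two symbols are used only as sample Cyclone DDS function names.

```rust
/// Builds a symbol prefix from a crate version,
/// replacing every non-alphanumeric character with '_'.
/// The "cyclors_<version>_" scheme is hypothetical.
fn version_prefix(version: &str) -> String {
    let sanitized: String = version
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '_' })
        .collect();
    format!("cyclors_{sanitized}_")
}

/// Produces one "old new" pair per line for the given symbols.
fn redefine_syms(version: &str, symbols: &[&str]) -> String {
    let prefix = version_prefix(version);
    symbols
        .iter()
        .map(|s| format!("{s} {prefix}{s}\n"))
        .collect()
}

fn main() {
    // Sample C symbols; a real list would come from the built library.
    let symbols = ["dds_create_participant", "dds_take"];
    print!("{}", redefine_syms("0.2.1", &symbols));
    print!("{}", redefine_syms("0.3.1", &symbols));
}
```

With distinct prefixes per version, the two copies of the C library would no longer export identically named functions, so the linker could not resolve a call from one plugin to the other version's implementation. The generated bindings for each version would have to refer to the renamed symbols as well.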

gabrik commented 2 weeks ago

Solved by #376