dora-rs / dora

DORA (Dataflow-Oriented Robotic Application) is middleware designed to streamline and simplify the creation of AI-based robotic applications. It offers low-latency, composable, and distributed dataflow capabilities. Applications are modeled as directed graphs, also referred to as pipelines.
https://dora-rs.ai
Apache License 2.0

Error running example "cxx-ros2-dataflow" on branch ros2-service-server #443

Closed RuPingCen closed 2 months ago

RuPingCen commented 3 months ago

Describe the bug

dora version : V0.3.2

Running the example "cxx-ros2-dataflow" on branch ros2-service-server fails with an error.

To Reproduce

Steps to reproduce the behavior:

git clone --recursive https://github.com/dora-rs/dora.git

cd dora

git branch -a

git checkout ros2-service-server

cargo run --example cxx-ros2-dataflow --features ros2-examples

Expected behavior

warning: methods `matches__visualization_msgs__GetInteractiveMarkers` and `downcast__visualization_msgs__GetInteractiveMarkers` are never used
     --> /home/crp/dora/target/debug/build/dora-node-api-cxx-c86ef04f70223599/out/ros2_bindings.rs:51808:8
      |
51806 | impl Server__visualization_msgs__GetInteractiveMarkers {
      | ------------------------------------------------------ methods in this implementation
51807 |     #[allow(non_snake_case)]
51808 |     fn matches__visualization_msgs__GetInteractiveMarkers(
      |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...
51820 |     fn downcast__visualization_msgs__GetInteractiveMarkers(
      |        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

warning: `dora-node-api-cxx` (lib) generated 100 warnings
    Finished dev [unoptimized + debuginfo] target(s) in 0.20s
    Finished dev [unoptimized + debuginfo] target(s) in 0.19s
     Running `/home/crp/dora/target/debug/dora daemon --run-dataflow dataflow.yml`
  2024-03-28T11:05:30.561014Z ERROR dora_daemon: 
    018e84bc-33d0-7641-bbec-d8087f139bf0/cxx-node-rust-api failed with signal `SIGABRT`

    Check logs using: dora logs 018e84bc-33d0-7641-bbec-d8087f139bf0 cxx-node-rust-api

    at binaries/daemon/src/lib.rs:1097

  2024-03-28T11:05:30.561418Z  WARN dora_daemon::node_communication: failed to send NextFinishedDropTokens reply: NextDropEvents([])

Caused by:
   0: failed to send reply to node `cxx-node-rust-api`
   1: failed to send DaemonReply
   2: Broken pipe (os error 32)

Location:
    binaries/daemon/src/node_communication/tcp.rs:92:14
    at binaries/daemon/src/node_communication/mod.rs:268

failed to run dora-daemon: some nodes failed:
  - 018e84bc-33d0-7641-bbec-d8087f139bf0/cxx-node-rust-api: 
    018e84bc-33d0-7641-bbec-d8087f139bf0/cxx-node-rust-api failed with signal `SIGABRT`

    Check logs using: dora logs 018e84bc-33d0-7641-bbec-d8087f139bf0 cxx-node-rust-api

Error: failed to run dataflow

Location:
    examples/c++-ros2-dataflow/run.rs:162:9
(base) crp@HPZ4Workstation:~/dora$ dora logs 018e84bc-33d0-7641-bbec-d8087f139bf0 cxx-node-rust-api
unexpected reply to daemon logs: Error("No dataflow found with UUID `018e84bc-33d0-7641-bbec-d8087f139bf0`")

phil-opp commented 3 months ago

Thanks for reporting! Note that PR #442 is still marked as draft because it is not finished yet.

My guess is that the SIGABRT is caused by some assertion failure in the C++ code, but it's difficult to say without the log output. Could you check whether there's an out subfolder after running? Dora should write all the node logs to it since https://github.com/dora-rs/dora/pull/429. Maybe you can find the full output of the C++ node there?
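One way to dump whatever ended up in that folder is sketched below. The helper name show_node_logs is hypothetical, and the sketch assumes only that the logs are regular files somewhere under out; it does not rely on any particular file naming or nesting inside that folder.

```shell
# Hypothetical helper: print every file found under a log directory,
# each preceded by its path. No assumption is made about how dora
# names or nests the files inside the directory.
show_node_logs() {
    log_dir="${1:-out}"
    if [ ! -d "$log_dir" ]; then
        echo "no log directory at $log_dir" >&2
        return 1
    fi
    find "$log_dir" -type f | sort | while IFS= read -r log_file; do
        printf '==> %s <==\n' "$log_file"
        cat "$log_file"
    done
}

# Usage, from whichever directory contains the out subfolder:
#   show_node_logs out
```

The output can then be pasted into the issue as-is.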

RuPingCen commented 3 months ago

here is the out log of 018e84bc-33d0-7641-bbec-d8087f139bf0

HELLO FROM C++
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
terminate called after throwing an instance of 'rust::cxxbridge1::Error'
what(): service not available

phil-opp commented 3 months ago

Did you start the ros2 run demo_nodes_cpp add_two_ints_server command as described in https://github.com/dora-rs/dora/blob/main/examples/c%2B%2B-ros2-dataflow/README.md#running-service-example?

bobd988 commented 3 months ago

I can reproduce the above error with ros2 run demo_nodes_cpp add_two_ints_server running in another terminal. I think this is a new issue that appeared around 3/26, as I ran this example successfully one week ago.

bobd988 commented 3 months ago

I have tried on two more clean Ubuntu 20.04 installations and can reproduce the issue.

The error is reported in the terminal running cargo run --example cxx-ros2-dataflow --features ros2-examples

The command in the ROS2 terminal is

ros2 run demo_nodes_cpp add_two_ints_server

and it reported the following:

[INFO] [1711949751.317727832] [add_two_ints_server]: Incoming request
a: 0 b: 0

haixuanTao commented 3 months ago

Just want to say that you can create a Docker container to emulate a fresh Ubuntu installation using:

docker run --network=host -e DISPLAY=${DISPLAY} -v $(pwd):/current_dora_folder -it osrf/ros:humble-desktop

phil-opp commented 3 months ago

We just had a meeting together to debug this. The cause for this error was that ROS2 galactic uses Cyclone DDS by default, but our ROS2 bridge currently hardcodes a different service mapping (see #449). To work around this issue, set RMW_IMPLEMENTATION=rmw_fastrtps_cpp in both terminals (i.e. in the terminal running the service server and in the terminal running the cxx-ros2-dataflow example).
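A minimal sketch of that workaround follows. The wrapper name with_fastrtps is hypothetical; the two commands in the usage comments are the ones already quoted in this thread. The wrapper sets the variable only for the command it launches, so nothing lingers in the shell afterwards.

```shell
# Hypothetical wrapper: run any command with the Fast DDS RMW selected.
# The variable is set only for the launched command, not for the
# calling shell.
with_fastrtps() {
    env RMW_IMPLEMENTATION=rmw_fastrtps_cpp "$@"
}

# Terminal 1 (service server):
#   with_fastrtps ros2 run demo_nodes_cpp add_two_ints_server
#
# Terminal 2 (dora example, from the dora checkout):
#   with_fastrtps cargo run --example cxx-ros2-dataflow --features ros2-examples
```

Running export RMW_IMPLEMENTATION=rmw_fastrtps_cpp once in each terminal has the same effect for the rest of that shell session.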

RuPingCen commented 3 months ago

The problem still exists on a new computer running ROS2 galactic on Ubuntu 20.04. I have set RMW_IMPLEMENTATION by running export RMW_IMPLEMENTATION=rmw_fastrtps_cpp.

Screenshot from 2024-04-08 11-01-49

And the log output:

HELLO FROM C++
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
timeout while waiting for service, retrying
terminate called after throwing an instance of 'rust::cxxbridge1::Error'
what(): service not available

phil-opp commented 3 months ago

You need to launch the example add_two_ints service server of ROS2 in a separate terminal, also with RMW_IMPLEMENTATION=rmw_fastrtps_cpp.
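Before starting the dataflow, it may help to confirm that the service is actually visible from the second terminal. The sketch below assumes the demo server advertises its service as /add_two_ints (the name is not stated in this thread) and uses a hypothetical helper, has_service, that scans the output of ros2 service list.

```shell
# Hypothetical helper: succeed if the exact service name given as $1
# appears as a whole line on standard input.
has_service() {
    grep -Fxq -- "$1"
}

# Usage, in the terminal that will run the dora example, with
# RMW_IMPLEMENTATION=rmw_fastrtps_cpp already exported.
# The service name /add_two_ints is an assumption:
#   if ros2 service list | has_service /add_two_ints; then
#       echo "service is visible"
#   else
#       echo "service not found; is the server running with the same RMW?"
#   fi
```

If the service is not listed, the C++ node will most likely hit the same "service not available" timeout shown in the logs above.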