open-rmf / free_fleet

A free fleet management system.
Apache License 2.0

free_fleet_server_ros2 crashes #113

Closed Ngochuy2137 closed 2 years ago

Ngochuy2137 commented 2 years ago

Bug report

Required information:

Description of the bug

After launching a ROS 1 client, I launched a ROS 2 server with ros2 launch ff_examples_ros2 fake_server.launch.xml, but the server node only logged the following and then crashed:

[INFO] [1638518022.483330172] [fake_server_node]: registered a new robot: [fake_ros1_robot]

[ERROR] [free_fleet_server_ros2-1]: process has died [pid 171912, exit code -11, cmd '/home/ngochuy/ros2_workspaces/multiRobot_ros2_ws/install/free_fleet_server_ros2/lib/free_fleet_server_ros2/free_fleet_server_ros2 --ros-args -r __node:=fake_server_node --params-file /tmp/launch_params_5f60anz4 --params-file /tmp/launch_params_y095tg1m --params-file /tmp/launch_params_5pjacfl9 --params-file /tmp/launch_params_mqlxvvgn --params-file /tmp/launch_params_sgoeuvs2 --params-file /tmp/launch_params_hea3pzog --params-file /tmp/launch_params_ppfoebdy --params-file /tmp/launch_params_p6v8acqt --params-file /tmp/launch_params_x50b77_d --params-file /tmp/launch_params_2x9dczw3 --params-file /tmp/launch_params_wjw0spop --params-file /tmp/launch_params_vvf4bnhq --params-file /tmp/launch_params_1jxnx647 --params-file /tmp/launch_params_w2jjb2z6 --params-file /tmp/launch_params_ihcizken --params-file /tmp/launch_params_rdikzrhe'].

Additional information

I also ran the free_fleet_server_ros2 node directly, but the result was the same.
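
A note on the log above: launch reports a negative exit code when the process was terminated by a signal, so exit code -11 corresponds to signal 11, which is SIGSEGV (a segmentation fault) on Linux. The sketch below decodes such a code in a POSIX shell. The helper name signal_from_exit_code is made up for illustration and is not part of ROS 2 or free_fleet.

```shell
# Map a launch-style negative exit code (e.g. -11) to a signal name.
# signal_from_exit_code is a hypothetical helper, not a ROS 2 tool.
signal_from_exit_code() {
  code=$1
  case "$code" in
    -*)
      # Strip the leading minus and let the shell name the signal.
      kill -l "${code#-}"
      ;;
    *)
      echo "exit code $code does not indicate a signal" >&2
      return 1
      ;;
  esac
}

signal_from_exit_code -11   # prints SEGV
```

A segmentation fault only says that the process touched invalid memory; it does not say why, which is why a backtrace is requested later in the thread.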

aaronchongth commented 2 years ago

Hello @Ngochuy2137! Thanks for raising this up. I just built and tested these launch files and haven't been able to replicate the problem at the moment. Could you provide the full log of launching fake_server.launch.xml?

Just as a sanity check before we investigate further, could you try a fresh build?

rm -rf build install log
colcon build --packages-up-to ff_examples_ros2

Ngochuy2137 commented 2 years ago

Hi @aaronchongth, this is the full log when I launch fake_server without the fake client node:

ros2 launch ff_examples_ros2 fake_server.launch.xml
[INFO] [launch]: All log files can be found below /home/ngochuy/.ros/log/2021-12-03-17-44-15-007291-ngochuy-HP-ProOne-400-G6-24-All-in-One-PC-221272
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [free_fleet_server_ros2-1]: process started with pid [221275]
[free_fleet_server_ros2-1] Greetings from free_fleet_server_ros2
[free_fleet_server_ros2-1] ROS 2 SERVER CONFIGURATION
[free_fleet_server_ros2-1] fleet name: fake_fleet
[free_fleet_server_ros2-1] update state frequency: 20.0
[free_fleet_server_ros2-1] publish state frequency: 2.0
[free_fleet_server_ros2-1] TOPICS
[free_fleet_server_ros2-1] fleet state: fleet_states
[free_fleet_server_ros2-1] mode request: robot_mode_requests
[free_fleet_server_ros2-1] path request: robot_path_requests
[free_fleet_server_ros2-1] destination request: robot_destination_requests
[free_fleet_server_ros2-1] SERVER-CLIENT DDS CONFIGURATION
[free_fleet_server_ros2-1] dds domain: 42
[free_fleet_server_ros2-1] TOPICS
[free_fleet_server_ros2-1] robot state: robot_state
[free_fleet_server_ros2-1] mode request: mode_request
[free_fleet_server_ros2-1] path request: path_request
[free_fleet_server_ros2-1] destination request: destination_request
[free_fleet_server_ros2-1] COORDINATE TRANSFORMATION
[free_fleet_server_ros2-1] translation x (meters): -4.117
[free_fleet_server_ros2-1] translation y (meters): 27.260
[free_fleet_server_ros2-1] rotation (radians): -0.013
[free_fleet_server_ros2-1] scale: 0.928

After that, when I launched fake_client, the server additionally logged the following:

[free_fleet_server_ros2-1] [INFO] [1638528419.101723098] [fake_server_node]: registered a new robot: [fake_ros1_robot]
[ERROR] [free_fleet_server_ros2-1]: process has died [pid 221275, exit code -11, cmd '/home/ngochuy/ros2_workspaces/multiRobot_ros2_ws/install/free_fleet_server_ros2/lib/free_fleet_server_ros2/free_fleet_server_ros2 --ros-args -r __node:=fake_server_node --params-file /tmp/launch_params_6n1606 --params-file /tmp/launch_params_tewuysaw --params-file /tmp/launch_params_lstflnns --params-file /tmp/launch_params_bh2_l0ky --params-file /tmp/launch_params_cpf2222r --params-file /tmp/launch_params_qg4ufzyc --params-file /tmp/launch_params_wyfc92g2 --params-file /tmp/launch_params_edlpxc0z --params-file /tmp/launch_params_ilxvlpf3 --params-file /tmp/launch_params_9ossviiy --params-file /tmp/launch_params_psoxss1o --params-file /tmp/launch_params_r06vjel6 --params-file /tmp/launch_params_oq_vp6jd --params-file /tmp/launch_params_vvmolq74 --params-file /tmp/launch_params_xh_y5zgp --params-file /tmp/launch_params_7xg4is6p'].

When I launched the fake client first and then the fake server, all of these logs appeared at once.

aaronchongth commented 2 years ago

I am still unable to re-create the issue, unfortunately. My apologies. These were the steps I took for my workspace, could you try them out and see if it still happens?

mkdir -p ~/ff2/src
cd ~/ff2/src
git clone https://github.com/open-rmf/free_fleet -b main
git clone https://github.com/open-rmf/rmf_internal_msgs -b main

cd ~/ff2
source /opt/ros/galactic/setup.bash
colcon build --packages-up-to ff_examples_ros2

source ~/ff2/install/setup.bash
ros2 launch ff_examples_ros2 fake_client.launch.xml

In another terminal,

source ~/ff2/install/setup.bash
ros2 launch ff_examples_ros2 fake_server.launch.xml

Ngochuy2137 commented 2 years ago

@aaronchongth I still haven't fixed the issue; I will try this later. Thanks for your help!

Aridani3 commented 2 years ago

Hello! I get the same error on my side. @Ngochuy2137 Were you able to solve it?

veeraragav commented 2 years ago

Same issue here. It was working properly initially, and then I suddenly started getting this issue. Here is the backtrace: [image]

aaronchongth commented 2 years ago

Thanks for bringing this up, @veeraragav and @Aridani3! Is it possible to dump these logs into a GitHub gist instead of a screenshot?

As a sanity check, does it happen consistently after registering a new robot?

Could you also provide the .repos file for your workspace? Go to your server workspace and run:

cd server_ws
vcs export --exact src > server.repos

This will give me the best chance of reproducing the error. Thanks!

veeraragav commented 2 years ago

@aaronchongth Yes, it happened consistently after registering a new robot. Just as the issue appeared out of nowhere, it has now disappeared just as mysteriously. I hope it does not show up again. Attached: server.repos.txt

aaronchongth commented 2 years ago

Thanks for sharing it!

From your stack trace, it looks like it could be related to cyclonedds. I still have not been able to reproduce it as of now, but I will keep looking out for it.

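For now, if it ever starts happening again, could you try switching the DDS implementation (https://docs.ros.org/en/galactic/How-To-Guides/Working-with-multiple-RMW-implementations.html) and trying again? This can help us determine whether it was indeed related to cyclonedds.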
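
For readers following along, switching the ROS 2 middleware is normally done through the RMW_IMPLEMENTATION environment variable. The sketch below wraps that in a small helper that rejects typos. The function name use_rmw is hypothetical, and the chosen RMW package is assumed to be installed already. Note that this only changes the ROS 2 middleware; the log earlier in the thread shows a separate server-client DDS configuration, and nothing here establishes whether that link is affected.

```shell
# Select an RMW implementation for the current shell.
# use_rmw is a hypothetical helper; RMW_IMPLEMENTATION is the real variable.
use_rmw() {
  case "$1" in
    rmw_fastrtps_cpp|rmw_cyclonedds_cpp)
      export RMW_IMPLEMENTATION="$1"
      echo "RMW_IMPLEMENTATION=$RMW_IMPLEMENTATION"
      ;;
    *)
      echo "unsupported RMW implementation: $1" >&2
      return 1
      ;;
  esac
}

use_rmw rmw_fastrtps_cpp
# Then, in the same shell, with the workspace sourced:
#   ros2 launch ff_examples_ros2 fake_server.launch.xml
```

If the crash persists with both implementations, as reported later in the thread, the RMW layer is less likely to be the cause.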

veeraragav commented 2 years ago

Sure. Thanks for the quick assistance!

TijnLosekoot commented 2 years ago

Hello. We are experiencing the error mentioned in the OP on the ROS 2 Foxy distribution, with both the client and the server. We have also tried running it with both FastRTPS and CycloneDDS, to no avail. Are there any other solutions?

veeraragav commented 2 years ago

I solved this issue on my side. It was my mistake. I had two ROS 2 workspaces. The first one had all the packages from open-rmf, and the second one had free_fleet and rmf_internal_msgs. I later found out that rmf_internal_msgs was also installed in the first workspace, but in a different version, and that mismatch caused the conflict.
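
One way to spot this kind of overlap is to look for a package that is registered under more than one prefix on AMENT_PREFIX_PATH. The sketch below does that by checking the ament resource index marker in each prefix. The function name find_pkg_prefixes is made up for illustration, and rmf_fleet_msgs is used only as an example of a package from rmf_internal_msgs.

```shell
# Print every prefix on AMENT_PREFIX_PATH that registers the given package.
# find_pkg_prefixes is a hypothetical helper, not a ROS 2 command.
find_pkg_prefixes() {
  pkg=$1
  old_ifs=$IFS
  IFS=:
  # Unquoted on purpose so the path list is split on ':'.
  for prefix in ${AMENT_PREFIX_PATH:-}; do
    if [ -e "$prefix/share/ament_index/resource_index/packages/$pkg" ]; then
      printf '%s\n' "$prefix"
    fi
  done
  IFS=$old_ifs
}

# Run after sourcing the workspaces you normally use.
find_pkg_prefixes rmf_fleet_msgs
```

More than one line of output means the package is visible from several install locations at once, which matches the situation described above. It does not by itself prove that the versions differ.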

aaronchongth commented 2 years ago

Hi @TijnLosekoot! Thanks for raising this up, sorry to hear about what you're facing. As a sanity check, could you perhaps try out the solution @veeraragav mentioned and see if you manage to get somewhere?

I would also suggest migrating to galactic, since it has many improvements over foxy, though the issue might still persist.

If not, could you please provide some more information so I can attempt to recreate the issue?

spydatron commented 2 years ago

Hi @TijnLosekoot, I had the same problem, and it turned out to have the same cause that @veeraragav mentioned earlier. I solved it by removing the other workspaces that had anything to do with RMF, then followed these steps to get it working:

mkdir -p ~/ff_ws/src
cd ~/ff_ws/src
git clone https://github.com/open-rmf/free_fleet -b main

cd ~/ff_ws
source /opt/ros/galactic/setup.bash
colcon build --packages-up-to ff_examples_ros2

As you can see, I did not clone rmf_internal_msgs (git clone https://github.com/open-rmf/rmf_internal_msgs -b main) because, if I am correct, rmf_internal_msgs is included by default if you install ROS 2 Foxy or Galactic via the Debian binaries (e.g. ros-galactic-desktop). So if you clone rmf_internal_msgs into your workspace and build with colcon, you get warning messages.

Then in a new terminal, I executed:

source ~/ff_ws/install/setup.bash
ros2 launch ff_examples_ros2 fake_client.launch.xml

Then, in another new terminal, I executed:

source ~/ff_ws/install/setup.bash
ros2 launch ff_examples_ros2 fake_server.launch.xml

The result:

[image: free_fleet_example]
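
On the point above about rmf_internal_msgs possibly being present from binaries already: one way to check is to ask dpkg for the corresponding Debian package. The sketch below relies on the usual ROS naming convention (ros-<distro>-<package name with underscores turned into hyphens>). The helper name ros_deb_name is hypothetical, and rmf_fleet_msgs is only an example of a package from rmf_internal_msgs.

```shell
# Build the conventional Debian package name for a ROS package.
# ros_deb_name is a hypothetical helper, not a ROS 2 command.
ros_deb_name() {
  # $1: ROS distro (e.g. galactic), $2: ROS package name (e.g. rmf_fleet_msgs)
  printf 'ros-%s-%s\n' "$1" "$(printf '%s' "$2" | tr '_' '-')"
}

deb=$(ros_deb_name galactic rmf_fleet_msgs)
echo "$deb"

# Only query dpkg where it exists (Debian/Ubuntu).
if command -v dpkg-query >/dev/null 2>&1; then
  status=$(dpkg-query -W -f='${Status}' "$deb" 2>/dev/null || true)
  case "$status" in
    *"install ok installed"*) echo "$deb is installed from binaries" ;;
    *) echo "$deb is not installed from binaries" ;;
  esac
fi
```

If the binary package is installed and the same package is also built from source in a workspace, the two copies can shadow each other, which is consistent with the colcon warnings mentioned above.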

samiframadhan commented 2 years ago

@spydatron's solution works for me too.

aaronchongth commented 2 years ago

Good to know, thanks for confirming! I will close this issue now.