ros2 / rmw_fastrtps

Implementation of the ROS Middleware (rmw) Interface using eProsima's Fast RTPS.
Apache License 2.0

Missing nodes on host Windows 10 Humble #628

Open genevanmeter opened 2 years ago

genevanmeter commented 2 years ago

Bug report

Required Info:

Steps to reproduce issue

Start multiple talker nodes. The batch script below starts 10 nodes, each with a unique node name and topic.

@echo off
title ROS2 Command Prompt
call \dev\ros2_humble\local_setup.bat
setlocal enabledelayedexpansion
for /l %%i in (1,1,10) do (
    set n=0%%i
    set n=!n:~-2!
    start ros2 run demo_nodes_cpp talker --ros-args -r __node:=talker!n! -r chatter:=chatter!n!
)
endlocal
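
For anyone reproducing this without a batch file, the same loop can be sketched in Python. This is only an illustration, not part of the original report: the function names `talker_commands` and `launch_talkers` are made up here, and the sketch assumes the ROS 2 environment is already set up in the calling shell so that `ros2` is on the PATH. Unlike `start`, it does not open a new console window per node.

```python
import subprocess


def talker_commands(count=10):
    """Build one `ros2 run` argument list per talker, numbered 01..count."""
    commands = []
    for i in range(1, count + 1):
        suffix = f"{i:02d}"  # zero-padded, like the !n:~-2! step in the batch file
        commands.append([
            "ros2", "run", "demo_nodes_cpp", "talker",
            "--ros-args",
            "-r", f"__node:=talker{suffix}",
            "-r", f"chatter:=chatter{suffix}",
        ])
    return commands


def launch_talkers(count=10):
    """Start every talker as a background process and return the handles."""
    return [subprocess.Popen(cmd) for cmd in talker_commands(count)]
```

Calling `launch_talkers()` starts the ten processes; keeping the returned handles makes it possible to terminate them later.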

Expected behavior

ros2 node list
/talker01
/talker02
/talker03
/talker04
/talker05
/talker06
/talker07
/talker08
/talker09
/talker10

Actual behavior

Either no nodes or only a random few are returned

ros2 node list
/talker05
/talker07
/talker10
ros2 topic list
/chatter05
/chatter07
/chatter10
/parameter_events
/rosout

ros2 topic echo /chatter05 -- Does nothing
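
To make the gap between expected and actual output explicit, a small comparison helper could look like the following. It is a sketch under stated assumptions: `missing_nodes` is a hypothetical name, and the output of `ros2 node list` is assumed to be one node name per line, as in the listings above.

```python
def missing_nodes(expected, node_list_output):
    """Return the expected node names that are absent from `ros2 node list` output."""
    seen = {line.strip() for line in node_list_output.splitlines() if line.strip()}
    return [name for name in expected if name not in seen]


# Expected names from the reproduction script, and the output observed on the host.
expected = [f"/talker{i:02d}" for i in range(1, 11)]
observed = "/talker05\n/talker07\n/talker10\n"
print(missing_nodes(expected, observed))
```

With the output shown above, seven of the ten talkers are reported as missing.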

Additional information

The host machine exhibits this issue, but from a WSL instance on the same computer, as well as from a separate Linux machine, I get the full list and everything appears to work correctly. The problem seems limited to the Windows 10 host running the nodes. See also https://answers.ros.org/question/402008/ros2-windows-node-always-hang/

EduPonz commented 2 years ago

Hi @genevanmeter ,

Does this happen after reboots on the Windows host?

genevanmeter commented 2 years ago

This does happen after most reboots. Occasionally it works as expected, but only for a little while: for example, I'll run a new node and it doesn't show up in the node list on the host machine, yet it is visible from WSL or Linux.

genevanmeter commented 2 years ago

I have found that terminating the daemon's Python process from Process Explorer, or running ros2 daemon stop, temporarily resolves the issue.

genevanmeter commented 2 years ago

The workaround of stopping and restarting the daemon no longer works for me on Windows. Every command hangs, even after reboots.

ros2 topic list -- hangs
ros2 node list -- hangs
ros2 run demo_nodes_cpp talker -- hangs
ros2 doctor -r -- hangs
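
When commands hang like this, wrapping them in a timeout keeps a diagnostic script from blocking forever. The following is a minimal sketch: `run_with_timeout` is a hypothetical helper name and the 10-second default is an arbitrary assumption, not a value recommended anywhere in this thread.

```python
import subprocess


def run_with_timeout(command, timeout_s=10.0):
    """Run a command; return its stdout, or None if it did not finish in time."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None  # the command is treated as hung
    return result.stdout
```

For example, `run_with_timeout(["ros2", "node", "list"])` returning `None` would indicate that the command did not complete within the limit.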

I've reinstalled the binaries and also built from source, with the same result. At this point it seems the RMW is corrupted, as in the ROS Q&A link in my original post.

Are there any files written somewhere, e.g. in APPDATA, that might be corrupted?

MiguelCompany commented 2 years ago

@genevanmeter You could take a look at C:\ProgramData\eprosima\fastrtps_interprocess. Removing all the files there after a reboot should help.
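
The cleanup suggested above can also be scripted. The sketch below is an illustration only: `clear_directory_files` and `DEFAULT_DIR` are hypothetical names, the path is simply the one quoted in this comment, and files that cannot be deleted (for instance because a running process still holds them) are skipped rather than treated as errors.

```python
from pathlib import Path

# Path quoted in the comment above; adjust it if your installation differs.
DEFAULT_DIR = Path(r"C:\ProgramData\eprosima\fastrtps_interprocess")


def clear_directory_files(directory=DEFAULT_DIR):
    """Delete regular files directly inside `directory`; return the names removed."""
    directory = Path(directory)
    removed = []
    if not directory.is_dir():
        return removed  # nothing to clean
    for entry in directory.iterdir():
        if entry.is_file():
            try:
                entry.unlink()
                removed.append(entry.name)
            except OSError:
                # Possibly still open in a running process; leave it in place.
                pass
    return removed
```

Running it right after a reboot, before any ROS 2 process starts, matches the advice given here.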

genevanmeter commented 2 years ago

@MiguelCompany Thank you. Instantly resolved all of my issues.

C:\ProgramData\eprosima\fastrtps_interprocess was nearly 400 MB but quite small when zipped. I copied the files before deleting them. Are there any that may be of use for diagnosing?

MiguelCompany commented 2 years ago

@genevanmeter

> Are there any that may be of use for diagnosing?

No. You can throw them away.

ivanpauno commented 2 years ago

@MiguelCompany can this issue happen on Linux as well?

I have run into the problem of ros2 node list not showing all nodes many times.

MiguelCompany commented 2 years ago

@ivanpauno I think it may. If it does, try running rm /dev/shm/fastrtps*
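
For a more cautious version of that command, the matching files can be listed before anything is deleted. This is a sketch, not an official tool: `remove_matching` and its `dry_run` flag are hypothetical, and the default pattern is just the one from the comment above.

```python
import glob
import os


def remove_matching(pattern="/dev/shm/fastrtps*", dry_run=True):
    """Return the paths matching `pattern`; delete them unless `dry_run` is set."""
    matched = sorted(glob.glob(pattern))
    if dry_run:
        return matched  # only report what would be removed
    removed = []
    for path in matched:
        try:
            os.remove(path)
            removed.append(path)
        except OSError:
            # Not removable (permissions, or not a plain file); skip it.
            pass
    return removed
```

Calling `remove_matching()` first shows what would be removed; `remove_matching(dry_run=False)` then performs the deletion, ideally while no ROS 2 process is running.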