fkie / multimaster_fkie

ROS stack with FKIE packages for multi-robot (discovering, synchronizing and management GUI)
BSD 3-Clause "New" or "Revised" License

[noetic] Multiple executables are found #161

Closed tkazik closed 1 year ago

tkazik commented 3 years ago

While starting master_discovery on rpi, node_manager throws the following error:

[image: node_manager5]

Somehow, it seems to complain about executables being both in devel and src. Thx for your insights!

atiderko commented 3 years ago

If you run rosrun fkie_master_discovery master_discovery in a terminal, you should get the same warning. Node manager uses the ROS methods to find the executables. I suspect something is wrong with the environment variables.

Other nodes are started via the daemon, where you can select the executable. master_discovery is started via SSH, because only then will the host be found and the daemon started. The SSH warnings/errors are not parsed and are all printed as an error. In this case it could be a notice, because the first executable should be taken to start master_discovery.

tkazik commented 3 years ago

Thx for your quick reply! So I just tried the following:

[INFO] [1627031993.454860]: Check the ROS Master[Hz]: 1
[INFO] [1627031993.459171]: Heart beat [Hz]: 0.02
[INFO] [1627031993.463920]: Active request after [sec]: 60
[INFO] [1627031993.467882]: Remove after [sec]: 300
[INFO] [1627031993.472256]: Robot hosts: []
[INFO] [1627031993.476035]: Approx. mininum avg. network load: 1.36 bytes/s
[INFO] [1627031993.497372]: Start RPC-XML Server at ('0.0.0.0', 11611)
[INFO] [1627031993.504416]: hide_nodes: []
[INFO] [1627031993.511610]: hide_topics: []
[INFO] [1627031993.518986]: hide_services: []
[INFO] [1627031993.522863]: Subscribe to parameter `/roslaunch/uris`
[INFO] [1627031993.671592]: Detected master discovery: http://192.168.1.35:11611
[INFO] [1627031993.782622]: Added master with ROS_MASTER_URI=http://P1G3:11311/

This is the output of printenv | grep ROS on the rpi:

ROS_VERSION=1
ROS_PYTHON_VERSION=3
ROS_PACKAGE_PATH=/home/robot/catkin_ws/src/multimaster_fkie/fkie_multimaster:/home/robot/catkin_ws/src/multimaster_fkie/fkie_multimaster_msgs:/home/robot/catkin_ws/src/multimaster_fkie/fkie_master_discovery:/home/robot/catkin_ws/src/multimaster_fkie/fkie_master_sync:/home/robot/catkin_ws/src/multimaster_fkie/fkie_node_manager_daemon:/home/robot/catkin_ws/src/multimaster_fkie/fkie_node_manager:/home/robot/catkin_ws/src/rpi_temp:/opt/ros/noetic/share
ROSLISP_PACKAGE_DIRECTORIES=/home/robot/catkin_ws/devel/share/common-lisp
ROS_ETC_DIR=/opt/ros/noetic/etc/ros
ROS_MASTER_URI=http://P1G3:11311
ROS_ROOT=/opt/ros/noetic/share/ros
ROS_DISTRO=noetic

Would you expect to see something else in the environmental variables?

atiderko commented 3 years ago

The environment variables look good to me.

What happens if you execute: ssh p1g3 'rosrun fkie_node_manager remote_nm.py --package fkie_master_discovery --node_type master_discovery --node_name /master_discovery'

tkazik commented 3 years ago

Looks like I get the same results (executable in devel and src):

tkazik@P1G3:~ $ ssh p1g3 'rosrun fkie_node_manager remote_nm.py --package fkie_master_discovery --node_type master_discovery --node_name /master_discovery'
tkazik@p1g3's password: 
Multiple executables are found! The first one was started! Exceutables:
['/home/tkazik/workspaces/tools/devel/lib/fkie_master_discovery/master_discovery', '/home/tkazik/workspaces/tools/src/multimaster_fkie/fkie_master_discovery/nodes/master_discovery']
Launch ROS Master in screen  ... /usr/bin/screen -c /home/tkazik/.config/ros.fkie//screen.cfg -O -L -Logfile /home/tkazik/.ros/log/_roscore--11311.log -s -/bin/bash -dmS _roscore--11311 roscore --port 11311
run on remote host: /usr/bin/screen -c /home/tkazik/.config/ros.fkie//screen.cfg -O -L -Logfile /home/tkazik/.ros/log/_master__discovery.log -s -/bin/bash -dmS _master__discovery   /home/tkazik/workspaces/tools/devel/lib/fkie_master_discovery/master_discovery

The thing that surprises me a bit: This is 'just' a warning and if I just click 'ignore' in node_manager, the first executable should be started and everything should be fine, right? Unfortunately this is not the case: After the exception is thrown, the top left of node_manager says: "ROS Network [disabled]". Can you reproduce this behavior?

On a side note, you might want to consider this: Python 3 is stricter than Python 2 about bytes vs. strings, so replacing output = stdout.read() with output = stdout.read().decode(), e.g. here (and in other places), might make sense.
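A minimal sketch of the bytes/str point, not taken from node_manager: a local child process stands in for the remote SSH command (an assumption for illustration), and the variable names are hypothetical. Under Python 3 the pipe yields bytes, and mixing them with a str literal fails until they are decoded.

```python
import subprocess
import sys

# Stand-in for the remote command: a child process that prints one line.
proc = subprocess.Popen(
    [sys.executable, "-c", "print('Multiple executables are found!')"],
    stdout=subprocess.PIPE,
)
raw = proc.stdout.read()      # bytes under Python 3
proc.stdout.close()
proc.wait()

text = raw.decode("utf-8")    # str, safe to compare with str literals

try:
    "Multiple" in raw         # mixing str and bytes raises TypeError
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(type(raw).__name__, type(text).__name__, mixed_ok)
```

Decoding once, right where the data is read, keeps the rest of the code free of such checks.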

atiderko commented 3 years ago

Now check the .bashrc again. Do you source ROS both before and after the return statement?

Just out of interest, execute: ssh p1g3 'env | grep ROS'

"ROS Network" in node manager will be enabled only if a master_discovery node on localhost is detected. If you start a master_discovery on remote host the dock will stay disabled.

As for the "decode / encode", you are right. I would have to fix that ... when I have time ;-)

tkazik commented 3 years ago

I just have a single source for ros-instance in the .bashrc, i.e.:

source /opt/ros/noetic/setup.bash
source ~/workspaces/fkie_ws/devel/setup.bash
export ROS_MASTER_URI=http://P1G3:11311

Placing these lines before or after the return did not make a difference and I get the same error regarding two executables.

Sure, I created a new workspace containing only the multimaster_fkie package. Here are the ROS environment variables (Note: For the ros-cmd to work through ssh, the source for ros needs to be before the return statement in .bashrc, explanations here and here):

tkazik@P1G3:~/workspaces/fkie_ws $ ssh p1g3 'env | grep ROS'
ROS_VERSION=1
ROS_PYTHON_VERSION=3
ROS_PACKAGE_PATH=/home/tkazik/workspaces/fkie_ws/src/multimaster_fkie/fkie_multimaster:/home/tkazik/workspaces/fkie_ws/src/multimaster_fkie/fkie_multimaster_msgs:/home/tkazik/workspaces/fkie_ws/src/multimaster_fkie/fkie_master_discovery:/home/tkazik/workspaces/fkie_ws/src/multimaster_fkie/fkie_master_sync:/home/tkazik/workspaces/fkie_ws/src/multimaster_fkie/fkie_node_manager_daemon:/home/tkazik/workspaces/fkie_ws/src/multimaster_fkie/fkie_node_manager:/opt/ros/noetic/share
ROSLISP_PACKAGE_DIRECTORIES=/home/tkazik/workspaces/fkie_ws/devel/share/common-lisp
ROS_ETC_DIR=/opt/ros/noetic/etc/ros
ROS_MASTER_URI=http://P1G3:11311
ROS_ROOT=/opt/ros/noetic/share/ros
ROS_DISTRO=noetic

Does anything of that look fishy to you? Thx for your support!

tkazik commented 3 years ago

Here is a short snippet that shows how the ROS network is shut down after starting master_discovery on the RPI.

https://user-images.githubusercontent.com/15689124/126965366-041293d5-5e56-44e8-af53-d6921a4c3669.mp4

If I follow this guide, everything works from command line...which is why I believe that something within node_manager might not work as intended.

For the sake of completeness:

# Commands on `P1G3`:
roscore
rosrun fkie_master_discovery master_discovery _mcast_group:=224.0.0.1
rosrun fkie_master_sync master_sync
rostopic pub -r 1 /test std_msgs/Int32 "data: $(date '+%s')"

# Commands on `rpi`
rosrun fkie_master_discovery master_discovery _mcast_group:=224.0.0.1 __name:=master_discovery_rpi
rosrun fkie_master_sync master_sync __name:=master_sync_rpi
rostopic echo /test

# info on nodes
rosnode list 
# returns
# - /master_discovery
# - /master_discovery_rpi
# - /master_sync
# - /master_sync_rpi
# - /rosout
# - /rostopic_212693_1627293800717

atiderko commented 3 years ago

Hi, sorry for the late reply, I'm on vacation.

  1. I compared the environment variables again. You wrote "This is the output of printenv | grep ROS on the rpi:"

  2. Do not rename the master_discovery and master_sync nodes! You should run only one instance of each per roscore. The problem in the video could be caused by ROS_MASTER_URI on the rpi.

tkazik commented 3 years ago

My bad, I guess I was following this guide a bit too blindly. Now:

[image: node_manager_multiple_exe]

This does not occur in ROS melodic and I suspect it might be related to some change that happened in noetic...

A hacky "workaround" would be to uncomment the lines here and here.

Anyway, enjoy your vacation!

atiderko commented 2 years ago

please reopen if the problem persists!

tkazik commented 2 years ago

@atiderko, thx a lot!

I am actually not so sure anymore whether this is the correct way or "just" a workaround... To me, it feels more like node_manager is looking for executables too "broadly", i.e. it should only look in the devel space and not in both the devel and src space. Would that make sense?
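To make the idea concrete, here is a rough sketch of such a filter. The function prefer_devel_space and its devel_dir argument are hypothetical and not part of node_manager or roslib; the two paths are the ones from the warning earlier in this thread.

```python
import os

def prefer_devel_space(candidates, devel_dir):
    """Return the candidates located under devel_dir.

    Falls back to all candidates when none lies in the devel space.
    Hypothetical helper, not an existing node_manager function.
    """
    devel_dir = os.path.abspath(devel_dir)
    inside = [
        path for path in candidates
        if os.path.commonpath([os.path.abspath(path), devel_dir]) == devel_dir
    ]
    return inside or list(candidates)

candidates = [
    "/home/tkazik/workspaces/tools/devel/lib/fkie_master_discovery/master_discovery",
    "/home/tkazik/workspaces/tools/src/multimaster_fkie/fkie_master_discovery/nodes/master_discovery",
]
selected = prefer_devel_space(candidates, "/home/tkazik/workspaces/tools/devel")
print(selected)
```

With a filter like this, only the devel-space executable would remain and no warning would be needed. Whether that is the right behavior for installed or overlaid workspaces is an open question.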

On a side note, thx also for fixing the string here in the last commit:

https://github.com/fkie/multimaster_fkie/blob/381eb97cf0aec87aca7e6f325c446415f6208a72/fkie_node_manager/src/fkie_node_manager/start_handler.py#L183

Generally, it would be nice if strings were handled with the sandwich model, e.g. see here or here for inspiration, so this line would become:

output = stdout.read()           # old/python2
output = stdout.read().decode()  # new/python3
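A small sketch of what the sandwich model could look like in practice: decode bytes as soon as they enter the program, work with str internally, and encode only when writing out. The helpers to_text and to_bytes are hypothetical names for illustration and do not exist in node_manager; UTF-8 is assumed as the encoding.

```python
def to_text(data, encoding="utf-8"):
    """Decode bytes at the input boundary; pass str through unchanged."""
    if isinstance(data, bytes):
        return data.decode(encoding, errors="replace")
    return data

def to_bytes(text, encoding="utf-8"):
    """Encode str at the output boundary; pass bytes through unchanged."""
    if isinstance(text, str):
        return text.encode(encoding)
    return text

raw = b"Multiple executables are found!\n"   # what a pipe or SSH channel delivers
message = to_text(raw).strip()               # inside the program: str only
payload = to_bytes(message.upper())          # back to bytes only when writing out

print(message)
print(payload)
```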

Anyway, thx a lot and have a nice evening!

tkazik commented 2 years ago

@atiderko , actually I think we should probably reopen this issue (I am not able to reopen it):

This line finds multiple executables on my dummy test sample, because roslib.packages.find_node(package, executable) finds two executables, one in the src space and one in the devel space, i.e.:

cmd: ['/home/user/workspaces/test_ws/devel/lib/asdf/myNode', '/home/user/workspaces/test_ws/src/asdf/nodes/myNode']

The reason for this is probably the setup.py file or catkin_python_setup() instructions. But I have to admit that I am not an expert and don't really know what the right approach is here.
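The effect can be imitated without ROS. The sketch below is not how roslib.packages.find_node is implemented; find_executables is a hypothetical stand-in that simply walks the given directories, and the temporary workspace layout mirrors the dummy sample above. Any search that covers both the devel and the src space returns two hits as soon as the same executable file exists in both.

```python
import os
import tempfile

def find_executables(search_dirs, name):
    """Collect every executable file called `name` below the given directories."""
    matches = []
    for root in search_dirs:
        for dirpath, _dirnames, filenames in os.walk(root):
            candidate = os.path.join(dirpath, name)
            if name in filenames and os.access(candidate, os.X_OK):
                matches.append(candidate)
    return matches

# Build a throwaway workspace with the same node in devel and src.
with tempfile.TemporaryDirectory() as ws:
    for rel in ("devel/lib/asdf", "src/asdf/nodes"):
        os.makedirs(os.path.join(ws, rel))
        node = os.path.join(ws, rel, "myNode")
        with open(node, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(node, 0o755)
    found = find_executables(
        [os.path.join(ws, "devel"), os.path.join(ws, "src")], "myNode"
    )
    relative = [os.path.relpath(path, ws) for path in found]

print(relative)
```

The order of the result follows the order of the search directories, which is why "the first one was started" picks the devel copy here.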