Why are parameters like ROS_IP global?

StephanHasler commented 2 years ago

https://github.com/Unity-Technologies/ROS-TCP-Endpoint/blob/715a402ec09ce8095faabed915e3371638b4246a/src/ros_tcp_endpoint/server.py#L48 ROS_IP and ROS_TCP_PORT are global ROS parameters. So they are expected to be set directly as /ROS_IP. I think it would be cleaner if these parameters were privat/local. In this way the global ROS workspace would not be polluted. Other nodes might like to have a different ROS_IP and then they would be in conflict.

Maybe also the ROS in the names ROS_IP and ROS_TCP_PORT is misleading. Because it is the IP and PORT where somebody else from outside can connect to over TCP and not with ROS, or?

at669 commented 2 years ago

Hi, thanks for the post! I’ve created an internal ticket to investigate the type resolution for the networking parameters.

With regards to the parameter names, the ROS prefix is intended to differentiate the machine running the ROS endpoint (which has the ROS_IP and _PORT) from the machine running Unity, to clarify that the IP and port values being set on the "Unity side" are to be the values of the machine on the "ROS side."

[Ticket#: AIRO-1617]

github-actions[bot] commented 2 years ago

This issue has been marked stale because it has been open for 14 days with no activity. Please remove the stale label or comment on this issue, or the issue will be automatically closed in the next 14 days.

StephanHasler commented 2 years ago

Is there some progress on the internal ticket?

at669 commented 2 years ago

Hi, sorry for the delay--we'll comment in the thread when there's an update. If you have a workaround or have made these changes locally, please feel free to open a pull request for changing the name resolution.

mrpropellers commented 2 years ago

Hi @StephanHasler, to better understand what fix is needed here, could you elaborate on what issue you are seeing that could be rectified by ensuring ROS_IP is locally namespaced? Are you trying to run multiple instances of ros_tcp_endpoint on a single machine over multiple network adapters, or seeing collisions with locally networked endpoint instances? I want to ensure that the fix we implement solves the problem you are seeing.

StephanHasler commented 2 years ago

Hi @mrpropellers, we are running ROS distributed over several machines/rooms in our institute with a single ROS master. Currently we have only one instance of ros_tcp_endpoint running on one of that machines, which connects over a different network adapter than ROS is using. So currently it is not a problem. But I can foresee that we would have a ros_tc_endpoint also on other hosts/rooms of that ROS network. Then each ros_tc_endpoint would have to set a host-specific ROS_IP. We could also switch to a multi-master ROS setup then, but it makes things more complex.

Our system is also rather big with many different ROS packages. Therefore, in general, it is a good strategy to keep the global ROS parameter namespace clean, as a conflict is more probable.

mrpropellers commented 2 years ago

Ok, so it seems like all that's needed here is to eliminate the preceding "/" character? I can rename the variable to something that more accurately reflects its intended use, as well.

StephanHasler commented 2 years ago

I would make it a private parameter like ~ROS_IP.

Given two ros_tc_endpoint nodes in a namespace /foo (or the global one /) /foo/endpoint_1 /foo/endpoint_2 the resolved parameter names would be in the private case (~ROS_IP): /foo/endpoint_1/ROS_IP /foo/endpoint_2/ROS_IP and in the relative case (ROS_IP) /foo/ROS_IP /foo/ROS_IP At least this is what I conclude from: http://wiki.ros.org/Names So there would be again a conflict for the relative case, unless you put the nodes into different namespaces.

I think you would use a relative parameter when it must be shared among several nodes in a namespace. If the endpoint is the only node that needs to know about the particular ROS_IP then a private parameter is a good choice.

StephanHasler commented 2 years ago

Regarding the naming, I would avoid using ROS in the names of the parameter. Also it seems much more common to have lowercase parameter names, because it is more like a variable. I would name the parameters like the variables in the code:

self.tcp_ip = rospy.get_param("~tcp_ip")
self.tcp_port = rospy.get_param("~tcp_port", 10000)

Because then the overall naming is very consistent:

Also following: http://wiki.ros.org/ROS/EnvironmentVariables ROS_IP is somehow a reserved environment variable name.

StephanHasler commented 2 years ago

A problem might also be how you initialize the node in the default_server_endpoint.py. It is quite uncommon to set a node's name by a parameter (here /TCP_NODE_NAME). And I guess no one ever used that parameter, as it is also not used in the endpoint.launch. Also the launch file assigns the name 'server_endpoint' to the node while the default /TCP_NODE_NAME is TCPServer. This is quite confusing and also might conflict with the ROS namespacing.

A more common way would be:

def main():
    # Start the TCP server.
    rospy.init_node("tcp_server", anonymous=True)
    tcp_server = TcpServer(rospy.get_name())
    tcp_server.start()
    rospy.spin()

So 'tcp_server' would be the default name of the node which can be overridden by the launch file. (or tcp_bridge might also be a better name than server_endpoint)

Furthermore TcpServer does not make use of the node's name, except for storing it. I would remove node_name from the TcpServer code.

So this would give:

def main():
    # Start the TCP server.
    rospy.init_node("tcp_server", anonymous=True)
    tcp_server = TcpServer()
    tcp_server.start()
    rospy.spin()

StephanHasler commented 2 years ago

The endpoint.launch would look like:

<launch>
    <arg name="tcp_ip" default="127.0.0.1"/>
    <arg name="tcp_port" default="10000"/>

    <node name="tcp_server" pkg="ros_tcp_endpoint" type="default_server_endpoint.py" args="--wait" output="screen" respawn="true">
        <param name="tcp_ip" type="string" value="$(arg tcp_ip)"/>
        <param name="tcp_port" type="int" value="$(arg tcp_port)"/>
    </node>
</launch>

What is the effect of the --wait? I know that it is a command line parameter for roslaunch. I think when starting a node inside a launch file, args="--wait" only makes sense, if the node itself would parse sys.args, which I could not find.

I use the respawn="true", when I have buggy code that randomly crashes. For good code, it is usually not needed, or even hindering as you might not notice a crash.

The config/params.yaml is not needed anymore.

StephanHasler commented 2 years ago

While checking in how far node_name is used inside TcpServer, I stumbled over test_server.py. There are many lines like:

mock_import_module.assert_called_once

Pycharm tells me that these lines would have no effect. And I guess it is right. It should be a function call:

mock_import_module.assert_called_once()

There is one line where adding () breaks the test:

mock_sys_modules.assert_called_once

But I think this is a further bug in the test code, as resolve_message_name() does not call sys.modules() but does a: sys.modules[] So it should be:

mock_sys_modules.__getitem__.assert_called_once()

Also test_thread_pauser.py has the same problem of missing ().

I know I should better make a new issue for this :)

StephanHasler commented 2 years ago

I can also send you the diff with all my changes or make a pull request. For this I need to know on which commit I should base my diff. Currently I'm using origin/main. But I guess this is correct. I tested that the unittests succeed and that the endpoint.launch correctly sets tcp_ip and tcp_port.

mrpropellers commented 2 years ago

Yes, if you have a fix already that supports your use case feel free to submit a pull request! There is always more work to be done on our code, but it needs to be prioritized against all our other feature and support work, so naming issues and other low-impact fixes tend to stay un-resolved until we have proof they're causing problems. You'll need to make the PR against dev-ros or dev-ros2 depending on which version of ROS you are using. I can port the relevant changes to the other branch once the initial merge is resolved.

Unity-Technologies / ROS-TCP-Endpoint

Why are parameters like ROS_IP global? #115