ros2 / rosidl_python

rosidl support for Python
Apache License 2.0
20 stars 45 forks source link

Hides the assertions that checks the data types of the message fields. #194

Closed Voldivh closed 1 year ago

Voldivh commented 1 year ago

Context

This PR is related to the issue described in this discussion. After doing some benchmarks on simple publishers and subscribers on rclpy using different types of messages (Ex: Strings, Ints, Arrays, Custom messages with nested structures) and analyzing their behavior using flamegraphs, we noticed that a lot of the busy time of the nodes was being used during the creation of the messages object. Taking that into account, a possible area of improvement for this issue is by hiding all the assertions made on the fields of the messages under a user controlled flag.

Description

In summary, this PR hides all the assertions made at the time of construction of messages in rclpy. In order to give the user control whether to do the assertions (and lose performance due to this) or to not do the assertions, all the message constructors have a new argument called check_fields which default value is False. This way all the pre-existing code would notice this improvement without the need of doing any modification.

clalancette commented 1 year ago

CI:

clalancette commented 1 year ago

So after running some of the failed tests locally, it looks like there is another problem here.

For instance, if you run the tests for rqt_py_common, it complains about something like the following:

1: E       AssertionError: 'int64 a\nint64 b\nboolean _check_fields\n---\nint64 sum\nbo[17 chars]ds\n' != 'int64 a\nint64 b\n---\nint64 sum\n'
1: E         int64 a
1: E         int64 b
1: E       - boolean _check_fields
1: E         ---
1: E         int64 sum
1: E       - boolean _check_fields

It looks like somehow _check_fields is being encoded into the string output for a message, and that is causing several tests (in rqt_py_common, and ros2action at least) to fail. @Voldivh can you take a closer look?

Voldivh commented 1 year ago

It looks like somehow _check_fields is being encoded into the string output for a message, and that is causing several tests (in rqt_py_common, and ros2action at least) to fail. @Voldivh can you take a closer look?

Hmm Interesting, I'll take a look and see what I can do to solve it.

Voldivh commented 1 year ago

After a little research and an internal discussion. We found out that the reason why some packages such as rqt_py_common and ros2action are failing CI is because some API's are using the __slots__ attribute to retrieve the message's fields and field types instead of using the method get_fields_and_field_types(). I created a PR that should solve the problem.

clalancette commented 1 year ago

So with the fixes elsewhere, I think we are getting closer here.

However, I think we have another problem in this repository. In particular, we want the __repr__ of each message to only represent the fields of the message, and it currently uses the __slots__:

https://github.com/ros2/rosidl_python/blob/e6d72487a05b64792e437406058bd0f5740640c5/rosidl_generator_py/resource/_msg.py.em#L362-L386

What I'll suggest here is to just switch that for s, t in zip(self.__slots__, self.SLOT_TYPES): to for s, t in self._fields_and_field_types.items(): at https://github.com/ros2/rosidl_python/blob/e6d72487a05b64792e437406058bd0f5740640c5/rosidl_generator_py/resource/_msg.py.em#L367 . You'll also have to update https://github.com/ros2/rosidl_python/blob/e6d72487a05b64792e437406058bd0f5740640c5/rosidl_generator_py/resource/_msg.py.em#L385 to not strip off the first character, which was _ when we were using slots.

Voldivh commented 1 year ago

I see your point. I'll get right down to it. However, I think that we might actually encounter similar problems in other packages, is it worth that I do a little search and modifications or should we just leave it as it is as long as CI passes? @clalancette

clalancette commented 1 year ago

I see your point. I'll get right down to it. However, I think that we might actually encounter similar problems in other packages, is it worth that I do a little search and modifications or should we just leave it as it is as long as CI passes? @clalancette

It is worthwhile doing a search through the existing ROS 2 core and see if there are other uses of __slots__, yes. That way we can see if this problem is anywhere else.

clalancette commented 1 year ago

So I did a bit more digging into how __slots__ is used throughout the codebase.

To do this, I fetched all of the code that is currently released into rolling with the following:

$ mkdir ros2_rolling
$ cd ros2_rolling
$ rosinstall_generator ALL --rosdistro rolling --deps | vcs import

(this takes a while). I then grepped through all of the code to find uses of __slots__:

$ grep -rI "__slots__" *

There are a total of 122 uses of __slots__ throughout the codebase. Many of them are just regular Python code, just using __slots__ for their own purposes.

However, I did find a number of uses of __slots__ from ROS messages. Most of the usage of these is going to break in subtle ways if we put this PR in without doing something about it. Here is the list I found (ignoring rosidl_runtime_py since https://github.com/ros2/rosidl_runtime_py/pull/23 should fix that):

I think we should attempt to fix these before we put this in, otherwise things will start failing and it will be hard to tell why. @Voldivh can you start taking a look at this, particularly rqt_py_common, rqt_bag, rqt_plot, and rqt_publisher?

Voldivh commented 1 year ago

I think we should attempt to fix these before we put this in, otherwise things will start failing and it will be hard to tell why. @Voldivh can you start taking a look at this, particularly rqt_py_common, rqt_bag, rqt_plot, and rqt_publisher?

Yeah, after looking a little bit to the packages you mentioned, I can see that they will require some fixing. I'll start to work on the ones you commented.

Voldivh commented 1 year ago

Just finished creating the PRs that solve the issues with the packages:

clalancette commented 1 year ago

OK, so we now have the 4 in-core uses of slots fixed.

I think it is worthwhile to open new issues or PRs for rosbridge_suite, rospy_message_converter, and rqt_robot_monitor, telling them that this use is changing and how to fix it. Once that is done, we can go ahead and run CI here.

clalancette commented 1 year ago

CI:

clalancette commented 1 year ago

CI:

clalancette commented 1 year ago

@Voldivh OK, this is looking good to me with CI. Can you please open up a PR to https://github.com/ros2/ros2_documentation/blob/rolling/source/Releases/Release-Iron-Irwini.rst , saying that we changed this behavior? Once that is ready, I'll merge that and this one together.