Open dhood opened 4 weeks ago
update: on Jazzy docker image the nodes don't give the error messages, and messages are exchanged. That's promising.
However, I can't manage to externally modify the chatter qos, with e.g.
<data_writer profile_name="/chatter">
<qos>
<durability>
<kind>TRANSIENT_LOCAL</kind>
</durability>
<reliability>
<kind>BEST_EFFORT</kind>
</reliability>
<publishMode>
<kind>SYNCHRONOUS</kind>
</publishMode>
</qos>
</data_writer>
I still get
root@c585c5bd183e:/# ros2 topic info /chatter --verbose
Type: std_msgs/msg/String
Publisher count: 1
Node name: talker
Node namespace: /
Topic type: std_msgs/msg/String
Topic type hash: RIHS01_df668c740482bbd48fb39d76a70dfd4bd59db1288021743503259e948f6b1a18
Endpoint type: PUBLISHER
GID: 01.0f.eb.7d.40.02.36.77.00.00.00.00.00.00.14.03
QoS profile:
Reliability: RELIABLE
History (Depth): UNKNOWN
Durability: VOLATILE
Lifespan: Infinite
Deadline: Infinite
Liveliness: AUTOMATIC
Liveliness lease duration: Infinite
Subscription count: 0
Are there any unit tests I can copy off?
You can refer to https://github.com/ros2/rmw_fastrtps/?tab=readme-ov-file#ros-2-middleware-implementation-for-eprosimas-fast-dds.
I tried it under the Jazzy version and got the same result as you. I'm not sure if there's an error in my configuration file. I will investigate further.
After checking the code, user-defined QoS by the code will override the QoS settings read from the configuration file.
This behavior seems a bit strange.
qos_policies
is from codes. writer_qos
is from configuration file.
get_datawriter_qos()
-> fill_data_entity_qos_from_profile()
-> fill_entity_qos_from_profile()
template<typename DDSEntityQos>
bool fill_entity_qos_from_profile(
const rmw_qos_profile_t & qos_policies,
DDSEntityQos & entity_qos)
{
...
switch (qos_policies.durability) {
case RMW_QOS_POLICY_DURABILITY_TRANSIENT_LOCAL:
entity_qos.durability().kind = eprosima::fastdds::dds::TRANSIENT_LOCAL_DURABILITY_QOS;
break;
case RMW_QOS_POLICY_DURABILITY_VOLATILE:
entity_qos.durability().kind = eprosima::fastdds::dds::VOLATILE_DURABILITY_QOS;
break;
case RMW_QOS_POLICY_DURABILITY_SYSTEM_DEFAULT:
break;
default:
RMW_SET_ERROR_MSG("Unknown QoS durability policy");
return false;
}
switch (qos_policies.reliability) {
case RMW_QOS_POLICY_RELIABILITY_BEST_EFFORT:
entity_qos.reliability().kind = eprosima::fastdds::dds::BEST_EFFORT_RELIABILITY_QOS;
break;
case RMW_QOS_POLICY_RELIABILITY_RELIABLE:
entity_qos.reliability().kind = eprosima::fastdds::dds::RELIABLE_RELIABILITY_QOS;
break;
case RMW_QOS_POLICY_RELIABILITY_SYSTEM_DEFAULT:
break;
default:
RMW_SET_ERROR_MSG("Unknown QoS reliability policy");
return false;
}
...
}
After checking the code, user-defined QoS by the code will override the QoS settings read from the configuration file.
This is the intended behaviour and it is documented here. From that section:
ROS 2 QoS contained in rmw_qos_profile_t are always honored, unless set to _SYSTEM_DEFAULT. In that case, XML values, or Fast DDS default values in the absence of XML ones, are applied. Setting any QoS in rmw_qos_profile_t to something other than _SYSTEM_DEFAULT entails that specifying it via XML files has no effect, since they do not override what was used to create the publisher, subscription, service, or client.
because FastDDS is the only vendor that I see with Best Effort as System Default.
Which Qos profile are you using? Mind that Fast DDS and rmw_fastrtps are two different things (rmw_fastrtps being an adapter to use Fast DDS in ROS 2). In Fast DDS, as per DDS specification, the default reliability for writers in RELIABLE, whereas the default for readers in BEST EFFORT; I believe this is the case for all DDS vendors, as that is what the DDS spec requieres.
When talking about rmw_fastrtps, the Qos are received from upper layers. In ROS 2, different profiles are defined which you can see here. Furthermore, each Qos has a SYSTEM_DEFAULT possible value which tells rmw_fastrtps to not change anything an let Fast DDS set the policies. It is with this configuration that the XML values for the ROS 2 defined Qos are applied.
I haven't had the chance to run an experiment now, and it's been a while since I last did it. I'd appreciate if you could set all the Qos to SYTEM_DEFAULT and see whether the XML values are taken into consideration.
@EduPonz
This is the intended behaviour and it is documented here.
Thanks. Yes, The document describes this situation.
The talker from Demo_node_cpp used rmw_qos_profile_default. Reliability is set to RMW_QOS_POLICY_RELIABILITY_RELIABLE. (Default value is RMW_QOS_POLICY_RELIABILITY_SYSTEM_DEFAULT) Durability is set to RMW_QOS_POLICY_DURABILITY_VOLATILE. (Default value is RMW_QOS_POLICY_DURABILITY_SYSTEM_DEFAULT) They don't use default value. So, corresponding QoS settings from XML configuration file are ignored.
static const rmw_qos_profile_t rmw_qos_profile_default =
{
RMW_QOS_POLICY_HISTORY_KEEP_LAST,
10,
RMW_QOS_POLICY_RELIABILITY_RELIABLE,
RMW_QOS_POLICY_DURABILITY_VOLATILE,
RMW_QOS_DEADLINE_DEFAULT,
RMW_QOS_LIFESPAN_DEFAULT,
RMW_QOS_POLICY_LIVELINESS_SYSTEM_DEFAULT,
RMW_QOS_LIVELINESS_LEASE_DURATION_DEFAULT,
false
};
@Barry-Xu-2018 There is a rmw_qos_profile_system_default (note the additional _system
part in the name)
@Barry-Xu-2018 There is a rmw_qos_profile_system_default (note the additional
_system
part in the name)
Thanks.
Talker and listener in demo_nodes_cpp doesn't use rmw_qos_profile_system_default. So we cannot directly use them to verify QoS setting from XML configuration. @dhood
After checking the code, user-defined QoS by the code will override the QoS settings read from the configuration file.
This is the intended behaviour and it is documented here. From that section:
besides, this is consistent behavior with rmw_connextdds as well.
if the application allows QoS override options to be configurable, we can override the QoS as following?
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run quality_of_service_demo_cpp qos_overrides_talker
[INFO] [1718407800.397783163] [qos_overrides_talker]: Publishing an image, sent at [1718407800.397773]
[INFO] [1718407803.377167068] [qos_overrides_talker]: Publishing an image, sent at [1718407803.377161]
...
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 topic info -v /qos_overrides_chatter
Type: sensor_msgs/msg/Image
Publisher count: 1
Node name: qos_overrides_talker
Node namespace: /
Topic type: sensor_msgs/msg/Image
Topic type hash: RIHS01_d31d41a9a4c4bc8eae9be757b0beed306564f7526c88ea6a4588fb9582527d47
Endpoint type: PUBLISHER
GID: 01.0f.f5.91.44.9d.33.ba.00.00.00.00.00.00.14.03
QoS profile:
Reliability: RELIABLE
History (Depth): UNKNOWN
Durability: VOLATILE
Lifespan: Infinite
Deadline: Infinite
Liveliness: AUTOMATIC
Liveliness lease duration: Infinite
Subscription count: 0
this can be overridden,
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 run quality_of_service_demo_cpp qos_overrides_talker --ros-args --params-file ./example_qos_overrides.yaml
[INFO] [1718407788.200066890] [qos_overrides_talker]: Publishing an image, sent at [1718407788.200057]
[INFO] [1718407791.209793863] [qos_overrides_talker]: Publishing an image, sent at [1718407791.209787]
...
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 topic info -v /qos_overrides_chatter
Type: sensor_msgs/msg/Image
Publisher count: 1
Node name: qos_overrides_talker
Node namespace: /
Topic type: sensor_msgs/msg/Image
Topic type hash: RIHS01_d31d41a9a4c4bc8eae9be757b0beed306564f7526c88ea6a4588fb9582527d47
Endpoint type: PUBLISHER
GID: 01.0f.f5.91.2f.9d.e5.18.00.00.00.00.00.00.14.03
QoS profile:
Reliability: BEST_EFFORT
History (Depth): UNKNOWN
Durability: TRANSIENT_LOCAL
Lifespan: Infinite
Deadline: Infinite
Liveliness: AUTOMATIC
Liveliness lease duration: Infinite
Subscription count: 0
qos_overrides_talker: # node name
ros__parameters: # All parameters are nested in this key
qos_overrides: # Parameter group for all qos overrides
/qos_overrides_chatter: # Parameter group for the topic
publisher: # Profile for publishers in the topic
# publisher_my_id: # Profile for publishers in the topic with id="my_id"
reliability: best_effort
durability: transient_local
it really depends on the application and use case.
see example https://github.com/ros2/demos/tree/rolling/quality_of_service_demo#qos-overrides
@dhood what do you think?
btw, i think we should add some tutorial and explanation about https://github.com/ros2/demos/tree/rolling/quality_of_service_demo#qos-overrides to https://github.com/ros2/ros2_documentation
btw, i think we should add some tutorial and explanation about https://github.com/ros2/demos/tree/rolling/quality_of_service_demo#qos-overrides to https://github.com/ros2/ros2_documentation
https://github.com/ros2/ros2_documentation/issues/4549 is follow-up for documentation.
it's great that there's a demo for testing the QoS pass through, that's what should probably be referenced in the fastrtps tutorial :slightly_smiling_face: Additional documentation will def be useful, but my concerns and feedback are more with fastrtps docs/behaviour than ros2 docs. To clarify, I already understand that it is not supposed to work unless the System Default is explicitly set, although I bet the pointers will be useful for someone who stumbles upon this thread someday!
I'll summarise the issues as I see them, because I'm not sure that they've been addressed by the thread so far:
1 combined with 3 means that I cannot change the behaviour of system-default QoS topics within moveit back to using reliable qos, which is needed for my system (and any other system using moveit + fastrtps + humble that has the same errors)
The good news for 1 is that somewhere between humble and Jazzy the behaviour has been fixed, I think
@fujitatomoya I might have missed a point you were making about QoS overrides from parameters, it's a feature I haven't seen before so I mistook it as xml overrides. Can the demo you linked be used to manipulate topics using system default qos, eg this topic inside moveit? I think it can only be used to configure nodes I've created myself with those qos_overriding_options right? It's not a builtin behaviour of general ros2 nodes that can be used to configure library behaviour?
@dhood
Can the demo you linked be used to manipulate topics using system default qos, eg this topic inside moveit? I think it can only be used to configure nodes I've created myself with those qos_overriding_options right?
your understanding is true. QoS cannot be overridden via these parameters unless qos_overriding_options are configured so. (unfortunately this is something we CANNOT do at this moment, i think.)
so after all, i believe there is no way to reconfigure the QoS for the node with non-system-default QoS is configured.
where i can see, we got a few things to consider,
qos_overriding_options
should be opt-out
but opt-in
? i guess the expectation was that application need to set specific QoS and if it allows to user to reconfigure, then explicitly opt-in
the qos_overriding_options
. but QoS can be dependent on the use case of the nodes, e.g camera node's QoS reliability should be set and configured with the use cases, such as transport live streaming (probably best effort) or storing streaming data (probably reliable).the qos passthrough tutorial from fastrtps docs does not exercise the feature it claims to (not on humble or jazzy), because it references an executable which does not use the XML qos
@dhood where exactly are you referring to? just checking i am on the same page with you.
deprecate system default QoS
, and use ROS 2 default QoS
instead? so that switching RMW does not change the QoS behavior at all, that can be more expected behavior for user application. besides, RMW can check if user application specifies the QoS comparing against ROS 2 default QoS
.
RMW dependent behavior to override the ROS 2 QoS with documentation. it is up to RMW implementation how implementation overrides the incoming QoS from ROS 2, but there needs to be a good documentation in rmw repository how user can control the QoS configuration with specific RMW implementation.
i might be missing something like history and basic concept with the current design, so i would like to hear from other developers.
thanks,
I'm compelled to answer here
- bug: on humble (which is still LTS), the use of RMW_FASTRTPS_USE_QOS_FROM_XML renders the talker-listener demo inoperable (I am using the demo as a proxy for moveit which I also found does not work on my system with xml qos). This prevents configuring the system default on humble.
Again, this is not a bug. As explained here, when setting RMW_FASTRTPS_USE_QOS_FROM_XML
, what you're doing is allowing the XML to override history memory policy
, publication mode
, and datasharing
. That means that in the absence of XML, those Qos take the Fast DDS default value, which may be different from the ROS 2 set value when RMW_FASTRTPS_USE_QOS_FROM_XML
is not set.
In particular, in Humble, history memory policy
is set to PREALLOCATED_MEMORY_MODE
when RMW_FASTRTPS_USE_QOS_FROM_XML
is set (unless changed in the XML), and to PREALLOCATED_WITH_REALLOC_MEMORY_MODE
when it is not. If you export RMW_FASTRTPS_USE_QOS_FROM_XML
and don't change this mode in the XML, the node creation fails because the depth of the history for the ros_discovery_info
topic is 1000 and in PREALLOCATED_MEMORY_MODE
the default preallocated is 500. This is why all our examples set the history memory policy; you could use something like this:
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
<!-- Default publisher profile -->
<data_writer profile_name="default publisher profile" is_default_profile="true">
<historyMemoryPolicy>DYNAMIC</historyMemoryPolicy>
</data_writer>
<!-- Default subscriber profile -->
<data_reader profile_name="default subscriber profile" is_default_profile="true">
<historyMemoryPolicy>DYNAMIC</historyMemoryPolicy>
</data_reader>
</profiles>
The most important thing here is that in order to change any Qos other than history memory policy
, publication mode
, or datasharing
, you DO NOT NEED to export RMW_FASTRTPS_USE_QOS_FROM_XML
, you only need FASTRTPS_DEFAULT_PROFILES_FILE
.
Although I understand that this is all somehow convoluted, I thing a careful read of the documentation should be enough.
The reason why it works on Iron and Jazzy is that the default history memory policy
changed to PREALLOCATED_WITH_REALLOC_MEMORY_MODE
within Fast DDS. We'll open a PR to reflect that on this repository's README.
2. bug: the qos passthrough tutorial from fastrtps docs does not exercise the feature it claims to (not on humble or jazzy), because it references an executable which does not use the XML qos.
I'm not sure what you mean with it references an executable which does not use the XML qos
. None of the ROS 2 executables use the XML directly, you need to reference to it in an env var for Fast DDS to pick it up (see for example the docs for the Fast DDS version corresponding to Humble).
I think what you mean is that with that example you cannot change the ROS 2 fixed Qos, is that correct? In that case, I'd say that the ROS 2 fixed Qos is simple a subset of all the available Qos configurable through XML. In that sense the example is very much valid for configuring almost every Fast DDS Qos expect for 7 of them (the ones exposed in ROS 2).
3. surprise: the system default of best effort with fastdds is behaviourally different to the system default of reliable from cyclone dds (it might be as you said, that fastrtps is implementing the proper one! but as a user the CHANGE of qos settings when changing rmw implementation despite not touching any code was a big big surprise to me which AFAIK is not documented anywhere). The outcome might be docs, perhaps even docs of the libraries leveraging system-default, but tbh my first thought was more along the lines that the system default ought to be reliable or at least consistent between rmw implementations
I can partially agree with what is said here:
from a quick glimpse it sounds like I've missed the distinction between FASTRTPS_DEFAULT_PROFILES_FILE and RMW_FASTRTPS_USE_QOS_FROM_XML - this response looks very helpful, thanks @EduPonz Let me read in detail and get back to you :+1:
Bug report
Hi team-
I am unable to override the default QoS which is important to replace best effort publishers with reliable ones, anywhere that ros libraries are using system default QoS, e.g. MoveIt, ros_control....
Perhaps the tutorial is out of date or I am using the wrong one, so I will focus the bug report on the demo nodes, but changing System Default QoS is my actual goal.
FYI, the ability to do this is critical to fixing regressions in our system after changing from cyclone to fastdds, because FastDDS is the only vendor that I see with Best Effort as System Default. (Took me a while to realise this was the culprit).
Thanks in advance!
Required Info:
Steps to reproduce issue
Following the tutorial from https://fast-dds.docs.eprosima.com/en/v2.6.7/fastdds/ros2/ros2_configure.html#example inside the latest ros:humble docker image with fastrtps-2.6.7 gives error outputs, and no messages are received. Setting RMW_FASTRTPS_USE_QOS_FROM_XML to 0, they communicate.