[Open] dk-teknologisk-lag opened this issue 5 months ago
CC: @Barry-Xu-2018
I have opened a discussion there as well, since I could reproduce it using their HelloWorld example with a few modifications, FYI: https://github.com/eProsima/Fast-DDS/discussions/4276
I can reproduce this issue. But on the host (not in a container), message_lost_callback is never called. If the QoS is set to Reliable, there is no problem. This issue is unrelated to the segment size of the shared memory.
The Fast-DDS shared memory example also uses Reliable QoS. I simply modified it (topic_qos) to BEST_EFFORT.
I could not reproduce this issue (same Fast-DDS version, 2.11.2). Maybe because the message size is small (only 1 MB).
I have updated the example linked to in the other discussion, but after changing it to BEST_EFFORT and a message size of 10 MB I get the same behavior:
I just tried changing to RELIABLE and here I also get dropped messages, even if I change back to just sending "Hello world", though a lot fewer:
I noticed earlier that when I was using the RELIABLE QoS it didn't print the lost messages, but inspecting the message IDs, I could see that there were gaps. In the above image, it jumps from 10195 to 10214, but that could be a limit of the console, which prints out of "order". Still, there are jumps in the message IDs.
How come it transfers 1024 * 1024 bytes and not 2 * 1024 * 1024, which is what the segment size is set to?
Also, what's the difference between the topic QoS and the DataWriter QoS? Do they both require the same settings?
I figured it was the buffer size of the data, rather than the size of the segment or the string. Got it working with a string of 10 MB.
So, if I increase the segment size to 10 * 1024 * 1024, so it can hold an entire message in shared memory, I can run at ~~1000 Hz~~ about 300-500 Hz, even though the sleep time is set to 1 ms, but there is seemingly no packet loss, sending 10 MB messages.
I guess it's the entire HelloWorld data struct that gets copied, so I should allocate for 11 MB + 4 bytes, since it has its data array of chars, consuming 1024 * 1024 bytes, plus its uint32_t m_index field?
Can I set this using an XML file? Force it not to use the builtin transport, but a specific shared memory transport with a larger segment size?
Do you want to test it in a ROS 2 environment?
I have not used XML to configure the transport on ROS 2 before. But I think you can refer to section 6.4.3 in https://fast-dds.docs.eprosima.com/en/latest/fastdds/transport/shared_memory/shared_memory.html and prepare the XML as described in https://github.com/ros2/rmw_fastrtps/blob/rolling/README.md.
BTW, there is an easier way. Modify the segment size at:

auto shm_transport =
    std::make_shared<eprosima::fastdds::rtps::SharedMemTransportDescriptor>();
shm_transport->segment_size(xxxxxx);  // <== change the size of the segment
domainParticipantQos.transport().user_transports.push_back(shm_transport);

and rebuild only the rmw_fastrtps package.
Currently I'm using the binary package installation, so I would like to avoid having to deploy a custom-built rmw_fastrtps package.
Currently tried with:
<?xml version="1.0" encoding="UTF-8" ?>
<profiles xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <transport_descriptors>
        <!-- Create a descriptor for the new transport -->
        <transport_descriptor>
            <transport_id>shm_transport_only</transport_id>
            <type>SHM</type>
            <segment_size>12582912</segment_size>
        </transport_descriptor>
    </transport_descriptors>
    <participant profile_name="DisableBuiltinTransportsParticipant">
        <rtps>
            <!-- Link the Transport Layer to the Participant -->
            <userTransports>
                <transport_id>shm_transport_only</transport_id>
            </userTransports>
            <useBuiltinTransports>false</useBuiltinTransports>
        </rtps>
    </participant>
</profiles>
It doesn't complain about segment_size, whereas it did when I tried segmentSize. But it doesn't seem to have an effect (I commented out the shared memory setup in the HelloWorldSharedMem example).
Messages get dropped when larger than 0.5MB - using shared memory - QoS is BEST_EFFORT

This is expected. Shared memory or not, setting Best Effort means there is always the possibility of dropping messages.
This is not a bounded data type, so it cannot use LoanedMessage nor Data Sharing Delivery.
See also https://github.com/eProsima/Fast-DDS/discussions/4276
@dk-teknologisk-lag after all, I suggest you try LoanedMessage; the message data type must be bounded. (Underneath, rmw_fastrtps will use Data Sharing Delivery to achieve zero-copy data sharing.)
Here is the demo code: https://github.com/ros2/demos/blob/rolling/demo_nodes_cpp/src/topics/talker_loaned_message.cpp
I'm afraid your participant profile is missing the is_default_profile="true" attribute, see for instance here.
Yeah, I understand that. But since sending from an actual sensor to a PC can run at the full 20 Hz, with a somewhat compressed point cloud format resulting in 16 MB/s for e.g. an Ouster OS1 lidar, it seems horrible if we can't get 20 Hz in IPC out of the box. But as I experienced, increasing the segment_size, i.e. the shared memory buffer, seems to alleviate the dropped messages.
Yes, it's an unbounded type and hence limited to the shared memory feature rather than loaned messages. That could for sure be interesting to look into, but it would require a change in the Ouster driver itself, which is a bit out of scope for our current project.
If we run into CPU overload or timing issues for lidar odometry or something similar, we might try the loaned message API.
Think I did try that as well; currently debugging to figure out when and how the XML files are parsed. But if is_default_profile is set, then... yeah, the default profiles get set to those values?
So when this is executed: https://github.com/ros2/rmw_fastrtps/blob/4d0be32e6c455edbf708003dffb67b11d512c5a6/rmw_fastrtps_shared_cpp/src/participant.cpp#L163
I should get the xml default values here?
Tried to get it working with the modified examples (only the HelloWorldSharedMem) from here: https://github.com/eProsima/Fast-DDS/compare/master...dk-teknologisk-lag:Fast-DDS:bestefforthelloworld
But looking more closely, it doesn't seem to use any default QoS, but creates its own - or should it work here as well?
But thanks for the suggestion, will try again tomorrow.
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <!-- Create a descriptor for the new transport -->
            <transport_descriptor>
                <transport_id>shm_transport</transport_id>
                <type>SHM</type>
                <segment_size>10485760</segment_size>
            </transport_descriptor>
        </transport_descriptors>
        <participant profile_name="SHMParticipant" is_default_profile="true">
            <rtps>
                <!-- Link the Transport Layer to the Participant -->
                <userTransports>
                    <transport_id>shm_transport</transport_id>
                </userTransports>
            </rtps>
        </participant>
    </profiles>
</dds>
Using this configuration works: it can significantly reduce the packet loss rate. But even with an increased segment size (tested 30 MB), some packet loss still occurs.
RMW_FASTRTPS_USE_QOS_FROM_XML=1 FASTRTPS_DEFAULT_PROFILES_FILE=my_config.xml ros2 run cpp_pubsub talker --ros-args -p freq:=10 -p bytesize:=10000000
RMW_FASTRTPS_USE_QOS_FROM_XML=1 FASTRTPS_DEFAULT_PROFILES_FILE=pub_sub_config.xml ros2 run cpp_pubsub listener
It seems to somewhat work, yes. But it seems to add an additional buffer; take a look at the screenshot below:
The red square marks when I launched with the XML file you provided. It creates a buffer of 0.5 MB and one of 10.5 MB.
The blue is launched with the XML, but with segment_size commented out, which seems to create a default-sized buffer, i.e. there are two of 0.5 MB.
The green is when launched without an XML file, which then just creates a single buffer of 0.5 MB.
So it seems it doesn't use the buffer supplied from the XML, and that's probably why we still see the packet loss.
Hi @dk-teknologisk-lag,
The second buffer is there because you did not disable the builtin SHM transport, so you're adding a second one. Please try with the following:
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <!-- Create a descriptor for the new transport -->
            <transport_descriptor>
                <transport_id>shm_transport</transport_id>
                <type>SHM</type>
                <segment_size>10485760</segment_size>
            </transport_descriptor>
        </transport_descriptors>
        <participant profile_name="SHMParticipant" is_default_profile="true">
            <rtps>
                <!-- Link the Transport Layer to the Participant -->
                <userTransports>
                    <transport_id>shm_transport</transport_id>
                </userTransports>
                <useBuiltinTransports>false</useBuiltinTransports>
            </rtps>
        </participant>
    </profiles>
</dds>
I don't even seem to be able to disable the SHM transport, e.g. like this (borrowed from https://github.com/eProsima/Fast-DDS/issues/2287):
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <transport_descriptor>
                <transport_id>udp_transport</transport_id>
                <type>UDPv4</type>
            </transport_descriptor>
        </transport_descriptors>
        <participant profile_name="/topic">
            <rtps>
                <userTransports>
                    <transport_id>udp_transport</transport_id>
                </userTransports>
                <useBuiltinTransports>false</useBuiltinTransports>
            </rtps>
        </participant>
    </profiles>
</dds>
Ahh, thanks. Your suggested configuration seems to work - wonder why it didn't work with the previous one, so it just used UDP?
Ahh, I missed the is_default_profile="true" - it seems to work also with UDP, using 95 MB/s on the loopback interface.
Thanks a lot for the help. Think we can close this, unless the default should be something other than 0.5 MB, which seems quite low for a ROS application?
So this can't be configured per topic, since it requires is_default_profile="true" to be added?
Is it only the QoS that can be configured per topic?
https://fast-dds.docs.eprosima.com/en/latest/fastdds/ros2/ros2_configure.html#example
Nevermind, I guess I can just omit the XML config file for those nodes that don't require a large amount of shared memory.
One more question: why do both the publisher and subscriber create a shared memory buffer? According to this diagram, https://fast-dds.docs.eprosima.com/en/latest/fastdds/transport/shared_memory/shared_memory.html#definition-of-concepts, the shared memory on the subscriber side is not used?
On the subscriber side, I think you don't need to set the segment size.
Yeah, I just tried making another config with the defaults for the subscriber, and it works well, but it still creates the smaller shared memory; I guess some of that is used for discovery?
On a side note, I can't get ros2 topic list to show the topic if I run the publisher with the custom XML profile, even if I launch ros2 topic list with the same XML file.
ros2 topic echo doesn't work either.
And ros2 topic hz works, but shows only 15 Hz when I publish at 50; that might be the transition to Python - at least one core is maxed out, which seems to be the bottleneck.
If I disable all shared memory and run over UDP it works fine, even though ros2 topic hz still only shows about 15 Hz...
Seems to be what I will go for, for now.
This is because you'd need to run ros2 daemon stop before calling ros2 topic list again. You probably had a daemon running with the default transports, which means discovery over UDP only. In any case, another thing you can do is add a transport descriptor for a UDP transport to your XML and let participants have both. That way you'd have the same as you would by default, but with larger segments in the SHM transport.
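A combined profile along those lines might look like this (a sketch reusing the descriptor style from the earlier messages; the SHMAndUDPParticipant profile name is made up):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <!-- SHM with an enlarged segment -->
            <transport_descriptor>
                <transport_id>shm_transport</transport_id>
                <type>SHM</type>
                <segment_size>10485760</segment_size>
            </transport_descriptor>
            <!-- Stock UDPv4 for discovery and off-host traffic -->
            <transport_descriptor>
                <transport_id>udp_transport</transport_id>
                <type>UDPv4</type>
            </transport_descriptor>
        </transport_descriptors>
        <participant profile_name="SHMAndUDPParticipant" is_default_profile="true">
            <rtps>
                <userTransports>
                    <transport_id>shm_transport</transport_id>
                    <transport_id>udp_transport</transport_id>
                </userTransports>
                <useBuiltinTransports>false</useBuiltinTransports>
            </rtps>
        </participant>
    </profiles>
</dds>
```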
Regarding the reader side segment:
Ah, yeah okay. It works fine now that I restarted the Docker container, but ros2 daemon stop would probably be fine too!
Thanks for the info.
Think we can close this, unless default should be something else than a 0.5MB, which seems quite low in a ROS application?
@EduPonz do you think this is something we should adjust in rmw_fastrtps? As far as I know we do not have this kind of setting in rmw_fastrtps, right? So this can be moved to https://github.com/eProsima/Fast-DDS? I am not sure if we want to set or change the default for ROS 2; small or big is really application-dependent.
If we are not changing any default, I think we can close this issue.
From my viewpoint, things should work out of the box. Generally, you should be able to send small messages even if you have allocated a "large" shared memory pool, but the other way around leads to packet drops, hence this issue.
How large the default should be is of course a bit difficult to guess, but one could look towards large point clouds or 8K resolution images from cameras and set that as a target point; that should probably cover most cases.
The only downside is that you can run out of shared memory. In a default Docker container it's only 64 MB, but you get a nice error message that space could not be allocated if you run short of it.
In comparison, we have a NUC PC with about 7 GB of shared memory, and my laptop has 32 GB. A default of 10 or 20 MB would only be a small subset of those. Double the size if it's not easy to set a lower value for subscribers (see below).
If the packets you try to send are larger than the shared memory available, you get no warning or error - just a lower rate and dropped messages.
One more question:
Is it possible to configure one SHM setting for publishers and a second for subscribers? I find it quite unfortunate that I have to prefix all ros commands with
RMW_FASTRTPS_USE_QOS_FROM_XML=1 FASTRTPS_DEFAULT_PROFILES_FILE=my_config.xml
I think the QOS_FROM_XML part can be omitted in my case, but still.
It would be nice to be able to set it for the entire system, instead of for X number of sensor nodes.
An alternative to increasing the default value could be to parameterize it, so that when you create a publisher you can specify the amount of shared memory, which the driver maintainer can then estimate based on what is optimal for each of their drivers/sensors.
I would like to add that I have run into this exact issue trying to view (I think) small images in rqt, where anything over 420x420 resolution plays extremely poorly. This happens when rqt can no longer get the entire message through SHM, and I guess it struggles with the UDP fallback. I absolutely believe that the ROS defaults should be changed to have an SHM pool large enough for rqt to work with an average webcam.
Just to add some more insight into the configuration options: for large data transmissions we have max_msg_size and sockets_size to adjust, among other things, the size of the SHM segments.
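In recent Fast-DDS releases these options can even be set without an XML file, via the builtin-transports environment variable. A sketch (the LARGE_DATA mode and its max_msg_size/sockets_size options exist only in newer Fast-DDS versions, so check your version before relying on this):

```shell
# Reusing the talker from earlier in the thread; LARGE_DATA tunes the builtin
# transports for big samples, and the options grow the SHM segment and socket
# buffers enough to fit a 10 MB message.
export FASTDDS_BUILTIN_TRANSPORTS="LARGE_DATA?max_msg_size=10MB&sockets_size=10MB"
ros2 run cpp_pubsub talker --ros-args -p freq:=10 -p bytesize:=10000000
```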
I think it would probably be better to have rmw_fastrtps configuration and settings for these kinds of special cases documented in https://docs.ros.org/en/rolling/. We already have some information in the rmw_fastrtps repo, e.g. https://github.com/ros2/rmw_fastrtps?tab=readme-ov-file#large-data-transfer-over-lossy-network, but that is not where users would check.
Bug report
Steps to reproduce issue
It all stems from transferring point cloud data from Ouster's ROS 2 driver to any subscriber - ros2 bag / ros2 topic echo / hz etc.; these also indicate dropped messages. The minimal example here can reproduce it, though. As far as I know, all sensors have their publishing QoS set to BEST_EFFORT, so this is also the case in this example.
Expected behavior
Messages get sent and received with the required frequency
Actual behavior
Messages get dropped occasionally, getting worse the higher the frequency or the larger the message size. See image: ![image](https://github.com/ros2/rmw_fastrtps/assets/127200805/29716801-4919-44cc-ac8a-43208299660b)
Additional information
I have searched everywhere for a solution, but the majority of suggestions are to change buffer sizes, which doesn't seem applicable here, since it uses shared memory. As seen in the image, it seemingly only uses around 1.5 MB and has up to 64 MB available.