Closed: mauropasse closed this issue 5 years ago
Thanks for the report, I will investigate.
@mauropasse, I suspect what is happening is that the DPS node thread is getting starved and the callbacks that free the allocated publish requests are backing up. The follow-up question to answer is why your log shows the publisher using up half the CPU. I'm continuing to investigate...
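As an illustration of that failure mode, here is a minimal standalone sketch (not rmw_dps or DPS code): each publish allocates a request that is only released by a completion callback serviced on a single worker thread; if that thread is starved, the outstanding requests pile up and RSS grows.

// Minimal sketch (not rmw_dps/DPS code) of how memory grows when the thread
// running the "free this publish request" callbacks cannot keep up with the
// publish rate.
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

struct PublishRequest {
    std::vector<std::uint8_t> payload;  // ~64 KB per message, as in the report
    explicit PublishRequest(std::size_t n) : payload(n) {}
};

int main() {
    std::mutex mtx;
    std::condition_variable cv;
    std::deque<std::unique_ptr<PublishRequest>> pending;  // requests awaiting their completion callback
    bool done = false;

    // "Node" thread: runs the completion callbacks that release requests,
    // artificially slowed down to mimic starvation.
    std::thread node_thread([&] {
        for (;;) {
            std::unique_ptr<PublishRequest> req;
            {
                std::unique_lock<std::mutex> lk(mtx);
                cv.wait(lk, [&] { return done || !pending.empty(); });
                if (pending.empty()) {
                    break;
                }
                req = std::move(pending.front());
                pending.pop_front();
            }
            std::this_thread::sleep_for(std::chrono::milliseconds(5));  // starved: slower than the publisher
            // req goes out of scope here: the delayed release of the request
        }
    });

    // Publisher loop: allocates one request per publish, faster than they are freed,
    // so the "pending" queue (and the process RSS) keeps growing.
    for (int i = 0; i < 1000; ++i) {
        {
            std::lock_guard<std::mutex> lk(mtx);
            pending.push_back(std::make_unique<PublishRequest>(64 * 1024));
            if (i % 100 == 0) {
                std::printf("outstanding requests: %zu\n", pending.size());
            }
        }
        cv.notify_one();
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    {
        std::lock_guard<std::mutex> lk(mtx);
        done = true;
    }
    cv.notify_one();
    node_thread.join();
    return 0;
}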
@mauropasse, https://github.com/ros2/rmw_dps/pull/35 should address the memory growth. Please let me know.
Thanks for solving it so fast @malsbat! It works perfectly now. Tested on x86_64 and RPi, and no more memory issues due to high-rate publishing. Closing the issue here.
@malsbat, if the node is configured in rmw_node.cpp to only receive multicast messages, the app crashes:
ret = DPS_StartNode(node_impl->node_, DPS_MCAST_PUB_ENABLE_RECV, 0);
./subscriber_lambda
terminate called after throwing an instance of 'rclcpp::exceptions::RCLError'
what(): failed to publish message: error not set, at /root/ros2_ws/src/ros2/rcl/rcl/src/rcl/publisher.c:257
Aborted (core dumped)
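For context, this is roughly the difference between the two configurations, assuming the rmw_dps default enables both multicast send and receive (an illustration, not a quote of the actual rmw_node.cpp):

// Presumed default: multicast publish send and receive both enabled.
ret = DPS_StartNode(node_impl->node_, DPS_MCAST_PUB_ENABLE_SEND | DPS_MCAST_PUB_ENABLE_RECV, 0);
// Receive-only configuration that triggers the failed publish above:
ret = DPS_StartNode(node_impl->node_, DPS_MCAST_PUB_ENABLE_RECV, 0);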
@mauropasse, I noticed this late yesterday as well while doing some experiments on the CPU usage of multicast. The issue is in DPS: the publish callback reports an error if there were no destinations to publish to, when it should only report failure if there were destinations and no publish succeeded.
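Roughly, the intended check is the following (illustrative pseudologic with made-up names, not the actual DPS source):

/* Hypothetical helper, for illustration only: report failure only when
 * there were destinations and none of the sends succeeded. */
static int publish_failed(size_t num_destinations, size_t num_sends_succeeded)
{
    if (num_destinations == 0) {
        return 0;   /* nothing to publish to: not an error */
    }
    return num_sends_succeeded == 0;
}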
I'm going to merge the PR since it addresses the memory issue.
Ok, thanks @malsbat
@mauropasse, a fix for the failing publish is in progress at https://github.com/intel/dps-for-iot/pull/110. Travis seems a bit slow today; I am waiting for the CI builds to pass and then I will merge it.
That's great @malsbat! I'll test it soon!
https://github.com/intel/dps-for-iot/pull/110 has been merged
I've noticed that when publishing big messages at high rates, there's a constant increase in RSS (and virtual) memory, even if the subscriber gets all the messages.
To reproduce the situation, clone the ros2 examples repo and apply the attached patch (patch.log). The patch makes publisher_lambda publish messages of 64.5 KB (close to the maximum UDP datagram size, the largest message rmw_dps can send) at a high frequency, while subscriber_lambda prints the messages and checks whether any are lost (subscriber_lambda should be run before publisher_lambda to properly detect lost messages).

The output of top in terminal 2 shows an increase in memory. In terminal 1, subscriber_lambda shows all the messages it got from the publisher; none are lost. When running this test on the RPi2, I can still see the memory increasing with publish frequencies from 62 Hz.
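The patched publisher is roughly of the following shape (an illustrative sketch, not the actual contents of patch.log): an rclcpp node publishing ~64.5 KB std_msgs/String messages on a fast timer, with a counter so the subscriber can detect lost messages.

// Illustrative sketch of the patched publisher_lambda (not the actual patch.log).
#include <chrono>
#include <cstddef>
#include <string>
#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/string.hpp"

using namespace std::chrono_literals;

int main(int argc, char * argv[])
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("publisher_lambda");
  auto publisher = node->create_publisher<std_msgs::msg::String>("topic", 10);

  const std::string padding(64 * 1000, 'x');   // pad the payload up to ~64.5 KB
  std::size_t count = 0;
  std_msgs::msg::String msg;

  auto timer = node->create_wall_timer(1ms, [&]() {       // high publish rate
      msg.data = std::to_string(count++) + ":" + padding; // counter lets the subscriber spot gaps
      publisher->publish(msg);
    });

  rclcpp::spin(node);
  rclcpp::shutdown();
  return 0;
}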
Any ideas why this could be happening?