Open Hemofektik opened 6 months ago
For the PR to proceed, you should revert the whitespace-only changes and force-push a new commit without them.
I will do that thanks.
How did you measure the time? What is the influence on a real running system?
I saw this issue actually using nsight profiler when one of our nodes tried to create around 50 subscribers/publishers which takes forever. I then checked this code piece individually and measured its duration. 35 ms is not much for a single pub/sub but it can add up. So this change could save us 1-2 seconds in some nodes of startup time. This is probably also visible when using nodelets heavily.
Interesting. Would it be possible to write a simple script (pair of scripts?) that would demonstrate this?
Interesting. Would it be possible to write a simple script (pair of scripts?) that would demonstrate this?
You mean as part of the tests? I could try to add a test that is calling this code and measures it.
In my particular case I copied the enoughFreeFDs()
method to my test program to measure it individually before and after.
Probably not as part of unit tests. The buildfarm performance varies too wildly to perform any kind of performance testing that would be able to measure such differences.
I meant something we could use manually when reviewing the PR.
I meant something we could use manually when reviewing the PR.
Here is a hybrid of the original code and the final code in the PR I used for validating this.
fd_test.zip
You can replace the content of enoughFreeFDs()
with the original parts from the ros_comm to see its original performance. You need to change the initFDs()
function then also to resize the vector to the original 8 MB size to capture the 1<<20 FDs.
I've tested the file you sent and indeed the proposed change makes it run much faster with higher ulimit -n
values.
Relevant issues and PRs mentioned that most systems have this limit set to 1024, so the speedup would not be noticeable. But in docker, the limit is 1048576 by default and there it starts to be visible.
I'm still thinking about how to create a real-world (but self-contained and minimal) ROS testcase that would showcase the problem using roscpp. Would you be able to adjust the old example from https://github.com/peci1/ros_infinite_loop/tree/docker to demonstrate the problem?
Would you be able to adjust the old example from https://github.com/peci1/ros_infinite_loop/tree/docker to demonstrate the problem?
Sure, but it would take some days until I have some spare time available.
I'm nevertheless not a maintainer of this repo, so don't take my comments as something required. I just try to help getting this PR in a form that would be easy to grasp for maintainers so that it could eventually get merged. There's not much maintainership effort left for ROS 1, so the official maintainers need this kind of community help.
I've tried running our teaching autonomous exploration pipeline with Trutlebot in Gazebo with this PR. Everything works as I'm used to. So at least practically, it seems be okay!
The polling of all file descriptors can take a lot of time especially on modern OSs where each process can open
1<<20
FDs. This PR addresses this issue and stops polling for FDs once the desired number of available FDs has been found. The heap memory needed for holding the FDs is also much smaller and no longer persistent throughout the lifetime of the XmlRpcServer.TL;DR