Related to that, we measured latencies with ROS 2 Foxy varying different parameters (payload, number of nodes, DDS middleware, frequency...) and submitted a paper. You can find the preprint on arXiv: https://arxiv.org/pdf/2101.02074.pdf
The evaluation code can be found on GitHub.
Thanks, @EduPonz, for your comments. I will continue the discussion related to real-time statistics in this thread. These are the different options I'm aware of:
Some topics I would like to discuss:
I'm opening this issue to trigger the discussion on the best methods to analyze, interpret, and compare the performance results obtained with the methods decided in #6.
Right now, `buildfarm_perf_test` does a very slim and probably inadequate processing of the results, basically showing mean and standard deviation without even looking at the probability distributions, which can lead to wrong assessments in, for instance, regression detection. Since Apex.AI's `performance_test` already produces a "means" entry every second, I thought about benefiting from the central limit theorem and performing Gaussian statistics over those distributions. Mind that having normal distributions of the measurements would ease the comparisons, since that enables statistics such as Student's t-test, which can assess the significance of the difference between two experiments. However, I encountered some problems with this approach.

The above made me think that we have to develop a system that can decide which statistical test to run between experiments so that it yields the most relevant information. Such a system could then be used for detecting regressions in CI builds, and also as a way to present performance results to end users, whose interpretation can be used to draw fair and relevant conclusions about the performance of the stack under different configurations.
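To make the "decide which test to run" idea a bit more concrete, here is a minimal sketch of what such a comparison could look like. It is only an illustration using scipy; the function name and the assumption that each run is exported as a list of per-second mean latencies are mine, not anything that exists in `buildfarm_perf_test` today:

```python
# Rough sketch (not existing buildfarm_perf_test code) of choosing a statistical
# test between two experiment runs. Each input is assumed to be the list of
# per-second mean latencies reported by the performance tool.
import numpy as np
from scipy import stats

def compare_runs(baseline_ms, candidate_ms, alpha=0.05):
    """Return (test_name, p_value, regressed) for two latency samples."""
    baseline = np.asarray(baseline_ms, dtype=float)
    candidate = np.asarray(candidate_ms, dtype=float)

    # Shapiro-Wilk normality check decides whether a parametric test is valid.
    _, p_base = stats.shapiro(baseline)
    _, p_cand = stats.shapiro(candidate)

    if p_base > alpha and p_cand > alpha:
        # Welch's t-test: compares means without assuming equal variances.
        test_name = "welch_t"
        _, p_value = stats.ttest_ind(baseline, candidate, equal_var=False)
    else:
        # Non-parametric fallback when either sample is clearly not normal.
        test_name = "mann_whitney_u"
        _, p_value = stats.mannwhitneyu(baseline, candidate,
                                        alternative="two-sided")

    regressed = p_value < alpha and candidate.mean() > baseline.mean()
    return test_name, p_value, regressed
```

A CI regression gate built on top of something like this should probably also require a minimum effect size, since with enough samples even an irrelevant microsecond-level difference becomes statistically significant (see the requirements point below).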
Furthermore, the ROS 2 middlewares allow for a great number of configurations, many of which have an impact on the performance of the stack. I think it would be very important to define testing profiles and publish results for each of them, so that end users can select the profile from which they will benefit the most.
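As an illustration of what I mean by testing profiles, something along these lines could work (the profile names, fields, and values are purely hypothetical, not an agreed format):

```python
# Hypothetical profile definitions; the names, fields, and values are only
# illustrative and not an agreed-upon format for buildfarm_perf_test.
TEST_PROFILES = {
    "teleop_small_messages": {
        "payload_bytes": 64,
        "rate_hz": 100,
        "num_subscribers": 1,
        "qos_reliability": "reliable",
        "rmw_implementation": "rmw_fastrtps_cpp",
    },
    "sensor_streaming": {
        "payload_bytes": 2 * 1024 * 1024,  # e.g. an uncompressed camera frame
        "rate_hz": 30,
        "num_subscribers": 3,
        "qos_reliability": "best_effort",
        "rmw_implementation": "rmw_cyclonedds_cpp",
    },
}
```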
It would also be very helpful for users and developers to set performance requirements on the different profiles. In my opinion, we sometimes dwell on latency differences down to the microsecond, but I really don't think any robotic system cares about such a small difference, especially because the control system would never run at such high frequencies. From the users' perspective, it is not a question of who gives the very best performance, but rather of who can meet my requirements. This approach would push development to meet all the requirements in every direction, improving the overall ROS 2 experience.
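To sketch what requirement-based evaluation on top of those profiles could look like (again, hypothetical names and thresholds, chosen only for illustration):

```python
# Hypothetical per-profile latency budgets; a run "passes" if its measured
# latency percentile stays within the budget, instead of being ranked against
# other implementations microsecond by microsecond.
import numpy as np

LATENCY_BUDGET_MS = {
    "teleop_small_messages": 5.0,   # a 100 Hz control loop tolerates a few ms
    "sensor_streaming": 33.0,       # roughly one frame period at 30 Hz
}

def meets_requirement(profile, latencies_ms, percentile=99.0):
    """True if the chosen latency percentile is within the profile's budget."""
    return float(np.percentile(latencies_ms, percentile)) <= LATENCY_BUDGET_MS[profile]
```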