ros2 / ci

ROS 2 CI Infrastructure
http://ci.ros2.org/
Apache License 2.0

Parallelize testing #723

Closed: cottsay closed this 6 months ago

cottsay commented 1 year ago

This change switches from sequential test execution to parallel test execution.

This is now possible because the colcon-ros-domain-id-coordinator has been enabled on ci.ros2.org, which assigns a different ROS_DOMAIN_ID value to each colcon task. Historically, the main reason we tested sequentially was that ROS 2 tests running at the same time in the same domain would step on each other.
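To illustrate the idea (this is not the coordinator's actual implementation), here is a minimal Python sketch that gives each concurrently running task its own ROS_DOMAIN_ID through the environment. The helper names `assign_domain_ids`, `run_task`, and `run_all`, the package names, and the 1 to 101 ID range are assumptions made for the example; the "test" is a stand-in child process that only reports the domain ID it sees.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def assign_domain_ids(task_names, first_id=1, last_id=101):
    """Map each task to a unique domain ID (hypothetical helper).

    The default ID range is an assumption for this sketch.
    """
    ids = range(first_id, last_id + 1)
    if len(task_names) > len(ids):
        raise ValueError("more concurrent tasks than available domain IDs")
    return dict(zip(task_names, ids))


def run_task(name, domain_id):
    """Run a stand-in test process with ROS_DOMAIN_ID set in its environment."""
    env = dict(os.environ, ROS_DOMAIN_ID=str(domain_id))
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['ROS_DOMAIN_ID'])"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # Return the domain ID the child process actually observed.
    return name, int(result.stdout.strip())


def run_all(task_names):
    """Run all tasks in parallel, each isolated in its own domain."""
    assignments = assign_domain_ids(task_names)
    with ThreadPoolExecutor(max_workers=max(1, len(task_names))) as pool:
        futures = [pool.submit(run_task, n, d) for n, d in assignments.items()]
        return dict(f.result() for f in futures)


if __name__ == "__main__":
    # Hypothetical package names, used only for illustration.
    print(run_all(["pkg_a", "pkg_b", "pkg_c"]))
```

Because every child process sees a different ROS_DOMAIN_ID, tests from different tasks would not discover each other's nodes even when they run at the same time.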

Most of the remaining regressions under parallel testing stem from system resource contention upsetting tests that make timing assumptions. In theory, these tests could also fail when simply run on a slower system or while another resource-intensive task is running on the host, so it is probably worth the effort to resolve those issues independently of the parallel testing effort.
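As a generic illustration of what removing a timing assumption can look like (not taken from any specific ROS 2 test), the sketch below replaces a fixed sleep with polling against a generous deadline. The helper `wait_for` and the `demo` function are hypothetical names for this example.

```python
import threading
import time


def wait_for(condition, timeout=10.0, interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Unlike a fixed sleep, this returns as soon as the condition holds and
    tolerates a slow or heavily loaded host up to the timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check so a condition that became true right at the deadline counts.
    return bool(condition())


def demo():
    """Simulate background work that finishes after an unpredictable delay."""
    ready = threading.Event()
    threading.Timer(0.2, ready.set).start()
    # Fragile version: time.sleep(0.25) followed by assert ready.is_set()
    return wait_for(ready.is_set, timeout=5.0)


if __name__ == "__main__":
    print(demo())
```

The timeout only bounds the failure case, so it can be set generously without slowing down runs where the condition is met quickly.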

To start with, I'd like to deploy this change only to the .*nightly.* jobs over the weekend and revert the change the following weekday to see what new issues shake out. After triage, I'll flip this to "ready" and we can consider merging and deploying it fully.

cottsay commented 7 months ago

Early results from this effort (a recap):

A recent job comparison shows a 37.5% reduction in build time: