This change switches from sequential test execution to parallel test execution.
This is now possible because colcon-ros-domain-id-coordinator has been enabled on ci.ros2.org; it assigns a different ROS_DOMAIN_ID value to each colcon task. Historically, the main reason we tested sequentially was that ROS 2 tests running concurrently stepped on each other.
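To illustrate the idea of giving each concurrent task its own ROS_DOMAIN_ID, here is a minimal sketch. It is not the coordinator's actual implementation: the ID range, the `run_task`/`run_all` helpers, and the `ECHO_COMMAND` stand-in are all hypothetical, and the real extension may allocate IDs quite differently.

```python
import os
import queue
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool of domain IDs; the real coordinator's range may differ.
DOMAIN_IDS = range(1, 5)

# Stand-in for a real test command: prints the domain ID it was given.
ECHO_COMMAND = [sys.executable, "-c",
                "import os; print(os.environ['ROS_DOMAIN_ID'])"]


def run_task(pool, command):
    """Run one command with a domain ID that no other running task holds."""
    domain_id = pool.get()  # blocks until an ID is free
    try:
        env = dict(os.environ, ROS_DOMAIN_ID=str(domain_id))
        result = subprocess.run(command, env=env, capture_output=True,
                                text=True, check=True)
        return result.stdout.strip()
    finally:
        pool.put(domain_id)  # release the ID for the next task


def run_all(commands, workers=4):
    """Run commands in parallel; never more tasks at once than there are IDs."""
    pool = queue.Queue()
    for domain_id in DOMAIN_IDS:
        pool.put(domain_id)
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(lambda c: run_task(pool, c), commands))
```

Calling `run_all([ECHO_COMMAND] * 6)` returns the domain ID each task saw. Because an ID is only returned to the pool when its task finishes, two tasks running at the same time never share one.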
Most of the remaining regressions under parallel testing come from system resource contention upsetting tests that make timing assumptions. In theory, these tests could also fail when run on a slower system or while another resource-intensive task is running on the host, so it is probably worth resolving those issues independently of the parallel testing effort.
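As a rough illustration of what resolving a timing assumption can look like, one common approach is to poll for the expected condition up to a generous deadline instead of sleeping for a fixed interval. The `wait_for` helper below is hypothetical and not taken from any of the affected tests.

```python
import time


def wait_for(condition, timeout=10.0, interval=0.05):
    """Poll condition() until it returns True or the timeout expires.

    Returns True as soon as the condition holds, so a fast host is not
    slowed down, while a loaded host gets the full timeout to catch up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would then assert on `wait_for(lambda: message_received)` (with `message_received` standing in for whatever the test observes) rather than asserting immediately after a fixed `time.sleep(1)`.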
To start with, I'd like to deploy this change only to the .*nightly.* jobs over the weekend and revert the change the following weekday to see what new issues shake out. After triage, I'll flip this to "ready" and we can consider merging and deploying it fully.