ros2 / ci

ROS 2 CI Infrastructure
http://ci.ros2.org/
Apache License 2.0

Disable Mimick tests on aarch64 (for now) #777

Closed cottsay closed 2 months ago

cottsay commented 2 months ago

This is the same approach that was used to disable cppcheck on RHEL.

```
$ ./create_jenkins_job.py --select-jobs-regexp .*aarch64.*
Connecting to Jenkins 'https://ci.ros2.org'
Connected to Jenkins version '2.319.2'
Skipped 'ci_linux-aarch64' because the config is the same (dry run)
Updating job 'test_ci_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'ci_packaging_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -161 +161 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'test_packaging_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -161 +161 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'packaging_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -161 +161 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'nightly_linux-aarch64_debug' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'nightly_linux-aarch64_release' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'nightly_linux-aarch64_repeated' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-fail 2 --ctest-args -LE "(linter|xfail)" --pytest-args -m "not linter and not xfail"
+ --event-handlers console_cohesion+ --retest-until-fail 2 --ctest-args -LE "(linter|(mimick|xfail))" --pytest-args -m "not linter and not xfail"
>>>
Skipped 'nightly_linux-aarch64_xfail' because the config is the same (dry run)
```

Standard aarch64 job: Build Status

Repeated aarch64 job: Build Status (to test more complex ctest -LE logic)
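For readers following the `-LE` changes in the dry run, here is a minimal sketch of how the nested exclusion expressions compose and which labels they would exclude. The helper names `wrap_label` and `is_excluded` are hypothetical and are not taken from `create_jenkins_job.py`; Python's `re.search` is used only as an approximation of how CTest matches label regular expressions.

```python
import re


def wrap_label(label, inner):
    """Wrap an existing exclusion expression so that `label` is excluded too."""
    return f"({label}|{inner})"


def is_excluded(test_labels, exclude_regex):
    """Approximate `ctest -LE`: a test is excluded if any label matches."""
    pattern = re.compile(exclude_regex)
    return any(pattern.search(label) for label in test_labels)


# The expressions seen in the dry-run output above.
base = "xfail"
aarch64 = wrap_label("mimick", base)       # used by the standard jobs
repeated = wrap_label("linter", aarch64)   # used by the repeated job

print(aarch64)
print(repeated)
print(is_excluded(["mimick"], repeated))
print(is_excluded(["gtest"], repeated))
```

Under these assumptions, the nested grouping in the repeated job behaves the same as a flat alternation of `linter`, `mimick`, and `xfail`.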

The changes to label the tests have been merged into the rolling branches of the relevant packages, but have not been backported to any other distros and have not yet been released. They should be there for nightlies, though.

I'm not 100% certain that this will play nicely with the launcher jobs :shrug: at worst, it simply won't be applied and we'll have to come up with another solution there.

cottsay commented 2 months ago

This change collides with #757, so unfortunately it doesn't propagate to ci_linux-aarch64. The argument logic here is a bit of a rat's nest.

cottsay commented 2 months ago

> This change collides with #757, so it doesn't propagate to ci_linux-aarch64 unfortunately.

Alright, I dropped in a workaround for that with 8121875.

New dry run:

```
$ ./create_jenkins_job.py --select-jobs-regexp .*aarch64.*
Connecting to Jenkins 'https://ci.ros2.org'
Connected to Jenkins version '2.319.2'
Updating job 'ci_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail" --executor sequential
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail" --executor sequential
>>>
Updating job 'test_ci_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'ci_packaging_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -161 +161 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'test_packaging_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -161 +161 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'packaging_linux-aarch64' (dry run)
<<<
--- remote config
+++ new config
@@ -161 +161 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'nightly_linux-aarch64_debug' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'nightly_linux-aarch64_release' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE xfail --pytest-args -m "not xfail"
+ --event-handlers console_cohesion+ --retest-until-pass 2 --ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"
>>>
Updating job 'nightly_linux-aarch64_repeated' (dry run)
<<<
--- remote config
+++ new config
@@ -178 +178 @@
- --event-handlers console_cohesion+ --retest-until-fail 2 --ctest-args -LE "(linter|xfail)" --pytest-args -m "not linter and not xfail"
+ --event-handlers console_cohesion+ --retest-until-fail 2 --ctest-args -LE "(linter|(mimick|xfail))" --pytest-args -m "not linter and not xfail"
>>>
Skipped 'nightly_linux-aarch64_xfail' because the config is the same (dry run)
```

clalancette commented 2 months ago

> This change collides with #757, so it doesn't propagate to ci_linux-aarch64 unfortunately. The argument logic here is a bit of a rat nest.

I honestly think we can drop --executor sequential at this point. We aren't seeing any failures due to this in the nightlies anymore. But that should probably be done in a separate PR, so I'm OK with this workaround for now.

cottsay commented 2 months ago

> I'm not 100% certain that this will play nicely with the launcher jobs 🤷 at worst, it simply won't be applied and we'll have to come up with another solution there.

Indeed, this is the case: the launcher's test arguments override the default for ci_linux-aarch64, so the Mimick exclusion is not applied to launcher-started builds. However, this should still work correctly for all non-launcher invocations of aarch64 jobs (including nightlies).
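To illustrate why the override drops the exclusion, here is a rough sketch. The function `effective_test_args` and the value of `LAUNCHER_TEST_ARGS` are assumptions made for illustration; the actual Jenkins parameter handling and the string the launcher submits may differ.

```python
from typing import Optional

# Per-job default for aarch64, as shown in the dry-run output.
AARCH64_DEFAULT_TEST_ARGS = (
    '--event-handlers console_cohesion+ --retest-until-pass 2 '
    '--ctest-args -LE "(mimick|xfail)" --pytest-args -m "not xfail"'
)

# Hypothetical value supplied by the launcher; it is platform-agnostic,
# so it knows nothing about the aarch64-specific exclusion.
LAUNCHER_TEST_ARGS = (
    '--event-handlers console_cohesion+ --retest-until-pass 2 '
    '--ctest-args -LE xfail --pytest-args -m "not xfail"'
)


def effective_test_args(job_default: str, launcher_value: Optional[str]) -> str:
    """An explicitly supplied parameter replaces the job default wholesale."""
    if launcher_value is None:
        return job_default
    return launcher_value


def excludes_mimick(test_args: str) -> bool:
    """Crude check for whether the Mimick label is part of the arguments."""
    return "mimick" in test_args


# A nightly or manual run keeps the default; a launcher run does not.
print(excludes_mimick(effective_test_args(AARCH64_DEFAULT_TEST_ARGS, None)))
print(excludes_mimick(
    effective_test_args(AARCH64_DEFAULT_TEST_ARGS, LAUNCHER_TEST_ARGS)))
```

Because the parameter is replaced rather than merged, any fix on the launcher side would have to apply to every platform the launcher starts.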

We could:

  1. Disable Mimick tests on all jobs started by the launcher, even non-aarch64 jobs
  2. Take a different route entirely and make the tests skip themselves on aarch64

clalancette commented 2 months ago

> 1. Disable Mimick tests on all jobs started by the launcher, even non-aarch64 jobs

I don't think we should do this. In particular, I'm fine with disabling the Mimick tests on aarch64 because they are highly unlikely to produce a different result there than on amd64. But I do think those tests provide value, so we should keep running them on at least one platform.

> 2. Take a different route entirely and make the tests skip themselves on aarch64

While I don't love this, I think this may be our only path forward at this time.
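As a rough illustration of the "skip themselves" idea, here is a sketch in Python. The affected tests are not Python tests, so this only shows the shape of the approach; `is_aarch64` and `MimickDependentTest` are hypothetical names, and a real change would live in the affected packages' own test code or build configuration.

```python
import platform
import unittest

# Machine names commonly reported for 64-bit ARM (Linux and macOS).
AARCH64_NAMES = {"aarch64", "arm64"}


def is_aarch64(machine=None):
    """Return True if the given (or current) machine is 64-bit ARM."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() in AARCH64_NAMES


class MimickDependentTest(unittest.TestCase):
    """Stand-in for a test that relies on Mimick-style mocking."""

    @unittest.skipIf(is_aarch64(), "Mimick-based tests are disabled on aarch64 for now")
    def test_with_mocking(self):
        # Placeholder body; a real test would exercise mocked functions.
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```

With this approach the skip decision travels with the test, so it applies regardless of which job or launcher arguments started the build.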