ocaml-multicore / multicoretests

PBT testsuite and libraries for testing multicore OCaml
https://ocaml-multicore.github.io/multicoretests/
BSD 2-Clause "Simplified" License

Add explicit macOS Intel+ARM64 runners #458

Closed jmid closed 3 months ago

jmid commented 4 months ago

For quite some time, GitHub Actions only supported macOS on Intel hardware. That has changed recently: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

Interestingly, we have been using macos-latest, which then changed hardware under our feet. I noticed because the once-per-week 5.0.0 and 5.1.1 workflows started finding an STM Sys parallel counterexample relatively consistently (details below).

This PR therefore adds explicit macOS Intel (macos-13) and ARM64 (macos-14) workflows, collecting both in the same GitHub Actions CI "backend". I'll write a separate PR to remove the macOS ARM64 runners from multicoretests-ci.

Examples of the two latest runs:

https://github.com/ocaml-multicore/multicoretests/actions/runs/9048073196/job/24860738931

version: 5.0.0
architecture: arm64
system: macosx

random seed: 139266489
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Sys test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Sys test sequential (generating)
[✓] 1000    0    0 1000 / 1000     4.6s STM Sys test sequential

[ ]    0    0    0    0 /  200     0.0s STM Sys test parallel
[✗]  143    0    1  142 /  200    42.2s STM Sys test parallel

--- Failure --------------------------------------------------------------------

Test STM Sys test parallel failed (23 shrink steps):

                             |           
                     Mkdir ([], "ddd")   
                             |           
                  .---------------------.
                  |                     |           
          Rmdir ([], "ddd")     Rmdir ([], "ddd")   

https://github.com/ocaml-multicore/multicoretests/actions/runs/9048277550/job/24861161644

version: 5.1.1
architecture: arm64
system: macosx

random seed: 222784250
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Sys test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Sys test sequential (generating)
[✓] 1000    0    0 1000 / 1000     3.6s STM Sys test sequential

[ ]    0    0    0    0 /  200     0.0s STM Sys test parallel
[✗]   44    0    1   43 /  200    31.8s STM Sys test parallel

--- Failure --------------------------------------------------------------------

Test STM Sys test parallel failed (26 shrink steps):

                             |           
                     Mkdir ([], "ddd")   
                             |           
                  .---------------------.
                  |                     |           
          Rmdir ([], "ddd")     Rmdir ([], "ddd")
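
For readers unfamiliar with the diagram, the shrunk counterexample can be approximated outside the test suite. The following is a minimal sketch, not the STM Sys test itself: the helper names (`try_rmdir`, `run_once`, `explained`), the temporary path, and the iteration count are all hypothetical, and it assumes OCaml 5 for `Domain`. It creates a directory, removes it from two domains in parallel, and counts the runs in which the pair of results is not "exactly one `Rmdir` succeeded", which is what either sequential ordering would give. It only looks at success versus failure; the real model may also compare error messages.

```ocaml
(* Hypothetical reproduction sketch; requires OCaml 5 (Domain). *)

(* Attempt to remove [path], turning Sys_error into a result. *)
let try_rmdir path =
  match Sys.rmdir path with
  | () -> Ok ()
  | exception Sys_error msg -> Error msg

(* Mkdir followed by two parallel Rmdir calls, as in the counterexample. *)
let run_once path =
  Sys.mkdir path 0o755;
  let d1 = Domain.spawn (fun () -> try_rmdir path) in
  let d2 = Domain.spawn (fun () -> try_rmdir path) in
  let r1 = Domain.join d1 in
  let r2 = Domain.join d2 in
  (r1, r2)

(* A result pair is explained by a sequential interleaving
   when exactly one of the two removals succeeded. *)
let explained = function
  | (Ok (), Error _) | (Error _, Ok ()) -> true
  | _ -> false

let () =
  let path =
    Filename.concat (Filename.get_temp_dir_name ()) "ddd_sketch" in
  let unexplained = ref 0 in
  for _ = 1 to 100 do
    (* Clean up in case a previous run left the directory behind. *)
    if Sys.file_exists path then Sys.rmdir path;
    if not (explained (run_once path)) then incr unexplained
  done;
  Printf.printf "unexplained outcomes: %d / 100\n" !unexplained
```

Whether such a loop ever reports an unexplained outcome will likely depend on the OS and file system of the runner, so a count of zero on one machine says little about another.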

jmid commented 4 months ago

CI was failing due to

but is otherwise green.

I've added fixes for these in f75289c and 4fda9c2

jmid commented 4 months ago

This should fix #359 - a source of occasional false alarms.

jmid commented 3 months ago

CI summary

In comparison, this PR's macOS-ARM64 5.2.0~beta2 and trunk workflows triggered a counterexample after 17 and 122 attempts, respectively.

Out of 38 workflows 3 failed with 1 genuine issue and 2 borderline false alarms / CI issues.

jmid commented 3 months ago

In an attempt to trigger an error on the in-house silicon too, adff14c bumps the count to 2500, as already done in #304. As this is a negative test, this should only matter to the macOS ARM64 runners on multicoretests-ci, which have been hitting the 1000 count. I can however see that this does not make a difference :shrug:

Reminding myself of the macOS OCurrent setup from https://tarides.com/blog/2023-08-02-obuilder-on-macos/, I suspect the test is witnessing the difference between ZFS and whatever the GitHub Actions runners use underneath. Ideally, the test suite should hold up against changes to the underlying file system. However

As such, collecting the macOS runners under the same CI system seems reasonable.

jmid commented 3 months ago

CI summary:

In comparison, this PR's macOS-ARM64 5.2.0~beta2 and trunk workflows triggered a counterexample after 22 and 54 attempts, respectively.

The latter two failures will disappear with the merge and deployment of https://github.com/ocurrent/multicoretests-ci/pull/36

Out of 38 workflows 3 failed with 1 genuine issue and 2 borderline false alarms / CI issues.

jmid commented 3 months ago

Merging this one as ocurrent/multicoretests-ci#36 has now been merged

jmid commented 3 months ago

CI summary for merge to main:

None of the 37 workflows failed :tada: