INRIA CI is failing for Mac M1 because of tests/parallel/domain_parallel_spawn_burn_gc_set.ml. I'll explain why by looking at a very similar test that is more convenient to study because it takes less time to run.
Timings for tests/parallel/domain_parallel_spawn_burn.ml:

| hardware  | OS           | 5.2   | trunk  | #12579 |
|-----------|--------------|-------|--------|--------|
| Mac x64   | macos 14     | 13.08 | 13.77  | 4.80   |
| PC x64    | Ubuntu 22.04 | 13.97 | 13.40  | 6.15   |
| Mac M1    | macos 14     | 60.26 | 52.90  | 4.33   |
| Mac M2    | macos 14     | 39.99 | 40.83  | 4.12   |
| RasPi 5   | Debian 12    | 75.85 | 210.01 | 16.87  |
| RasPi 4B  | Debian 12    | 43.33 | 57.84  | 25.15  |

(trunk is at commit eabbb4002a0dabdbcf708f52a1db1d7a11618119)
As you can see, #12579 is a big improvement for this test. The improvement comes from commit 2992d123f7e76252355bb299de670a82911cfcdf which implements some fairness in the runtime.
The problem is that the test assumes fairness in the scheduler, which is not guaranteed by the language. The amount of work done by the test varies according to the decisions made by the scheduler, which explains the wide variations in timings.
To show this, I modified the program to print out the number of times it has requested minor and major GCs.
Minor loops / major loops:

| hardware  | OS              | 5.2             | trunk            | #12579      |
|-----------|-----------------|-----------------|------------------|-------------|
| Mac x86   | macos 14        | 356966 / 172352 | 258478 / 125616  | 1938 / 726  |
| PC x86    | Ubuntu 22.04    | 171050 / 81554  | 163590 / 97013   | 3590 / 1704 |
| Mac M1    | macos 14        | 456924 / 289306 | 621432 / 390233  | 1644 / 672  |
| Mac M2    | macos 14        | 769771 / 297987 | 1179670 / 520992 | 1903 / 716  |
| RasPi 5   | Debian 12 64bit | 370472 / 241345 | 124102 / 83500   | 2084 / 980  |
| RasPi 4B  | Debian 12 64bit | 113287 / 72835  | 227548 / 145320  | 2022 / 858  |

Note that these numbers are subject to large random variations.
The test is composed of two parts: one does a bounded amount of work (allocations and spawning/joining threads), and the other does an unbounded amount of work (allocations and calling Gc.set) and stops when the first one is finished.
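To make this structure concrete, here is a minimal sketch of a bounded part running next to two unbounded GC loops that count their iterations. It is not the code of the test: all names, sizes and the `burn` workload are hypothetical, and it assumes the "threads" are OCaml 5 domains, as the file names suggest.

```ocaml
(* Sketch only: names and sizes are hypothetical, chosen to run quickly. *)

(* Shared stop flag, cleared once the bounded part is done. *)
let running = Atomic.make true

(* Small allocating workload standing in for the test's real work. *)
let burn () = ignore (Sys.opaque_identity (List.init 1_000 (fun i -> i + 1)))

(* Unbounded part: loop until the flag is cleared, counting iterations. *)
let run_until_stop f () =
  let loops = ref 0 in
  while Atomic.get running do
    f ();
    incr loops
  done;
  !loops

(* Bounded part: a fixed number of spawn/join rounds. *)
let bounded_part ~rounds ~domains =
  for _ = 1 to rounds do
    let ds = Array.init domains (fun _ -> Domain.spawn burn) in
    Array.iter Domain.join ds
  done

let () =
  let minor = Domain.spawn (run_until_stop (fun () -> burn (); Gc.minor ())) in
  let major = Domain.spawn (run_until_stop (fun () -> burn (); Gc.major ())) in
  bounded_part ~rounds:10 ~domains:4;
  Atomic.set running false;
  let minor_loops = Domain.join minor in
  let major_loops = Domain.join major in
  (* Same "minor loops / major loops" shape as the table above. *)
  Printf.printf "%d / %d\n" minor_loops major_loops
```

In a layout like this, the two printed counts depend on how long the bounded part takes to finish, which is where the scheduler's decisions come in.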
To solve this problem I propose to run both parts with no bounds, and stop everything after 3 seconds. This way we don't waste our time waiting for tests to run for a random number of minutes.
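The proposal can be sketched as follows. This is an illustration under assumptions, not the code of this PR: the helper names (`run_for`, `run_until_stop`, `burn`) are hypothetical, the "threads" are assumed to be OCaml 5 domains, and the wall-clock limit relies on `Unix.sleepf`, so the program must be linked with the `unix` library.

```ocaml
(* Sketch only: every name below is hypothetical, not taken from the tests. *)

(* Shared stop flag read by all parts. *)
let running = Atomic.make true

(* Small allocating workload standing in for the tests' real work. *)
let burn () = ignore (Sys.opaque_identity (List.init 1_000 (fun i -> i + 1)))

(* Repeat [f] until the flag is cleared; return the iteration count. *)
let run_until_stop f () =
  let n = ref 0 in
  while Atomic.get running do
    f ();
    incr n
  done;
  !n

(* Run every part with no bound, then stop them all after [seconds]. *)
let run_for seconds parts =
  Atomic.set running true;
  let domains = List.map (fun f -> Domain.spawn (run_until_stop f)) parts in
  Unix.sleepf seconds;
  Atomic.set running false;
  List.map Domain.join domains

let () =
  (* The formerly bounded part: spawn and join a domain, over and over. *)
  let spawn_join () = Domain.join (Domain.spawn burn) in
  let counts =
    run_for 3.0
      [ spawn_join;
        (fun () -> burn (); Gc.minor ());
        (fun () -> burn (); Gc.major ()) ]
  in
  List.iter (Printf.printf "%d iterations\n") counts;
  print_endline "ok"
```

With this shape the running time is fixed by the timer (plus the time for the last iterations to finish), and only the iteration counts vary with the scheduler.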
The following tests all have the same problem and are fixed by this PR:
- tests/parallel/domain_parallel_spawn_burn.ml
- tests/parallel/domain_parallel_spawn_burn_gc_set.ml
- tests/parallel/domain_serial_spawn_burn.ml