jmid opened 2 years ago
I've been trying to get some numbers comparing bug-triggering with `cpu_relax` and with a semaphore. I have some strange results (no buggy programs found over 10000 iterations while CI is happy with 1000...) that I don't understand yet, but it seems that synchronization with a semaphore is a bit faster than with `cpu_relax`:
```
$ dune exec -- src/neg_tests/conclist_stm_tests.exe
random seed: 138767447
generated error fail pass / total       time test name
[✓]    10000     0    0 10000 / 10000  103.1s STM int64 CList with cpu_relax
[✓]    10000     0    0 10000 / 10000   78.5s STM int64 CList with semaphore
================================================================================
success (ran 2 tests)
relax : 0 / 10000
semap : 0 / 10000
```
Code is here: https://github.com/n-osborne/multicoretests/blob/domain-stats/src/neg_tests/conclist_stm_tests.ml#L53 and here: https://github.com/n-osborne/multicoretests/blob/domain-stats/lib/STM.ml#L391
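To make the comparison concrete, here is a minimal sketch of the two start-up signalling styles being measured: a busy-wait on an `Atomic` with `Domain.cpu_relax` versus blocking on a stdlib `Semaphore.Binary`. The function names (`run_with_spin`, `run_with_sema`) are hypothetical; the actual code in the linked files differs.

```ocaml
(* Sketch: start the main workload only once the spawned domain is up.
   [run_with_spin] busy-waits on an Atomic flag, calling Domain.cpu_relax
   in the loop; [run_with_sema] blocks on a Semaphore.Binary instead. *)

let run_with_spin work1 work2 =
  let started = Atomic.make false in
  let d =
    Domain.spawn (fun () ->
        Atomic.set started true;          (* signal that the domain is up *)
        work2 ())
  in
  while not (Atomic.get started) do       (* spin until start-up is signalled *)
    Domain.cpu_relax ()
  done;
  work1 ();
  Domain.join d

let run_with_sema work1 work2 =
  let sem = Semaphore.Binary.make false in
  let d =
    Domain.spawn (fun () ->
        Semaphore.Binary.release sem;     (* signal start-up *)
        work2 ())
  in
  Semaphore.Binary.acquire sem;           (* block until released *)
  work1 ();
  Domain.join d
```

The trade-off measured above: spinning keeps the first domain's core busy (which may interleave the two workloads more tightly and trigger more bugs), while the semaphore lets the runtime block the domain (faster overall, but apparently less effective at provoking races).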
It is indeed interesting that the `Semaphore` is faster than the "Atomic waiting loop" :+1: :thinking:
I had a quick look:

* When an exception is raised, `mk_prop` does not increase the counter (I think it should).
* I also noticed that the stats tests are not using `repeat`. To be comparable to the CI's 1000 iterations, I would try to use it here too.

> When an exception is raised, `mk_prop` does not increase the counter (I think it should).

Yes, it works better that way.

> I also noticed that the stats tests are not using `repeat`.

That was just to have something a bit more accurate for speed.
So `Semaphore`s are indeed faster, but spot far fewer buggy programs. This is with `repeat 25 prop`:
```
$ dune exec -- src/neg_tests/conclist_stm_tests.exe
random seed: 300478220
generated error fail pass / total       time test name
[✓]    10000     0    0 10000 / 10000 3302.4s STM int64 CList with cpu_relax
[✓]    10000     0    0 10000 / 10000 1970.2s STM int64 CList with semaphore
================================================================================
success (ran 2 tests)
relax : 36868 / 10000
semap : 8 / 10000
```
Ah, that is indeed quite a difference! :open_mouth:

I'm surprised by the number 36868 though! Because of the way `Util.repeat` is implemented, it should stop early on the first failed property. I would thus expect it to increment the counter at most once for each of the 25 repetitions and hence reach at most 10000. :thinking:
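The early-exit behaviour described here can be sketched as a small combinator: repeat the property up to `n` times, but return `false` (and stop) as soon as one iteration fails. This is only a minimal sketch consistent with the description above; the actual `Util.repeat` implementation in multicoretests may differ.

```ocaml
(* Sketch of an early-exiting repeat combinator: [repeat n prop input]
   runs [prop input] up to [n] times and stops on the first failure,
   so a failing property is only evaluated once per call. *)
let repeat n prop input =
  let rec go i = i > n || (prop input && go (i + 1)) in
  go 1
```

With this semantics a failing test run evaluates the property once, which is why a counter incremented per evaluation should stay at or below the number of generated tests.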
There's one remaining usage of `cpu_relax` in spinning the first domain while waiting for the second domain to start up: https://github.com/jmid/multicoretests/blob/8a9a2327e06036f06ca5ef4b1321129ccff557d6/lib/lin.ml#L122-L124

Now that we have statistics in place, it would be natural to give this `Domain` setup a run-down to see what aspects actually influence the bug-finding ability, similar to what I did for `Thread` recently: https://github.com/jmid/multicoretests/blob/8a9a2327e06036f06ca5ef4b1321129ccff557d6/src/statistics/README.md?plain=1#L129-L143

For `Thread` a wait loop had a significant effect. For `Domain` it would be nice to confirm this - and also to investigate whether there could be better ways to accomplish it. In the tests for the work-stealing deque that has now been pulled out of `domainslib`, the spinning did not work at all to trigger issues on MacOSX, so I ended up going with a binary semaphore: https://github.com/jmid/multicoretests/blob/8a9a2327e06036f06ca5ef4b1321129ccff557d6/src/domainslib/ws_deque_test.ml#L131-L133

The simpler, the better. A combination of a `Mutex` and a `Condition` variable may also be sufficient.

Originally posted by @jmid in https://github.com/jmid/multicoretests/issues/43#issuecomment-1099991569
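For reference, a minimal sketch of the `Mutex` + `Condition` alternative mentioned above, using only the OCaml stdlib. The name `run_with_condvar` and the overall shape are hypothetical, not code from the repository.

```ocaml
(* Sketch: signal domain start-up with a Mutex and a Condition variable.
   The waiting domain blocks instead of spinning; the [started] flag
   guards against spurious wakeups, as the Condition API requires. *)
let run_with_condvar work1 work2 =
  let m = Mutex.create () in
  let c = Condition.create () in
  let started = ref false in
  let d =
    Domain.spawn (fun () ->
        Mutex.lock m;
        started := true;
        Condition.signal c;                    (* wake the waiting domain *)
        Mutex.unlock m;
        work2 ())
  in
  Mutex.lock m;
  while not !started do Condition.wait c m done;
  Mutex.unlock m;
  work1 ();
  Domain.join d
```

Like the binary semaphore, this blocks rather than spins, so it would presumably behave closer to the `Semaphore` numbers above than to the `cpu_relax` ones.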