In https://github.com/jberryman/unagi-chan I have some quite paranoid and specific tests for properties I care about in atomic-primops. I've come back to the library after a while and find that one of those tests failed on CI (which I just set up).
-- Test these assumptions:
-- 1) If a CAS fails in thread 1 then another CAS (in thread 2, say) succeeded; i.e. no false negatives
-- 2) In the case that thread 1's CAS failed, the ticket returned with (False,tk) will contain that newly-written value from thread 2
testNextistentSuccessFailure :: IO ()
testNextistentSuccessFailure = do
var <- newIORef "0"
sem <- newIORef (0::Int)
outs <- replicateM 2 newEmptyMVar
_ <- forkSync sem 2 $ test "a" var (outs!!0)
_ <- forkSync sem 2 $ test "b" var (outs!!1)
mapM takeMVar outs >>= examine
-- w/r/t (2) above: we only try to find an element read along with False
-- which wasn't sent by another thread, which isn't ideal
where attempts = 100000
test tag var out = do
res <- forM [(1::Int)..attempts] $ \x-> do
let str = (tag++(show x))
tk <- readForCAS var
(b,tk') <- casIORef var tk str
return (if b then str else peekTicket tk' , b)
putMVar out res
examine [res1, res2] = do
-- any failures in either should be marked as successes in the other
let (successes1,failures1) = (\(x,y)-> (Set.fromList $ map fst x, map fst y)) $ partition snd res1
(successes2,failures2) = (\(x,y)-> (Set.fromList $ map fst x, map fst y)) $ partition snd res2
ok1 = all (flip Set.member successes2) failures1
ok2 = all (flip Set.member successes1) failures2
if ok1 && ok2
then when (length failures1 < (attempts `div` 6) || length failures2 < (attempts `div` 6)) $
error "There was not enough contention to trust test. Please retry."
else do print res1
print res2
error "FAILURE!"
examine _ = error "Fix testNextistentSuccessFailure"
The idea is to let two threads race, record what happens, and make sure the history comports with our stated properties in the comment above.
changing attempts to 1,000,000 and running repeatedly until failure I can reliably produce an error in ghc 8.6.3 in a few seconds. Here's an example history from one such run:
(27476,("a27936",False))
(27477,("b27477",True))
(27478,("b27478",True))
(27479,("b27479",True))
(28246,("a28246",True))
(28247,("a28247",True))
(27480,("b27480",True))
(28248,("a28248",True))
(27481,("b27481",True))
? (28249,("a28248",False))
(27482,("b27482",True))
(28250,("a28250",True))
(27483,("a28251",False)) x (28251,("a28251",True))
(28252,("a28252",True))
(27484,("b27484",True)) x (28253,("b27484",False))
(27485,("a28254",False)) x (28254,("a28254",True))
Each column is a separate thread, time moving down, and the tuples mean: (number_payload_attempted, (value_written_if_success_else_val_from_ticket_returned, succeeded?)).
The xs between the columns indicate a couple races (we're trying to generate these). The ? is the problem: it seems to show that with no contention, the right thread performed a CAS which succeeded and then a CAS which failed unexpectedly (according to the returned ticket, the previous value was the one we'd just written).
I did some testing with atomic-primops 0.8.3 and got failures on GHC versions:
8.6.3 (fails in under a minute)
8.2.0.20170507 (first failure took less than a minute, but initially ran for 15 min without failure...)
8.4.3 (failure after 1:30)
GHC 8.0.1 never exhibited the failure, even after running for 2 hours and 300 million test cases. So this may be a regression of some kind.
This may be the same as https://github.com/rrnewton/haskell-lockfree/issues/69 but I decided it would be best to report a new bug since my test case is different.
In https://github.com/jberryman/unagi-chan I have some quite paranoid and specific tests for properties I care about in atomic-primops. I've come back to the library after a while and find that one of those tests failed on CI (which I just set up).
It's here and looks like:
The idea is to let two threads race, record what happens, and make sure the history comports with our stated properties in the comment above.
changing attempts to
1,000,000
and running repeatedly until failure I can reliably produce an error in ghc 8.6.3 in a few seconds. Here's an example history from one such run:Each column is a separate thread, time moving down, and the tuples mean:
(number_payload_attempted, (value_written_if_success_else_val_from_ticket_returned, succeeded?))
.The
x
s between the columns indicate a couple races (we're trying to generate these). The?
is the problem: it seems to show that with no contention, the right thread performed a CAS which succeeded and then a CAS which failed unexpectedly (according to the returned ticket, the previous value was the one we'd just written).I did some testing with atomic-primops 0.8.3 and got failures on GHC versions:
GHC 8.0.1 never exhibited the failure, even after running for 2 hours and 300 million test cases. So this may be a regression of some kind.