barrucadu / dejafu

Systematic concurrency testing meets Haskell.
https://dejafu.docs.barrucadu.co.uk/
191 stars 18 forks source link

Testcase with `STM` and `Async` not failing as expected #383

Open NicolasT opened 1 year ago

NicolasT commented 1 year ago

While trying some things using DejaFu, I ran into a case where a test-case gives (intentionally) inconsistent results, as witnessed when running the test in IO, however, according to testAuto its results are consistent. Only when I add some extra code to align the stars, the inconsistent results are also reported by DejaFu.

Am I doing something wrong, or is there a bug somewhere?

The case boils down to the following:

runComputation :: (MonadConc m) => m a -> (a -> m ()) -> (a -> m ()) -> m Int
runComputation mkLock unLock waitLock = do
  v <- newTVarConc 0

  l <- mkLock

  let act n = do
        waitLock l
        atomically $ writeTVar v n

  withAsync (act 1) $ \a ->
    withAsync (act 2) $ \b -> do
      unLock l
      wait a
      wait b
      readTVarConc v

I.e., run two transactions atomically over some TVar within two distinct threads, each putting another value in the TVar, wait for the threads to finish, and read the value from the TVar.

First, assume mkLock, unLock and waitLock are pure (), pure and pure no-ops. when running this in IO, either 1 or 2 is returned, which seems correct. Indeed, as a unit-test, running this 1000 times has a fairly high probability of distinct values being returned (using the threaded RTS and -N2). When testing with DejaFu, however, no deadlocks or exceptions are detected (good!), but the Consistent Result test succeeds as well, which is unexpected.

If instead we use newEmptyMVar, flip putMvar () and readMVar as mkLock, unLock and waitLock, then running the computation in IO multiple times still yields inconsistent results, and now DejaFu also detects this:

dejafu-test
  Without MVar
    IO:                  FAIL
      Main.hs:26:
      expected: 1
       but got: 2
      Use -p '/Without MVar.IO/' to rerun this test only.
    DejaFu
      Never Deadlocks:   OK
      No Exceptions:     OK
      Consistent Result: OK
  With MVar
    IO:                  FAIL
      Main.hs:35:
      expected: 1
       but got: 2
      Use -p '/With MVar.IO/' to rerun this test only.
    DejaFu
      Never Deadlocks:   OK (0.03s)
      No Exceptions:     OK (0.03s)
      Consistent Result: FAIL (0.07s)
        Failed:
                2 S0----------S1----------S0--S2----------S0---------------

                1 S0----------S1-P2----------S1---------S0----------------

        Use -p '/With MVar.DejaFu.Consistent Result/' to rerun this test only.

3 out of 8 tests failed (0.10s)

I created a Gist with the full test-suite (and Cabal file etc) to share the complete code, and make it easier to reproduce: https://gist.github.com/NicolasT/d2f456112221679ecac2680685627027

Maybe I'm misunderstanding what DejaFu is supposed to do here, and it's acting as expected, but it's non-obvious to me :smiley:

barrucadu commented 1 year ago

Hmm, that's definitely weird. I've managed to reduce it to:

runComputation5 :: (MonadConc m) => m Int
runComputation5 = do
    v <- atomically $ newTVar 0

    -- using an MVar for `blockA` and `blockB` fixes the problem
    blockA <- atomically newEmptyTMVar
    blockB <- atomically newEmptyTMVar

    mask $ \restore -> fork $ act restore v blockA 1
    mask $ \restore -> fork $ act restore v blockB 2

    -- swapping these two lines, or removing either of them, fixes the problem
    atomically $ readTMVar blockA
    atomically $ readTMVar blockB

    atomically $ readTVar v
  where
    act restore var block n = do
      -- removing the `try` fixes the problem
      -- removing the `restore` fixes the problem
      e <- try (restore (atomically $ writeTVar var n))
      atomically $ putTMVar block (e :: Either SomeException ())

So it seems to be some interaction between STM and exceptions (even though no exceptions are thrown in this code)

NicolasT commented 1 year ago

I wanted to reduce my test-case to something "smaller" but didn't get to it yet. Indeed, as your test shows, replacing async by what it basically does behind the scenes also exposes the problem.

Glad to hear this is not caused by a misunderstanding of DejaFu on my end :smile: