barrucadu / dejafu

Systematic concurrency testing meets Haskell.
https://dejafu.docs.barrucadu.co.uk/
191 stars 18 forks source link

Exception: (dejafu) trace exhausted without reading a to-do point! #323

Open ssadler opened 4 years ago

ssadler commented 4 years ago

I'm just taking my first steps with this library and I ran into a few things. The file linked below reproduces the errors as I'll explain. I'm just making one ticket because I'm not yet sure if there may be an error in dejafu or if I have a wrong assumption somewhere! Tried with versions 2.1.0.1 and 2.3.0.0 (btw the 2.3.0.0 tag isnt on github).

https://gist.github.com/ssadler/77c36e9f99f350c352971a5f4b818b18

  1. If you run the test program testInboundConnectionLimit it produces this error. I don't know what it means but it originates in the library hense this ticket.
Exception: (dejafu) trace exhausted without reading a to-do point!
CallStack (from HasCallStack):
  fatal, called at ./Test/DejaFu/SCT/Internal/DPOR.hs:186:19 in dejafu-2.1.0.1-BAWoaUiZaXH56LDkcqY2da:Test.DejaFu.SCT.Internal.DPOR
  incorporateTrace, called at ./Test/DejaFu/SCT.hs:137:25 in dejafu-2.1.0.1-BAWoaUiZaXH56LDkcqY2da:Test.DejaFu.SCT
  1. I can't seem to define testInboundConnectionLimit without importing ModelTVar from Test.DejaFu.Conc.Internal.STM. Maybe this should be re-exported by Test.DejaFu?

  2. If you replace throwM TooManyThreads with the error on line 85, you get the error nicely printed in the terminal. This could be the expected behaviour, currently I'm trying to figure out why that invariant is triggered though. If you switch the order of 71 and 72 so that threadDelay comes first, the test passes, and I don't understand why the threadDelay has any effect at all, since I figure the test would be exhaustive even without tryng to trigger a race condition. But I could certainly have one or more wrong assumptions.

Thanks

barrucadu commented 4 years ago

Thanks for opening an issue, it looks like you've found a bug.

trace exhausted without reading a to-do point!

This means that dejafu has got into a loop somehow, and is trying scheduling decisions which it has already examined. This shouldn't happen. And it seems to be something to do with invariants, because if I comment out the registerInvariant call, it doesn't happen.

I'm trying to figure out why that invariant is triggered though

I don't think the threadDelay should have any effect (dejafu should just treat it as a no-op). I suspect it's probably the same underlying problem as the invariant issue. Something about a failing invariant is getting dejafu confused.

But, in any case, you're not limiting the number of threads running concurrently at the moment. Both of your asyncs can run at the same time. That's why the invariant can fail.

barrucadu commented 4 years ago

I've managed to track it down to a smaller example:

import Test.DejaFu
import Test.DejaFu.Conc.Internal.STM

import Control.Concurrent.Classy hiding (wait)
import Control.Concurrent.Classy.Async

testInboundConnectionLimit :: Program (WithSetup (ModelTVar IO Int)) IO Int
testInboundConnectionLimit = withSetup setup $ \tvar -> do
    a <- async (act tvar)
    b <- async (act tvar)
    _ <- waitCatch a
    _ <- waitCatch b
    atomically $ readTVar tvar

  where
    setup = atomically $ newTVar 0

    act tvar = do
      atomically $ modifyTVar tvar (+1)
      threadDelay 1
      atomically $ modifyTVar tvar (subtract 1)

Gives:

> resultsSet defaultWay defaultMemType testInboundConnectionLimit
fromList [Right 0,Right 1]

Though the threadDelay is still needed. I'm confused about that.

ssadler commented 4 years ago

Great! I'm glad there actually is an issue and I wasn't just overlooking something.

But, in any case, you're not limiting the number of threads running concurrently at the moment.

I still can't get my head around that though. This may be unrelated to the issue, but, in my example file I'm calling cancel on line 37, which synchronously cancels the thread. With 2 threads, I've verified that cancel is being called (the second thread interrupting the first). This works as expected without the threadDelay on line 72, ie the invariant is never triggered and the tests pass.

Edit: Actually it doesn't work quite as expected. Without the threadDelay, if you traceShowM the contents of tvar, its always 1 (I would expect it to sometimes be 2, in your example), but with the threadDelay, it gets all the way up to 8?