input-output-hk / io-sim

Haskell's IO simulator which closely follows core packages (base, async, stm).
https://hackage.haskell.org/package/io-sim
Apache License 2.0
37 stars 15 forks source link

Add `threadStatus` to IOSim and IOSimPOR #19

Closed MaximilianAlgehed closed 2 years ago

MaximilianAlgehed commented 2 years ago

This PR introduces an implementation of threadStatus for IOSim. It's currently a bit fast and loose on the "blocked reason" for threads (e.g. reading MVars etc.) but should implement the rest of the functionality faithfully.

This fixes https://github.com/input-output-hk/io-sim/issues/12.

MaximilianAlgehed commented 2 years ago

@coot do you perhaps have time to review this?

MaximilianAlgehed commented 2 years ago

@coot do you think we should try to address the BlockReason issues in this PR or do we postpone that for later?

coot commented 2 years ago

BlockReason could be addressed later; Please leave a comment which links to an issue.

MaximilianAlgehed commented 2 years ago

Issue mentioning BlockReason https://github.com/input-output-hk/io-sim/issues/21.

MaximilianAlgehed commented 2 years ago

@coot I think this should be about ready now.

MaximilianAlgehed commented 2 years ago

@dcoutts did you check the tests? On my machine the tests pass - indicating that our hypothesis about the difference between synch and asynch exceptions is true at least some of the time. See specifically the tests prop_thread_status_died and prop_thread_status_died_own (which has a bad name I admit). Could you check out the tests and run them on your machine to see if they match?

MaximilianAlgehed commented 2 years ago

And to be clear, we were very surprised by the semantics we observed!

MaximilianAlgehed commented 2 years ago

@coot or @dcoutts do you mind running the CI here so we can get things to turn green? (I like green, it's a pretty colour)

coot commented 2 years ago

I don't see how to make CI run, there's no box which github docs mentions :disappointed:.

MaximilianAlgehed commented 2 years ago

@dcoutts could you perhaps push the right buttons on this so we can get it merged?

coot commented 2 years ago

@dcoutts could you perhaps push the right buttons on this so we can get it merged?

I asked similar question today.

Could you rebase the branch on top of origin/main: even if we run the workflows now they will fail without rebase (I checked that locally).

coot commented 2 years ago

@MaximilianAlgehed I pushed PR #27 with your changes (rebased), unfortunately two tests failed:

thread status died_own (IO):             FAIL
 *** Failed! Falsified (after 1 test):
 ThreadFinished /= ThreadBlocked BlockedOnMVar
 Use --quickcheck-replay=392463 to reproduce.
 Use -p '/thread status died_own (IO)/' to rerun this test only.

thread status mask (IO):                 FAIL
 *** Failed! Falsified (after 1 test):
 ThreadFinished /= ThreadBlocked BlockedOnMVar
 Use --quickcheck-replay=499493 to reproduce.
 Use -p '/thread status mask (IO)/' to rerun this test only.
MaximilianAlgehed commented 2 years ago

@coot that's a bit worrying - they pass locally. Are the tests run in parallel in CI? That might affect scheduling (these tests are understandably a bit sensitive to scheduling).

coot commented 2 years ago

The problem is only on Windows, where I can reproduce it locally too.

coot commented 2 years ago

Since threadStatus is not well specified, I think it's ok if we just disable these tests on Windows.

MaximilianAlgehed commented 2 years ago

@coot That's fine with me. I'll make an issue that mentions this so I have something to come back to after the next spate of deadlines.

MaximilianAlgehed commented 2 years ago

@coot I've made an issue here https://github.com/input-output-hk/io-sim/issues/28 and it's weirder than I thought! I finally got the time to look carefully at this and as I say in the issue the behaviour on Windows is probably just plain wrong! We don't do any operations that should put the thread that we ask threadStatus of in that state at any point in the execution!

coot commented 2 years ago

Thanks, I disabled the tests in the other PR. Once CI will finish I'll merge it.

coot commented 2 years ago

Merged via #27.

dcoutts commented 2 years ago

@MaximilianAlgehed I'm still going to claim that the threadStatus behaviour is not well defined, and I'm sceptical that we should go to great efforts to support it. I don't mind supporting it, but it's not at all clear that there is any sync vs async exception behaviour difference, and it's the tracking of that distinction that makes this patch particularly invasive. Without that, all the info would be available already I think.

So yes I did try running your two tests prop_thread_status_died and prop_thread_status_died_own in IO and yes they behave as you say. However, if one make a trivial modification:

prop_two_threads_expect target main prop = do
  tid <- forkIO target
  yield                    --  <-- insert this extra yield here
  main tid
  status <- threadStatus tid
  return $ prop status

we would expect that this should make no difference to the semantics right?

Yet it does:

*Test.IOSim> quickCheck $ ioProperty prop_thread_status_died_own 
<interactive>: divide by zero
+++ OK, passed 1 test.
*Test.IOSim> quickCheck $ ioProperty prop_thread_status_died
<interactive>: divide by zero
*** Failed! Falsified (after 1 test):  
ThreadDied /= ThreadFinished

Now both examples return ThreadFinished whereas before the throwTo case would return ThreadDied.

This can also be seen in an even simpler standalone test:

main = do

    tid <- forkIO (forever yield)
    throwTo tid DivideByZero
    yield
    print =<< threadStatus tid

    tid' <- forkIO (forever yield)
    throwTo tid' DivideByZero
    yield
    print =<< threadStatus tid'

with the result

$ ./ThreadStatus
ThreadStatus: divide by zero
ThreadFinished
ThreadDied

This example makes it somewhat clearer what is going on: ghc seems to behave differently when sending an async exception if the thread has never yet run, vs when it has already started running. The example above of running the same thing twice relies on the vagaries of scheduling, whereas inserting an explicit yield forces it one way or another.

So overall, it's not clear that (at least when the victim thread is already running) that ghc makes any distinction between async and sync exceptions. It's just that the async case lets one distinguish between victim threads that have not yet started running, and those that are already running, which for some reason ghc seems to report differently.

So I'd argue that we should back out the complexities of the tracking of sync vs async exceptions for the purpose of thread status reporting, and just do a simpler impl of threadStatus where we never use threadDied and all terminated threads (irrespective of whether they terminated by exception or return) report ThreadFinished.

MaximilianAlgehed commented 2 years ago

@dcoutts I agree with your assessment. However, I don't know when I can get to this but I will try to make it my top priority to clean this patch up next week. Does that sound reasonable?

dcoutts commented 2 years ago

I've filed a GHC issue: https://gitlab.haskell.org/ghc/ghc/-/issues/22279

coot commented 2 years ago

@dcoutts I created an issue to track this, let's close this PR as it's already merged via #27.