Sydtest exception handling: crash / lock / ...

The following test suite:

  describe "behavior with exceptions" $ do
      beforeAll (fail "error in beforeAll") $ do
        it "should not kill the test suit" $ do
          1 `shouldBe` 1

Will crash the complete test suite:

sydtest-test: user error (error in beforeAll)

Replace beforeAll by before and you'll get:

Tests:

Test.Syd.AroundSpec
  behavior with exceptions
    before_
      ✗ should not kill the test suit                                      0.04 ms
        Retries: 3 (does not look flaky)

Failures:

    test/Test/Syd/AroundSpec.hs:139
  ✗ 1 Test.Syd.AroundSpec.behavior with exceptions.before_.should not kill the test suit
      Retries: 3 (does not look flaky)
      user error (error in beforeAll)

  Examples:                     3
  Passed:                       0
  Failed:                       1
  Sum of test runtimes:         0.00 seconds
  Test suite took:              0.00 seconds

Using the following changes:

diff --git a/sydtest/src/Test/Syd/Runner/Asynchronous.hs b/sydtest/src/Test/Syd/Runner/Asynchronous.hs
index 413c354..4a9c436 100644
--- a/sydtest/src/Test/Syd/Runner/Asynchronous.hs
+++ b/sydtest/src/Test/Syd/Runner/Asynchronous.hs
@@ -37,6 +37,7 @@ import Test.Syd.Runner.Single
 import Test.Syd.SpecDef
 import Test.Syd.SpecForest
 import Text.Colour
+import Debug.Trace (traceShow)

 runSpecForestAsynchronously :: Settings -> Word -> TestForest '[] () -> IO ResultForest
 runSpecForestAsynchronously settings nbThreads testForest = do
@@ -236,15 +237,21 @@ runner settings nbThreads failFastVar handleForest = do
           DefSetupNode func sdf -> do
             liftIO func
             goForest sdf
-          DefBeforeAllNode func sdf -> do
-            b <- liftIO func
+          DefBeforeAllNode func sdf -> traceShow "Asynchronous" $ do
+            bM <- liftIO $ try $ func
+            let b = case bM of 
+                       Right b -> b
+                       Left (e :: SomeException) -> error (show e)
             withReaderT
               (\e -> e {eExternalResources = HCons b (eExternalResources e)})
               (goForest sdf)
           DefBeforeAllWithNode func sdf -> do
             e <- ask
             let HCons x _ = eExternalResources e
-            b <- liftIO $ func x
+            bM <- liftIO $ try $ func x
+            let b = case bM of 
+                       Right b -> b
+                       Left (e :: SomeException) -> error (show e)
             liftIO $
               runReaderT
                 (goForest sdf)

The following test suites now pass:

  describe "behavior with exceptions" $ do
    describe "beforeAll" $ do
      beforeAll (fail "error in beforeAll") $ do
        it "should not kill the test suit" $ do
          1 `shouldBe` 1
      beforeAll (fail "error in beforeAll") $ do
        itWithOuter "should not kill the test suit" $ \outer ->
          outer `shouldBe` 1
        beforeAllWith (\x -> fail "error in beforeAllWith") $ do
          itWithOuter "should not kill the test suit" $ \outer ->
             outer `shouldBe` 1

With the following result:

Test.Syd.AroundSpec
  behavior with exceptions
    beforeAll
      ✓ should not kill the test suit                                      0.00 ms
      ✗ should not kill the test suit                                      0.02 ms
        Retries: 3 (does not look flaky)
      ✗ should not kill the test suit                                      0.00 ms
        Retries: 3 (does not look flaky)

Failures:

    test/Test/Syd/AroundSpec.hs:142
  ✗ 1 Test.Syd.AroundSpec.behavior with exceptions.beforeAll.should not kill the test suit
      Retries: 3 (does not look flaky)
      user error (error in beforeAll)
      CallStack (from HasCallStack):
        error, called at src/Test/Syd/Runner/Asynchronous.hs:244:53 in sydtest-0.15.1.1-inplace:Test.Syd.Runner.Asynchronous

    test/Test/Syd/AroundSpec.hs:145
  ✗ 2 Test.Syd.AroundSpec.behavior with exceptions.beforeAll.should not kill the test suit
      Retries: 3 (does not look flaky)
      user error (error in beforeAllWith)
      CallStack (from HasCallStack):
        error, called at src/Test/Syd/Runner/Asynchronous.hs:254:53 in sydtest-0.15.1.1-inplace:Test.Syd.Runner.Asynchronous

  Examples:                     7
  Passed:                       1
  Failed:                       2
  Sum of test runtimes:         0.00 seconds
  Test suite took:              0.00 seconds

A few notes:

My understanding of the behavior between before and beforeAll is that the code for before is executed during the it execution, when the code for beforeAll is executed during the tree walking.
I've found a few places where similar changes shoud have to be implemented:

    modified:   src/Test/Syd/Runner/Asynchronous.hs
    modified:   src/Test/Syd/Runner/Synchronous/Interleaved.hs
    modified:   src/Test/Syd/Runner/Synchronous/Separate.hs

The idea of the change is to sneak the exception lazyly in the ressource. Another option could be to, in case of exception, just walk the tree and report failures.
Async exception should be handled instead of SomeException for proper excption handling.

So my question is:

Are you interested by an MR doing this change on the three runners
Or is this completly stupid and I miss the obvious point.

Thank you.

Wow thank you for the detailed report!

So my question is:

Are you interested by an MR doing this change on the three runners

Or is this completly stupid and I miss the obvious point.

I think you got that exactly right. I haven't used modifiers that crash much so I hadn't run into this.

Yes, PR please.

Additionally, I observed the following behaviors related to exceptions:

If a pure exception is hidden inside a value compared with shouldBe (or equivalent), it will crash the test report. (e.g. shouldBe (error "cheval") 10 fails during the test, and when sydtest displays the final report, it crashs when it tries to display the diff between 10 and the error "cheval". (Note: now that I'm reporting that, I'm having issues reproducing it, I'll try to find a proper repro).
sydtest seems to ignore most async exception. Especially, ctrl-C during a test suite is not an instant feedbacks and instead it can retry canceled tests and then, finally, stops.

E.g. the folliwng testsuite:

import Test.Syd
import Control.Concurrent
import Control.Exception

main = sydTest $ do
    it "is slow" $ do
      print "is slow"
      e <- try $ threadDelay 10_000_000
      print "end of delay"
      print (e :: Either SomeException ())
      case e of
        Left e -> throwIO e
        Right () -> pure ()
      pure ()

The test is supposed to wait for 10s. If C-c is sent once, it just restart the test:

./SydException +RTS -N         
Tests:"is slow"

^C"end of delay"
Left AsyncCancelled
"is slow"
"end of delay"
Right ()

However, if ctrl-c is sent twice, the runtime terminates the program without taking any exception handler into account:

./SydException +RTS -N
"is slow"
Tests:

^C"end of delay"
Left AsyncCancelled
"is slow"
^C

(That's a documented GHC rts behavior, but I'm unable to find the documentation).

I suppose that sydtest is catching ALL possible exception and retries. From my PoV, it should ignore the asyrc ones and terminate the test runner immediatly.

Note that it is surprising that if the test suite survives one retry, it does not display the final results:

./SydException +RTS -N         
"is slow"
Tests:

^C"end of delay"
Left AsyncCancelled
"is slow"
<end of command, no feedback>

Wow cool finds! The retry logic was added after the async exception handling so I'm not surprised it doesn't work well together.

I found the problem with shouldBe.

import Test.Syd

data Cheval = Cheval String deriving (Eq)

instance Show Cheval where
  show x = error "NO"

main = sydTest $ do
  it "fails" $ do
    Cheval "troll" `shouldBe` Cheval "troll2"

Here, the shouldBe will fail because values are different, but the Show instance will raises an exception and crash the testsuite during the final report:

./SydException +RTS -N         
Tests:

✗ fails                                                                    0.13 ms
  Retries: 3 (does not look flaky)

Failures:

    SydException.hs:13
  ✗ 1 fails
      Retries: 3 (does not look flaky)
SydException: NO
CallStack (from HasCallStack):
  error, called at SydException.hs:10:12 in main:Main

(No final report).

So, my plans in a MR about exceptions will be to:

a) Add an exception handler when sydtest displays the result of the tests and uses values from the "user side", which may contain hidden exceptions b) Add lazyness on before so exception will be hidden inside the resource and will be raised during usage in the different tests b') Maybe investigate another implementation which just disable the complete subtree of test if a before fails. Maybe cleaner than storing exception in a pure value. c) check the different exception handler (especially the ones which recover, such as try) for correct async behavior.

I've wrote the following test suite:

{-# LANGUAGE NumericUnderscores #-}
module Test.Syd.ExceptionSpec where

import Test.Syd
import Test.Syd.OptParse (defaultSettings)
import Data.Time.Clock
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async

runAndCheckExceptions spec = do
  resultForest <- sydTestResult defaultSettings spec
  pure resultForest

data Cheval = Cheval String deriving (Eq)

instance Show Cheval where
  show x = error "NO"

spec = describe "exception handling" $ do
  it "works with exception in before" $ do
    res <- runAndCheckExceptions $ describe "behavior with exceptions" $ do
      -- This error in before all should appears as an error in the final
      -- report and not a crash of the test suite.
      beforeAll (fail "error in before all") $ do
        it "should not kill the test suit" $ do
          1 `shouldBe` 1

    pure ()
  it "works with exception in shouldBe" $ do
    res <- runAndCheckExceptions $ do
      describe "behavior with exceptions" $ do
        it "should not kill the test suit" $ do
          -- This should be won't raise the exception during comparison
          -- (because `==` is fine). However, the `Show` instance for `Cheval`
          -- is broken.
          --
          -- sydtest will try to display the actual/expected diff using the
          -- show instance, and should not crash.
          Cheval "troll" `shouldBe` Cheval "troll2"

    pure ()

  it "stops immediatly with async" $ do
    -- Tests that when an async exception is sent to sydTest (for example,
    -- ctrl-c), it behaves as expected and terminates immediately and does not
    -- do anything surprising.
    startTime <- liftIO $ getCurrentTime

    -- Runs two threads, one will be done in 1s, and the other is a test suite
    -- with one test which should run for 10s.
    -- When the first threads terminate, it will throw AsyncCancel in the
    -- sydTest test suite, which should terminate asap.
    -- This will be checked using the timer t. That's fragile, but we know that:
    --
    -- t should be more than 1s (because of the first threadDelay)
    -- t should be no much more than 1s. Especially, it should not be 10s
    -- (waiting for the complete threadDelay in the test suite) or even more if
    -- the exception is completly ignored (or test is retried)
    race (threadDelay 1_000_000) $ do
      sydTest $ do
          it "is slow" $ do
            threadDelay 10_000_000

    endTime <- liftIO $ getCurrentTime

    liftIO $ print $ startTime `diffUTCTime` endTime
    pure ()

Which shows the problems and should hopefully be fine once the problems are fixed.

Note that I also observed that:

async exception could also results in sydtest-test: thread blocked indefinitely in an MVar operation errors. My understanding is as follows: when an async exception is raised in the main thread, it stops the concurrencly call in Test.Syd.Runner.Asychronous, actually, concurrently raises the exception in both sub threads: the test runner and the print loop. The printLoop will simply die, and the exception will be rethrown in withJobQueueWorker which will be rethrown in the N workers. However, a worker is possibly currently running runNow on the test, and the async exception is stored in the result, so the test is retried. If a second ctrl-c is raised, the RTS kills everything. If not, the test will eventually terminate and the main runner should also terminate (but way slower, because it does only have a few remaining runners ;), and the test status won't appear anymore.

However, I'm surprised, because tests are run in applyWrapper2, which uses catches with the following handlers:

exceptionHandlers :: [Handler (Either SomeException a)]
exceptionHandlers =
  [ -- Re-throw AsyncException, otherwise execution will not terminate on SIGINT (ctrl-c).
    Handler (\e -> throwIO (e :: AsyncException)),
    -- Catch all the rest
    Handler (\e -> return $ Left (e :: SomeException))
  ]

So as a first observation, it seems that the problem was already anticipated and taken care of. I need to read the precise spec of catches now to understand what is happening.

I observed segmentation fault when designing the second test in the list (the one where Show raises an exception).

@NorfairKing

AsyncException is a class of exception which does NOT contain all asynchroneous exception. We need to use SomeAsyncException (or a wrapper library, such as https://hackage.haskell.org/package/safe-exceptions, which takes care of that for us)

This fix is trivial, AsyncException -> SomeAsyncException. MR pending.

@guibou I'm impressed that you're run into so many of these issues already. I'm open to any improvements, especially when you have such a nice test suite along with them.

NorfairKing / sydtest

Sydtest exception handling: crash / lock / ... #80