NorfairKing / sydtest

A modern testing framework for Haskell with good defaults and advanced testing features.
113 stars 25 forks source link

Sydtest exception handling: crash / lock / ... #80

Open guibou opened 1 month ago

guibou commented 1 month ago

The following test suite:

  describe "behavior with exceptions" $ do
      beforeAll (fail "error in beforeAll") $ do
        it "should not kill the test suit" $ do
          1 `shouldBe` 1

Will crash the complete test suite:

sydtest-test: user error (error in beforeAll)

Replace beforeAll by before and you'll get:

Tests:

Test.Syd.AroundSpec
  behavior with exceptions
    before_
      ✗ should not kill the test suit                                      0.04 ms
        Retries: 3 (does not look flaky)

Failures:

    test/Test/Syd/AroundSpec.hs:139
  ✗ 1 Test.Syd.AroundSpec.behavior with exceptions.before_.should not kill the test suit
      Retries: 3 (does not look flaky)
      user error (error in beforeAll)

  Examples:                     3
  Passed:                       0
  Failed:                       1
  Sum of test runtimes:         0.00 seconds
  Test suite took:              0.00 seconds

Using the following changes:

diff --git a/sydtest/src/Test/Syd/Runner/Asynchronous.hs b/sydtest/src/Test/Syd/Runner/Asynchronous.hs
index 413c354..4a9c436 100644
--- a/sydtest/src/Test/Syd/Runner/Asynchronous.hs
+++ b/sydtest/src/Test/Syd/Runner/Asynchronous.hs
@@ -37,6 +37,7 @@ import Test.Syd.Runner.Single
 import Test.Syd.SpecDef
 import Test.Syd.SpecForest
 import Text.Colour
+import Debug.Trace (traceShow)

 runSpecForestAsynchronously :: Settings -> Word -> TestForest '[] () -> IO ResultForest
 runSpecForestAsynchronously settings nbThreads testForest = do
@@ -236,15 +237,21 @@ runner settings nbThreads failFastVar handleForest = do
           DefSetupNode func sdf -> do
             liftIO func
             goForest sdf
-          DefBeforeAllNode func sdf -> do
-            b <- liftIO func
+          DefBeforeAllNode func sdf -> traceShow "Asynchronous" $ do
+            bM <- liftIO $ try $ func
+            let b = case bM of 
+                       Right b -> b
+                       Left (e :: SomeException) -> error (show e)
             withReaderT
               (\e -> e {eExternalResources = HCons b (eExternalResources e)})
               (goForest sdf)
           DefBeforeAllWithNode func sdf -> do
             e <- ask
             let HCons x _ = eExternalResources e
-            b <- liftIO $ func x
+            bM <- liftIO $ try $ func x
+            let b = case bM of 
+                       Right b -> b
+                       Left (e :: SomeException) -> error (show e)
             liftIO $
               runReaderT
                 (goForest sdf)

The following test suites now pass:

  describe "behavior with exceptions" $ do
    describe "beforeAll" $ do
      beforeAll (fail "error in beforeAll") $ do
        it "should not kill the test suit" $ do
          1 `shouldBe` 1
      beforeAll (fail "error in beforeAll") $ do
        itWithOuter "should not kill the test suit" $ \outer ->
          outer `shouldBe` 1
        beforeAllWith (\x -> fail "error in beforeAllWith") $ do
          itWithOuter "should not kill the test suit" $ \outer ->
             outer `shouldBe` 1

With the following result:

Test.Syd.AroundSpec
  behavior with exceptions
    beforeAll
      ✓ should not kill the test suit                                      0.00 ms
      ✗ should not kill the test suit                                      0.02 ms
        Retries: 3 (does not look flaky)
      ✗ should not kill the test suit                                      0.00 ms
        Retries: 3 (does not look flaky)

Failures:

    test/Test/Syd/AroundSpec.hs:142
  ✗ 1 Test.Syd.AroundSpec.behavior with exceptions.beforeAll.should not kill the test suit
      Retries: 3 (does not look flaky)
      user error (error in beforeAll)
      CallStack (from HasCallStack):
        error, called at src/Test/Syd/Runner/Asynchronous.hs:244:53 in sydtest-0.15.1.1-inplace:Test.Syd.Runner.Asynchronous

    test/Test/Syd/AroundSpec.hs:145
  ✗ 2 Test.Syd.AroundSpec.behavior with exceptions.beforeAll.should not kill the test suit
      Retries: 3 (does not look flaky)
      user error (error in beforeAllWith)
      CallStack (from HasCallStack):
        error, called at src/Test/Syd/Runner/Asynchronous.hs:254:53 in sydtest-0.15.1.1-inplace:Test.Syd.Runner.Asynchronous

  Examples:                     7
  Passed:                       1
  Failed:                       2
  Sum of test runtimes:         0.00 seconds
  Test suite took:              0.00 seconds

A few notes:

    modified:   src/Test/Syd/Runner/Asynchronous.hs
    modified:   src/Test/Syd/Runner/Synchronous/Interleaved.hs
    modified:   src/Test/Syd/Runner/Synchronous/Separate.hs

So my question is:

Thank you.

NorfairKing commented 1 month ago

Wow thank you for the detailed report!

So my question is:

  • Are you interested by an MR doing this change on the three runners
  • Or is this completly stupid and I miss the obvious point.

I think you got that exactly right. I haven't used modifiers that crash much so I hadn't run into this.

Yes, PR please.

guibou commented 3 weeks ago

Additionally, I observed the following behaviors related to exceptions:

E.g. the folliwng testsuite:

import Test.Syd
import Control.Concurrent
import Control.Exception

main = sydTest $ do
    it "is slow" $ do
      print "is slow"
      e <- try $ threadDelay 10_000_000
      print "end of delay"
      print (e :: Either SomeException ())
      case e of
        Left e -> throwIO e
        Right () -> pure ()
      pure ()

The test is supposed to wait for 10s. If C-c is sent once, it just restart the test:

./SydException +RTS -N         
Tests:"is slow"

^C"end of delay"
Left AsyncCancelled
"is slow"
"end of delay"
Right ()

However, if ctrl-c is sent twice, the runtime terminates the program without taking any exception handler into account:

./SydException +RTS -N
"is slow"
Tests:

^C"end of delay"
Left AsyncCancelled
"is slow"
^C

(That's a documented GHC rts behavior, but I'm unable to find the documentation).

I suppose that sydtest is catching ALL possible exception and retries. From my PoV, it should ignore the asyrc ones and terminate the test runner immediatly.

Note that it is surprising that if the test suite survives one retry, it does not display the final results:

./SydException +RTS -N         
"is slow"
Tests:

^C"end of delay"
Left AsyncCancelled
"is slow"
<end of command, no feedback>
NorfairKing commented 3 weeks ago

Wow cool finds! The retry logic was added after the async exception handling so I'm not surprised it doesn't work well together.

guibou commented 3 weeks ago

I found the problem with shouldBe.

import Test.Syd

data Cheval = Cheval String deriving (Eq)

instance Show Cheval where
  show x = error "NO"

main = sydTest $ do
  it "fails" $ do
    Cheval "troll" `shouldBe` Cheval "troll2"

Here, the shouldBe will fail because values are different, but the Show instance will raises an exception and crash the testsuite during the final report:

./SydException +RTS -N         
Tests:

✗ fails                                                                    0.13 ms
  Retries: 3 (does not look flaky)

Failures:

    SydException.hs:13
  ✗ 1 fails
      Retries: 3 (does not look flaky)
SydException: NO
CallStack (from HasCallStack):
  error, called at SydException.hs:10:12 in main:Main

(No final report).

guibou commented 3 weeks ago

So, my plans in a MR about exceptions will be to:

a) Add an exception handler when sydtest displays the result of the tests and uses values from the "user side", which may contain hidden exceptions b) Add lazyness on before so exception will be hidden inside the resource and will be raised during usage in the different tests b') Maybe investigate another implementation which just disable the complete subtree of test if a before fails. Maybe cleaner than storing exception in a pure value. c) check the different exception handler (especially the ones which recover, such as try) for correct async behavior.

guibou commented 3 weeks ago

I've wrote the following test suite:

{-# LANGUAGE NumericUnderscores #-}
module Test.Syd.ExceptionSpec where

import Test.Syd
import Test.Syd.OptParse (defaultSettings)
import Data.Time.Clock
import Control.Concurrent (threadDelay)
import Control.Concurrent.Async

runAndCheckExceptions spec = do
  resultForest <- sydTestResult defaultSettings spec
  pure resultForest

data Cheval = Cheval String deriving (Eq)

instance Show Cheval where
  show x = error "NO"

spec = describe "exception handling" $ do
  it "works with exception in before" $ do
    res <- runAndCheckExceptions $ describe "behavior with exceptions" $ do
      -- This error in before all should appears as an error in the final
      -- report and not a crash of the test suite.
      beforeAll (fail "error in before all") $ do
        it "should not kill the test suit" $ do
          1 `shouldBe` 1

    pure ()
  it "works with exception in shouldBe" $ do
    res <- runAndCheckExceptions $ do
      describe "behavior with exceptions" $ do
        it "should not kill the test suit" $ do
          -- This should be won't raise the exception during comparison
          -- (because `==` is fine). However, the `Show` instance for `Cheval`
          -- is broken.
          --
          -- sydtest will try to display the actual/expected diff using the
          -- show instance, and should not crash.
          Cheval "troll" `shouldBe` Cheval "troll2"

    pure ()

  it "stops immediatly with async" $ do
    -- Tests that when an async exception is sent to sydTest (for example,
    -- ctrl-c), it behaves as expected and terminates immediately and does not
    -- do anything surprising.
    startTime <- liftIO $ getCurrentTime

    -- Runs two threads, one will be done in 1s, and the other is a test suite
    -- with one test which should run for 10s.
    -- When the first threads terminate, it will throw AsyncCancel in the
    -- sydTest test suite, which should terminate asap.
    -- This will be checked using the timer t. That's fragile, but we know that:
    --
    -- t should be more than 1s (because of the first threadDelay)
    -- t should be no much more than 1s. Especially, it should not be 10s
    -- (waiting for the complete threadDelay in the test suite) or even more if
    -- the exception is completly ignored (or test is retried)
    race (threadDelay 1_000_000) $ do
      sydTest $ do
          it "is slow" $ do
            threadDelay 10_000_000

    endTime <- liftIO $ getCurrentTime

    liftIO $ print $ startTime `diffUTCTime` endTime
    pure ()

Which shows the problems and should hopefully be fine once the problems are fixed.

Note that I also observed that:

However, I'm surprised, because tests are run in applyWrapper2, which uses catches with the following handlers:

exceptionHandlers :: [Handler (Either SomeException a)]
exceptionHandlers =
  [ -- Re-throw AsyncException, otherwise execution will not terminate on SIGINT (ctrl-c).
    Handler (\e -> throwIO (e :: AsyncException)),
    -- Catch all the rest
    Handler (\e -> return $ Left (e :: SomeException))
  ]

So as a first observation, it seems that the problem was already anticipated and taken care of. I need to read the precise spec of catches now to understand what is happening.

guibou commented 3 weeks ago

@NorfairKing

AsyncException is a class of exception which does NOT contain all asynchroneous exception. We need to use SomeAsyncException (or a wrapper library, such as https://hackage.haskell.org/package/safe-exceptions, which takes care of that for us)

This fix is trivial, AsyncException -> SomeAsyncException. MR pending.

NorfairKing commented 3 weeks ago

@guibou I'm impressed that you're run into so many of these issues already. I'm open to any improvements, especially when you have such a nice test suite along with them.