Closed RenateEilers closed 7 months ago
Indeed, there is no way to test the final state of the system after the tests are run. One could find a way to do so in sequential properties by the following pseudocode:
prop_sequential = do
env <- initEnv
let sm = ... env
prettyCommands =<< runCommands sm
checkProperty env
For example one could include a tracer in the environment and check it after the commands are run. However, as made clear by #12 the only way to provide a clean isolated environment for each parallel test case is by initializing the state machine monadically using the WithSetup
variants. However this precisely means that such environment cannot escape the scope of the state machine.
For the sake of uniformity, this could be added to the state machine as follows:
sm :: m (StateMachine model cmd m resp)
sm = do
env <- initEnv
pure $ StateMachine {
...
semantics = semantics env
, finalProperty = checkEnv env
...
}
Then as @RenateEilers says, it should check the property after the execution of all commands, and give the result to prettyParallelCommands
which would then show whether either an interleaving was not explainable or/and the final property holds.
This ticket is again solving a problem specific to parallel properties where the solution could be defined for all StateMachines just for uniformity among parallel and sequential properties.
Any first impressions about this @stevana ? Should we go forward and make a PR?
Can you give a full example? (You'll need to write this for the testsuite eventually anyway).
I don't like how we keep extending the StateMachine
record to handle various corner cases, this seems to suggest that the API we got isn't flexible enough. Can you think of ways, perhaps inspired by the APIs of other state machine based testing libraries, which simplify the API (rather than complicate it) while solving your problem?
EDIT: after writing this I realized it still might be too verbose. If it is unclear please request further clarification.
A full example would be as follows (taken from here). In that code the relevant chunks is:
test cmds' = do
...
(tracer, getTrace) <- recordingTracerIORef
nodeDBs <- emptyNodeDBs
...
(hist, model, res, trace) <- bracket
(open args >>= newTVarIO)
-- Note: we might be closing a different ChainDB than the one we
-- opened, as we can reopen it the ChainDB, swapping the ChainDB in
-- the TVar.
(\varDB -> readTVarIO varDB >>= close)
$ \varDB -> do
let env = ChainDBEnv
{ varDB
, registry = iteratorRegistry
, varCurSlot
, varNextId
, varVolatileDbFs = nodeDBsVol nodeDBs
, args
}
sm' = sm env (genBlk chunkInfo) testCfg testInitExtLedger maxClockSkew
(hist, model, res) <- QSM.runCommands' (pure sm') cmds'
trace <- getTrace
return (hist, model, res, trace)
...
let
modelChain = Model.currentChain $ dbModel model
prop =
counterexample ("Model chain: " <> condense modelChain) $
counterexample ("TraceEvents: " <> unlines (map show trace)) $
tabulate "Chain length" [show (Chain.length modelChain)] $
tabulate "TraceEvents" (map traceEventName trace) $
res === Ok .&&.
prop_trace testCfg (dbModel model) trace .&&.
counterexample "ImmutableDB is leaking file handles"
(Mock.numOpenHandles (nodeDBsImm fses) === 0) .&&.
counterexample "VolatileDB is leaking file handles"
(Mock.numOpenHandles (nodeDBsVol fses) === 0) .&&.
counterexample "LedgerDB is leaking file handles"
(Mock.numOpenHandles (nodeDBsLgr fses) === 0) .&&.
counterexample "There were registered clean-up actions"
(remainingCleanups === 0)
return (hist, prop)
You can see there that we create a tracer from which we extract the final traces and assert stuff on what was recorded in that trace. Also for the different databases. This is currently undoable for parallel testing. The environment will have to be set up inside the state machine creation and this test should use the WithSetup
variants, to ensure that each parallel execution has a fresh environment as shown in #12.
We would like to be able to express the following:
sm env genBlock cfg initLedger maxClockSkew = do
(tracer, getTrace) <- recordingTracerIORef
nodeDBs <- emptyNodeDBs
pure $ StateMachine
{ initModel = ...
, transition = transition
, precondition = precondition
, postcondition = postcondition
, generator = ...
, shrinker = shrinker
, semantics = semantics env
, mock = mock
, invariant = ...
, cleanup = noCleanup
, finalProperty = \model -> do
let modelChain = Model.currentChain $ dbModel model
trace <- getTraces
fses <- atomically $ traverse readTVar nodeDBs
pure $
counterexample ("Model chain: " <> condense modelChain) $
counterexample ("TraceEvents: " <> unlines (map show trace)) $
tabulate "Chain length" [show (Chain.length modelChain)] $
tabulate "TraceEvents" (map traceEventName trace) $
res === Ok .&&.
prop_trace testCfg (dbModel model) trace .&&.
counterexample "ImmutableDB is leaking file handles"
(Mock.numOpenHandles (nodeDBsImm fses) === 0) .&&.
counterexample "VolatileDB is leaking file handles"
(Mock.numOpenHandles (nodeDBsVol fses) === 0) .&&.
counterexample "LedgerDB is leaking file handles"
(Mock.numOpenHandles (nodeDBsLgr fses) === 0) .&&.
counterexample "There were registered clean-up actions"
(remainingCleanups === 0)
}
Otherwise this test would be useless in a parallel testing scenario.
This is the best example we have so far. It is simply undoable because the environment lives only inside the state machine. A different approach would be to provide an env -> Property
function to the prettyParallelCommands
and then return the environment when creating the state machine, which is (I think) more convoluted than what we are proposing.
EDIT: after writing this I realized it still might be too verbose.
Yes, please make the example standalone (not dependent on anything else). What you posted doesn't make sense to include in the testsuite.
Ok, will try to elaborate a self contained example in which this problem can be seen 👌 (perhaps not today, will try to do it soon)
https://gist.github.com/jasagredo/f0f3fea4b14811fb0df7fdf8d8dac91c
This modified version of the Echo test shows the problem. If I add a tracer and I want to check for example that the number of traces emitted is equal to the number of times I issued a In
command, I cannot do that in the parallel case, only on the sequential case.
❯ cabal run ./Echo.hs
Tests
Sequential: OK
+++ OK, passed 100 tests.
Parallel: FAIL
*** Failed! Falsified (after 2 tests and 2 shrinks):
ParallelCommands
{ prefix =
Commands
{ unCommands =
[ Command
(In
"K|\ETX\DC2=\163759p\128577\&7<\39542-I\SO\1083040\US\146733\&82 \992441C^5\\\133004\1043070;t\DLE\166773\RS\1011177\v@!{{hr\ETX\1083918\173352\1091277\SOH4\168379\&6q:>abm\35413w=\f?\10828\ACKT&\DC3\DC2Fs\156907(+\1029686QC^\EMuO\190205")
InAck
[]
]
}
, suffixes = []
}
More traces than in commands: 10 vs 1
Use --quickcheck-replay=900590 to reproduce.
Use -p '/Tests.Parallel/' to rerun this test only.
NParallel: FAIL
*** Failed! Falsified (after 3 tests and 2 shrinks):
ParallelCommands
{ prefix =
Commands
{ unCommands =
[ Command (In "\b\1038060VLDK\1109244on?") InAck [] ]
}
, suffixes = []
}
More traces than in commands: 10 vs 1
Use --quickcheck-replay=311423 to reproduce.
Use -p '/NParallel/' to rerun this test only.
2 out of 3 tests failed (0.01s)
See the error message, 10 traces vs 1 In
command, because of the 10 repetitions in the parallel case.
I hope this one is more clear, if not don't hesitate on asking for further clarification.
We have been discussing alternative ways on doing this and why it seems they wouldn't work.
A possibility is to have a last special command and provide this information we want to check in the response for the command. This poses two problems:
There is no way to say that said command MUST be generated at the end of any sequence of commands.
Such command should be terminal, and as such won't be runnable in parallel. Which in turn means that no parallel group could have this command, and as the execution of parallel state machines consists of a sequential prefix plus one or more parallel groups, none of those could have this command. Therefore it could only be checked in the sequential prefix, which defeats the whole purpose of parallel checking.
It could be doable to make all responses emit this information, such that we check it in every postcondition. This has two problems again:
Perhaps the property does not make sense to be checked at every post
condition. In the Echo example it makes sense that we check every time that
the number of In
commands is the same as the currently recorded counter,
but in the example in our codebase we want the information of the environment
just for output on the counterexample.
Sometimes the information we want to provide to a property is just observable for making the counterexample easier to understand, as in the case of just showing which traces have happened during the execution. If we were to compare these observable values with the ones produced in the model, unless they are all atomic there will not be a way to produce "interleaved" traces in the model, as it is advanced via a linearization.
In general I think this option would be doable sometimes, depending only in the nature of what one wants to check/observe.
By having a separate concrete field that exposes a final property, we can know that this will definitely be called at the end of the execution and only once, which decreases the amount of useless command sequences generated and reduces the very fine tailoring one would have to do for threading these intermediate "interleaved" information in the responses.
We discussed the possibility of using references to transfer this information from command to command while advancing the model, but we are not that used to working with References and perhaps this is not even the intended way of using them.
Disclaimer: I have only slightly adapted existing Hedgehog tests, I am not a Hedgehog expert.
The only other library that seems to provide parallel state machine tests is Hedgehog. https://hackage.haskell.org/package/hedgehog-1.4/docs/Hedgehog.html#g:5. There is also quickcheck-dynamic but it doesn't have support for parallel state machines.
Hedgehog has some different features than QSM, in particular I can see:
So I'm not sure we can take many inspiration from Hedgehog.
The proposal is to have a field finalProperty :: PropertyM m ()
in the StateMachine such that we can call it after the execution of the state machine, in each repetition of the parallel sequence of commands.
Special terminal command
I haven't had time to look into this in detail yet, but why doesn't the following work: forAllCommands ... $ \cmds -> let cmds' = cmds ++ [Terminal]
?
(The same can be done for parallel commands: forAllParallelCommands ... $ \cmds -> let cmds' = cmds { suffixes = suffixes cmds ++ [Pair [] [Terminal]] }
.)
If we introduce the command that way it ought to work. The Command
datatype would then have a constructor that must not be generated and that pollutes a bunch of instances (like shrinking and so) with useless cases, but yeah... I guess that would be doable?
Fixed in #43.
PR #12 introduced
withSetup
variants to run commands on monadic state machines, so that environments could be isolated between different parallel executions of a state machine.I'd like to test properties concerning the state of the environment at the end of every run. To do so, I propose to add a field
finalPostcondition :: Property
(alternative naming suggestions are welcome) toStateMachine
. We can then return this property at the end of executing commands, e.g. the return type ofexecuteNParallelCommands
would go fromm (History cmd resp, model Concrete, Reason)
tom (History cmd resp, model Concrete, Reason, Property)
. Then it could be propagated viarunNParallelCommandsWithSetup
and relevant variants toprettyNParallelCommands
where a sensible message could be printed upon failure.Feedback would be highly appreciated!
(Thanks @jasagredo for the idea)