Opened by keathley 6 years ago
Thanks for the write up @keathley!
I am not that familiar with stateful checking so please take my feedback with a grain of salt. The examples above seem to have a lot of repetition and I don't like that. The ScalaCheck approach probably helps remove this repetition but imposing a module per command would not be very idiomatic. As I hinted on IRC, I would prefer each approach to rely on data structures and function composition.
Here is how I would rewrite the model bit with this in mind:
def command(%{conn: :empty}) do
  # A simple command with the atom ok as constant generator.
  :ok
  |> command(fn :ok -> start_redis() end)
  |> next_state()
end

def command(%{conn: conn, contents: contents}) do
  keys = Map.keys(contents)

  # Preconditions, next state and post conditions are attached to
  # commands only if necessary.
  get =
    one_of(keys)
    |> command(fn key -> Redix.command(conn, ["GET", key]) end)
    |> precondition(&continue_if_not_empty_conn/2)
    |> post_condition(fn state, key, result -> Map.fetch(state.contents, key) == result end)

  set =
    {binary(), binary()}
    |> command(fn {key, value} -> Redix.command(conn, ["SET", key, value]) end)
    |> next_state(&store_key/3)
    |> post_condition(fn _state, _key, result -> result == {:ok, "OK"} end)

  set_existing =
    {one_of(keys), binary()}
    |> command(fn {key, value} -> Redix.command(conn, ["SET", key, value]) end)
    |> next_state(&store_key/3)
    |> post_condition(fn _state, _key, result -> result == {:ok, "OK"} end)

  one_of([get, set, set_existing])
end
defp continue_if_not_empty_conn(%{conn: conn}, _key), do: conn != :empty
defp store_key(state, {key, value}, _result), do: put_in(state.contents[key], value)
This is not a complete rewrite but it should get the discussion moving. Because it is just functions and data, we can pass around commands and build them in many different ways.
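For concreteness, here is one hedged sketch of how such pipeline helpers could be backed by plain maps. The helper names mirror the example above but are hypothetical, not part of stream_data:

```elixir
# Hypothetical helpers backing the pipeline above; purely a sketch, not an
# existing stream_data API.
defmodule CommandHelpers do
  # A command is just a map holding its argument generator and its run function.
  def command(arg_gen, fun), do: %{args: arg_gen, run: fun}

  def precondition(cmd, fun), do: Map.put(cmd, :precondition, fun)

  # With no function given, the state is left untouched.
  def next_state(cmd, fun \\ fn state, _args, _result -> state end) do
    Map.put(cmd, :next_state, fun)
  end

  def post_condition(cmd, fun), do: Map.put(cmd, :post_condition, fun)
end
```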
Running the commands is a matter of passing the initial state and a reference to the function that generates the commands, in this case `command/1`.
Some remaining topics are discussed below.
One of the questions I have is why we need preconditions if the commands we generate are already tailored to the given state. For example, would we ever reach the precondition below, given that we don't generate Redix commands when the conn is empty?
def precondition(%{conn: :empty}, {:call, Redix, _, _}), do: false
Given we put the execution of the code inside closures, there is a chance we no longer need symbolic variables (again, grain of salt here, I likely don't know what I am talking about). But in a nutshell, if a command needs to store or read something previously stored, we could simply pass the state to the command function:
one_of(keys)
|> command(fn state, key ->
  result = Redix.command(conn, ["GET", key])
  {result, put_in(state.temp1, result)}
end)
I am not sure if storing symbolic variables in the state is a good idea because they may have different lifecycles. If storing them in the state is a bad idea, we could introduce something such as a session, which would have something closer to the symbolic variables' lifecycle. Somebody with more understanding has to chime in here with the semantics and the operations that need access to symbolic variables.
One of the nice things about symbolic calls is that it is easy to generate a trace in case the test fails. For example:
[{set,{var,1},{call,proc_reg_eqc,spawn,[]}},
{set,{var,2},{call,proc_reg,where,[c]}},
{set,{var,3},{call,proc_reg_eqc,spawn,[]}},
{set,{var,4},{call,proc_reg_eqc,kill,[{var,1}]}},
{set,{var,5},{call,proc_reg,where,[d]}},
{set,{var,6},{call,proc_reg_eqc,reg,[a,{var,1}]}},
{set,{var,7},{call,proc_reg_eqc,spawn,[]}}]
In the proposal above, all of the operations are hidden behind anonymous functions, which means we don't get this visibility. We don't even have a name to refer to the commands by.
One possible solution is to require each command to have a name. Something like:
:get
|> command(one_of(keys), fn key -> Redix.command(conn, ["GET", key]) end)
|> ...
Now, in case of failures, we can show the command name and generated args:
[
  {:get, "foo"},
  {:set, {"bar", "baz"}}
]
Another idea is to make `command` a macro and rely on do blocks:
:get
|> command(one_of(keys)) do
  key -> Redix.command(conn, ["GET", key])
end
|> ...
This would allow us to show the code that reproduces the failure - up to some extent: things like `conn` would still need to be extracted from the state, at least in the example above. The downside is that we now have macros and you can no longer pass function references to `command`.
This is what I have so far. Feedback is welcome!
Pre-conditions are needed because during shrinking commands will be removed from the list of commands that caused a failure. Removing these commands means that we may create a command sequence that is "invalid". Pre-conditions ensure that we don't attempt to run "invalid" commands during the shrinking process.
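To illustrate that point with a hypothetical sequence (plain data, not from an actual run):

```elixir
# Hypothetical shrinking scenario, written as plain data for illustration.
original_sequence = [
  {:set, {"foo", "bar"}},
  {:get, "foo"}
]

# A shrink attempt drops the :set step. Without a precondition on :get
# ("the key must exist in the model"), this now-invalid sequence would be
# executed; with the precondition, the candidate is rejected and shrinking
# tries a different simplification instead.
_shrunk_candidate = List.delete_at(original_sequence, 0)
```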
I've been reading through the QuickCheck papers, "QuickCheck Testing for fun and profit" and "Testing Telecoms Software with Quviq QuickCheck" in order to better understand the benefits of using symbolic calls. Here's a quote from "QuickCheck Testing for fun and profit" that I think sums it up nicely:
We made an early decision to represent test cases symbolically, by an Erlang term, rather than by, for example, a function which performs the test when called. Thus if a test case should call unregister(a), then this is represented by the Erlang term {call,erlang,unregister,[a]}—a 4-tuple containing the atom call, the module name and function to call, and a list of arguments. The reason we chose a symbolic representation is that this makes it easy to print out test cases, store them in files for later use, analyze them to collect statistics or test properties, or—and this is important—write functions to shrink them.
I think it would be ideal to use symbolic calls under the hood without having to expose them to the end user. I like the idea of co-locating all of the conditions for a call around a data structure like in your example @josevalim. IMO having to pass a generator into the command is awkward but I guess there's no way around that since we're using streams under the hood. Ideally we could just do something like this:
def get(key) do
  :get
  |> command()
  |> run(fn state -> Redix.command(state.conn, ["GET", key]) end)
  |> precondition(fn state -> state.conn != :empty end)
  |> next_state(fn state -> state end)
  |> postcondition(fn state, result -> result == Map.fetch(state.contents, key) end)
end

def set(key, value) do
  :set
  |> command()
  |> run(fn state -> Redix.command(state.conn, ["SET", key, value]) end)
  |> precondition(fn state -> state.conn != :empty end)
  |> next_state(fn state -> put_in(state, [:contents, key], value) end)
  |> postcondition(fn _state, result -> result == {:ok, "OK"} end)
end
I don't have a strong preference between using something like `command` vs. `run` to distinguish the actual command code; it just reads better to me. This API would allow us to generate the list of calls (as command structs or something similar) and retain all of the benefits of the symbolic language without forcing users to have to deal with it.
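As a rough illustration of that idea, the pipeline above could build up a plain struct. The field names here are hypothetical, not an existing stream_data data structure:

```elixir
# Hypothetical command struct built by the get/set pipelines above; only a
# sketch of the idea, not an actual stream_data API.
defmodule Command do
  defstruct [:name, :run, :precondition, :next_state, :postcondition]

  def command(name), do: %__MODULE__{name: name}
  def run(cmd, fun), do: %{cmd | run: fun}
  def precondition(cmd, fun), do: %{cmd | precondition: fun}
  def next_state(cmd, fun), do: %{cmd | next_state: fun}
  def postcondition(cmd, fun), do: %{cmd | postcondition: fun}
end
```

The runner would then only ever see `%Command{}` values, so it could print the command names and generated arguments on failure while keeping the symbolic representation internal.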
@keathley re: preconditions vs state, thanks for the clarification.
Here's a quote from "QuickCheck Testing for fun and profit" that I think sums it up nicely
Great job digging it up.
IMO having to pass a generator into the command is awkward.
Interesting. I actually like it a lot because I find the symbolic call API very confusing. It is a mixture of special tuples (such as `:call`), arguments that may or may not be generators, and straight-up values. I think having boundaries around those is nice.
If I understand your proposal correctly, you moved the whole definition of the command to after the generator is called. If that's the case, I would move the `command` call out, something like:
def command(%{conn: :empty}) do
  command(__MODULE__, :started_redis, [])
end

def command(%{conn: conn, contents: contents}) do
  keys = Map.keys(contents)

  # The args in command is always a list of generators
  one_of([
    command(__MODULE__, :get, [one_of(keys)]),
    command(__MODULE__, :set, [binary(), binary()]),
    command(__MODULE__, :set, [one_of(keys), binary()])
  ])
end
def get(key) do
  fn state -> Redix.command(state.conn, ["GET", key]) end
  |> precondition(fn state -> state.conn != :empty end)
  |> postcondition(fn state, result -> result == Map.fetch(state.contents, key) end)
end

def set(key, value) do
  fn state -> Redix.command(state.conn, ["SET", key, value]) end
  |> precondition(fn state -> state.conn != :empty end)
  |> next_state(fn state -> put_in(state, [:contents, key], value) end)
  |> postcondition(fn _state, result -> result == {:ok, "OK"} end)
end
Does this make sense?
A command can be a simple anonymous function or something that we build after piping through precondition/next_state/post_condition.
I am still not sure if something needs to be done about symbolic variables.
I like the idea of having `command/3` that takes a module, a function, and a list of generators that will be used to generate a list of arguments.
However I am not sure about having commands be a fun that is enriched with precondition, next state, postcondition. Do we always need all three? If we do, maybe we should think about a way of ensuring that and making it easier to do and understand.
I also wanted to mention that I am still not sure whether this approach of having a precondition and postcondition for each command is more repetitive than having precondition/postcondition hooks like the original proposal. I guess the original proposal resembles an FSM somehow, where you have a function that decides how to handle commands and states. Example from the initial proposal: we might not want to ever issue any command except "connect" against a disconnected Redis instance. I can't form a preference between the two approaches yet though.
I also wanted to mention that I am still not sure whether this approach of having a precondition and postcondition for each command is more repetitive than having precondition/postcondition hooks like the original proposal.
Just to clarify: precondition/postcondition should only be declared if necessary.
One question worth asking is whether the `state` in precondition/next_state/postcondition is always the same. If so, we can encode those things in a single function and use return tuples:
def command(%{conn: :empty}) do
  command(__MODULE__, :started_redis, [])
end

def command(%{conn: conn, contents: contents}) do
  keys = Map.keys(contents)

  # The args in command is always a list of generators
  one_of([
    command(__MODULE__, :get, [one_of(keys)]),
    command(__MODULE__, :set, [binary(), binary()]),
    command(__MODULE__, :set, [one_of(keys), binary()])
  ])
end

def get(%{conn: :empty}, _key), do: :precondition_failed

def get(state, key) do
  result = Redix.command(state.conn, ["GET", key])
  assert Map.fetch(state.contents, key) == result
  {:ok, state}
end

def set(%{conn: :empty}, _key, _value), do: :precondition_failed

def set(state, key, value) do
  assert {:ok, "OK"} = Redix.command(state.conn, ["SET", key, value])
  {:ok, put_in(state.contents[key], value)}
end
What I like in @josevalim's last example is the combination of executing the command, checking the postcondition via `assert`, and returning a (possibly modified) state. We do not need to return a tuple, since there is no reason to return something instead of `:ok`: either the postcondition fails and the assertion blows up, or everything is ok by definition. This makes it really short and to the point.
But this does not work. During the symbolic execution (i.e. command sequence generation), the `next_state` function decides how the state changes after a working command call. Therefore we must distinguish between real command execution against the System Under Test and the modification of the current test status, which is handled in `next_state`. These two separate phases of command generation and command execution are shown nicely here: http://propertesting.com/book_stateful_properties.html#_execution_model.
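A rough sketch of those two phases could look like the following. The command maps and helper functions are hypothetical, not a stream_data or PropEr API:

```elixir
# Rough sketch of the two phases described above, assuming hypothetical
# command maps with :run, :next_state and :postcondition functions.
defmodule TwoPhaseSketch do
  import ExUnit.Assertions

  # Phase 1: generation. The system under test is never touched, so the
  # command only ever sees a placeholder result; only the model state evolves.
  def generate(_pick_command, _state, 0, acc), do: Enum.reverse(acc)

  def generate(pick_command, state, n, acc) do
    cmd = pick_command.(state)
    state = cmd.next_state.(state, :symbolic_result)
    generate(pick_command, state, n - 1, [cmd | acc])
  end

  # Phase 2: execution. Each command runs against the system under test, its
  # postcondition is checked, and the model state is replayed in parallel.
  def execute(commands, initial_state) do
    Enum.reduce(commands, initial_state, fn cmd, state ->
      result = cmd.run.(state)
      assert cmd.postcondition.(state, result)
      cmd.next_state.(state, result)
    end)
  end
end
```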
In PropEr/PropCheck all these callbacks are general functions which distinguish, via their arguments, which command is currently under consideration. This is the same pattern as in `GenServer`. EQC nowadays uses a different approach, which groups these callbacks by command, where each command is an atom (see http://quviq.com/documentation/eqc/eqc_statem.html). In the spirit of a modern EQC state machine we could formulate it this way:
def get_args(state), do: fixed_list([constant(state.conn), keys()])

def get_pre(%{conn: :empty}), do: false
def get_pre(_state), do: true

def get_next(state, _args, _result), do: state

def get_post(state, [_conn, key], result) do
  assert Map.fetch(state.contents, key) == result
end

def get(conn, key) do
  Redix.command(conn, ["GET", key])
end

# generate the arguments for the command
def set_args(state), do: fixed_list([constant(state.conn), keys(), values()])

def set_pre(%{conn: :empty}), do: false
def set_pre(_state), do: true

def set_post(_state, _args, result), do: assert result == {:ok, "OK"}

def set_next(state, [_conn, key, val], _result) do
  # post condition is true, result may be symbolic
  put_in(state.contents[key], val)
end

def set(conn, key, val) do
  Redix.command(conn, ["SET", key, val])
end
What I like in this approach is that the code for one command stays together. This makes it easier to understand the model piece by piece. Using only regular functions reduces the magic for test code writers. But it feels a little bit off from an Elixirish point of view. How about applying nested definitions, as in ExUnit's `describe`:
specify :get do
  def args(state), do: fixed_list([constant(state.conn), keys()])

  def pre_condition(%{conn: :empty}), do: false
  def pre_condition(_state), do: true

  def next_state(state, _args, _result), do: state

  def post_condition(state, [_conn, key], result) do
    assert Map.fetch(state.contents, key) == result
  end

  def execute(conn, key) do
    Redix.command(conn, ["GET", key])
  end
end
The `specify` macro would simply add the command name (here `get`) as a prefix to the internally defined functions. This way, we can add more optional features to the command specification later on. Weights for commands are one example, so that commands are not generated by `one_of` but by `frequency`, to execute some commands more often than others. In the EQC documentation there are some more optional callbacks such as features, shapes, return, adapt, ... which might be interesting when we learn more about stateful testing with StreamData.
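To make the weight idea concrete, here is a hedged sketch. The `*_weight` naming is invented for this example (it is not an EQC or stream_data convention), and it reuses the hypothetical `command/3` helper from the earlier sketches:

```elixir
# Hypothetical per-command weights feeding StreamData.frequency/1;
# the *_weight naming and command/3 helper are illustrative only.
def get_weight(_state), do: 3
def set_weight(_state), do: 1

def command(state) do
  frequency([
    {get_weight(state), command(__MODULE__, :get, [one_of(Map.keys(state.contents))])},
    {set_weight(state), command(__MODULE__, :set, [binary(), binary()])}
  ])
end
```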
Thanks @alfert for the input!
We do not need to return a tuple, since there is no reason to return something instead of :ok
I returned a tuple because the command may also return `:precondition_failed`.
But this does not work. During the symbolic execution (ie. command sequence generation), the next_state function decides how the state changes after a working command call.
Apologies but I don't see how this is not possible with the proposal so far. In the proposal, the goal was to embed the working command call and next state in a single step. The state is updated only if the command succeeds. Given the graph you referenced, the only thing that is not possible to distinguish is an error in the command itself vs an invalid postcondition. However, it is unclear if any of the libraries use this information in any meaningful way. If they do, we could also allow the command to return `:postcondition_failed`.
The specify macro would simply add the command name (here get) as a prefix to internally defined functions.
On IRC, we have mentioned that the best approach would be to avoid DSLs and rely mostly on data and functions/callbacks. :)
In any case, it is interesting to see that EQC also went in the direction where the knowledge is grouped by the command (get/set) and not by the state machine (pre/post/next_state).
Apologies but I don't see how this is not possible with the proposal so far. In the proposal, the goal was to embed the working command call and next state in a single step. The state is updated only if the command succeeds. Given the graph you referenced, the only thing that is not possible to distinguish is an error in the command itself vs an invalid postcondition.
The problem I see is that the `next_state` function is called twice: first during symbolic execution while generating the commands, and secondly while executing the commands. In both cases, `next_state` decides how the model evolves. If you glue test execution to the model update, then you either have to implement the model update twice or you execute the system under test during command generation, which does not work by design.
But you are right that my comment was a little too harsh. Melting command execution and checking the postcondition together via `assert` is indeed a perfect match and removes the need for an explicit `post_condition` function. The existence of `post_condition` is apparently a heritage of the boolean property approach, which is no longer needed in StreamData thanks to its nicer `assert` macros. But I would suggest moving the state update into its own function (i.e. `next_state`) to avoid implementing it twice.
Just to give an example where this matters: In an extension of the Redis example we could implement a flush command, which drops the entire database. This means that the model state must also be empty thereafter. So any call to `get` must fail until we start again with calling `set` to fill the database. This must be recorded in the generation phase and then executed and observed later during test execution. Granted, in the current Redis example this is not reflected, as nobody cares about the values stored in Redis and the model as long as they are the same. Closing the connection would be another example where the state is crucial for command generation. To summarise, preconditions (and in EQC and PropCheck also the weights) depend on the state, and thus the command generation depends on the state as well, so the state should be changeable during the generation phase and not only during the execution phase.
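A hedged sketch of such a flush command in the grouped style from above (the naming follows the earlier `get_`/`set_` examples and is not an existing API):

```elixir
# Hypothetical flush command in the grouped callback style sketched earlier.
def flush_args(state), do: fixed_list([constant(state.conn)])

def flush_pre(%{conn: :empty}), do: false
def flush_pre(_state), do: true

# The model must forget all keys already during generation; otherwise later
# get commands would be generated for keys that no longer exist in Redis.
def flush_next(state, _args, _result), do: %{state | contents: %{}}

def flush_post(_state, _args, result), do: result == {:ok, "OK"}

def flush(conn), do: Redix.command(conn, ["FLUSHDB"])
```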
On IRC, we have mentioned that the best approach would be to avoid DSLs and rely mostly on data and functions/callbacks. :)
Good to hear that! Seems that I should at least be a lurker on IRC (where I have never been before...) to be up to date :)
Just to give an example where this matters: In an extension of the Redis example we could implement a flush command, which drops the entire database. This means that model state must also be empty thereafter. So any call to get must fail until we start again with calling set to fill the database
From my understanding - and please keep in mind that I understand very little about stateful models - this implies we first generate the commands and apply them only later. Couldn't we apply the commands as we generate them instead? Or would we miss some important feature? Maybe doing so would complicate space exploration or shrinking?
In any case, I think this shows there is a dichotomy between the system under test and the model that is not well-defined in the latest proposals.
Good to hear that! Seems that I should at least be a lurker on IRC (where I have never been before...) to be up to date :)
Btw, I don't think IRC is required. It is probably a good idea to document here the discussions done there anyway.
From my understanding - and please keep in mind that I understand very little about stateful models - this implies we first generate the commands and apply them only later. Couldn't we apply the commands as we generate them instead? Or would we miss some important feature? Maybe doing so would complicate space exploration or shrinking?
The reason for splitting the generation and execution phases is to enable shrinking and to understand what is going on. The idea is that we generate a list of calls to the system under test. This is one test case for the property. After the generation, we execute this list. If an error happens, then this particular list is shrunk towards a smaller list. Generation, shrinking and execution all depend on the model. Then the next list of calls is generated and executed, and so on. If an error happens, we can see the history of (symbolic) calls to the system under test and how the model evolves in parallel. This helps to understand the error. Getting nice reporting with StreamData is an open issue, in particular if the command list is rather long.
The very basic idea of the entire approach is that the model reveals relevant features of the system under test. You can have several models for the same system for analysing different aspects, e.g. making the model gradually more complex. The idea behind it is the same as a bisimulation. If the system under test behaves as the model predicts, then the system under test works fine.
Thank you, @alfert! Do you think checking for preconditions is a property of the model or of the SUT?
Isn't only the postcondition check a property of the SUT? The point of keeping the model is to not have to check anything from the system except for postconditions afaiu?
Apologies but I am getting a bit confused here.
In both the PropEr/EQC examples, the `next_state` hook receives the value of the execution, so doesn't it mean we need to execute in order to retrieve the `next_state`?
@josevalim Do you think checking for preconditions is a property of the model or of the SUT?
Checking the precondition is required during the generation phase to generate useful calls. Think about a system that allows deleting items. The precondition would check that such an item exists within the system, i.e. the model must represent that the system has stored this item and therefore the generated call is legal/sensible/useful. It should trivially hold during execution time, and it is crucial for shrinking, because shrinking may modify the system and the model state in a way that makes a formerly legal call no longer legal.
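A minimal sketch of such a precondition, in the grouped style used earlier (the delete command itself is hypothetical and not part of the Redis model above):

```elixir
# Hypothetical delete command: the call is only generated (and only kept
# while shrinking) if the model knows the key, so "delete a non-existing
# item" is never attempted against the system under test.
def delete_pre(state, [_conn, key]), do: Map.has_key?(state.contents, key)

def delete(conn, key), do: Redix.command(conn, ["DEL", key])
```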
@whatyouhide Isn't only the postcondition check a property of the SUT?
Yes, exactly. The postcondition checks whether the system is in a proper state after executing the call, i.e. the return value of the call must be expected and derivable from the model. Here we can use `assert` instead of boolean return values.
@josevalim In both the Proper/EQC example, the next_state hook receives the value of the execution, so doesn't it mean we need to execute in order to retrieve the next_state?
Now I am confused. What do you mean by execute in order? Perhaps this helps: since the state is not stored during generation (this would not be useful, because results of symbolic calls could be part of the state but they do not have a meaning), the state must be replayed during execution time and, even more importantly, during shrinking, since shrinking modifies the initially generated list of commands.
Now I am confused. What do you mean by execute in order?
What I meant to ask is this: because the `next_state` function receives the execution result, it means the initial generation of commands needs to happen side by side with the execution, because otherwise we don't have the result to compute the `next_state`. I guess for shrinking we could use the results from execution but it is also expected that once we change the order or remove commands during shrinking, the results will change, and by then the post condition may be necessary to check the results are still valid.
I plan to take a look at the state machine implementation in proper to help understand those questions.
Ah, I see your point. It is different: during generation you simulate the calls to the system, but you use next_state to record the expected state. In next_state you cannot evaluate the result of the symbolic calls. You can only rely on the arguments, the current model state and the expected result from the call (executed later in the future). That is the reason why only the postcondition evaluates the result, and it is only called in the execution phase.
For Proper, there is also the tutorial with the movie server explaining how to specify a model with a complex state (i.e. including deletion): http://proper.softlab.ntua.gr/Tutorials/PropEr_testing_of_generic_servers.html This might help to understand the approach.
I want to address some of the concerns / points about repetition in the model code. In my experience models involve a non-trivial amount of code. As a reference point, a basic model for testing our raft implementation is ~300 loc. Based only on experience and intuition, grouping preconditions, postconditions, and next-state functions with each "command" won't help reduce repetition. You almost always need a precondition for a command. The same is true for postconditions. Defining a single callback like:
def precondition(%{conn: :empty}, {:call, Redix, _, _}), do: false
is going to be fewer lines of code than needing to specify the same precondition for each command, like:
def connected?(%{conn: :empty}), do: false
def connected?(_), do: true
:get
|> command
|> precondition(&connected?/1)
:set
|> command
|> precondition(&connected?/1)
The benefit to grouping isn't that there are fewer lines of code. It's that the meaningful pieces of code are located together. The co-location would potentially make the model and all of the preconditions and postconditions easier to reason about once the model has grown to a reasonable size.
because the next_state function receives the execution result, it means the initial generation of commands needs to happen side by side with the execution, because otherwise we don't have the result to compute the next_state. I guess for shrinking we could use the results from execution but it is also expected that once we change the order or remove commands during shrinking, the results will change, and by then the post condition may be necessary to check the results are still valid.
During the generation phase, PropEr and EQC only pass in a symbolic value to `next_state`, usually looking like `{var, N}`, though it could also be the result of calling `make_ref()`. What you get is an opaque placeholder for what the system would have produced at run-time. If your model makes use of that value, it must do so as an opaque term.
This means, for example, that if you call `gen_server:start_link(...)`, you get an opaque term `T` and you cannot extract the pid from an `{ok, Pid}` tuple result. You just can't. When people then call the remaining functions of the model, they therefore cannot pass `T` into a `{call, my_server, some_call, [T]}`, because most of those functions do not accept `{ok, Pid}` as a value, just `Pid`.
The common pattern is to create a shim module that will wrap common functions, i.e.
start_link(Args) ->
    {ok, Pid} = my_server:start_link(Args),
    Pid.
Then the model can use the opaque value `T` as if it were an actual `Pid` returned from the system (as long as you don't do things like `is_pid(T)`, since that will fail).
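For the Elixir-flavoured discussion in this thread, the same shim pattern would translate to something like the following; `MyServer` is a placeholder for whatever module is under test:

```elixir
# Hedged Elixir version of the shim pattern described above; MyServer is a
# placeholder for the real module under test.
defmodule MyServerShim do
  def start_link(args) do
    {:ok, pid} = MyServer.start_link(args)
    pid
  end
end
```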
My personal opinion from my current understanding of stateful property testing is that if you are generating command sequences with symbolic call (so they can be recorded to disk, manipulated as data, and so on), you end up requiring some user-provided glue to make the symbolic execution work with the real one, and pretending that only one execution takes place is not going to work.
If you decide to generate things differently so that generation and execution happen at the same time, that can possibly work, but shrinking may be more difficult because modifications to the existing chain will not be deterministic.
Take for example a sequence where I generate a command based on a random value provided by the server under test. If the value is in the interval `[0.0 .. 0.5[`, I generate a command to upgrade to a paid account. If the value is in the interval `[0.5 .. 1.0]`, I generate a command to delete my account.
Under PropEr and QuickCheck, this cannot be done, since the symbolic execution would not let me use the server-returned value to influence command generation. Under a mechanism where server-returned values can be used to influence command generation, it's fair game and there's no way to know what I'll get back.
You have to be kind of careful about how you approach determinism there.
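As a hedged illustration of the problem (every name here is made up):

```elixir
# Illustrative only: command generation that peeks at the live system.
# Symbolic generation forbids this by design; if it were allowed, replaying
# or shrinking the same sequence could take a different branch every time.
def next_command(sut) do
  # SUT.random_value/1 is a placeholder for any value produced by the
  # running system under test.
  if SUT.random_value(sut) < 0.5 do
    :upgrade_to_paid_account
  else
    :delete_account
  end
end
```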
I just finished a first version of an EQC-like grouping with lots of debug output in the repo https://github.com/alfert/counter:

* statem.ex
* https://github.com/alfert/counter/blob/master/test/statem_test.exs

Some remarks on the current state of the implementation:

* it still relies on `if` statements
* still missing are default implementations of `_pre` and `_post` as `true` and of `_next` as identity, a better extraction of existing functions from `.exs` files (I don't like the reloaded module log output), more type specs and more documentation

I am interested in your feedback. Have a nice weekend!
Since it's been a bit I thought I would provide an update from my side. Currently we're just using proper for all of our stateful testing. It's working well for us so I'm not sure that we need this functionality to exist in stream_data. I'm happy to weigh in with opinions if that is helpful but otherwise I'm not going to be pursuing this further. Thanks for all the input and effort :heart:.
@alfert I have exported your implementation as a separate library https://github.com/hauleth/stream_state. I hope that you do not mind. I also have done some improvements:

* commands are detected via the `module.module_info/1` function
* optional `*_pre` and `*_post`
* better failure messages:

Postcondition failed.

Command: StreamStateTest.inc()
State: :one
Result: 6

History:
StreamStateTest.clear() => :ok
StreamStateTest.inc() => 1
StreamStateTest.inc() => 2
StreamStateTest.inc() => 3
StreamStateTest.inc() => 4
StreamStateTest.inc() => 5

TBD:

* optional `*_next` ~ already done
I'm opening this issue based on discussions in IRC to start working towards a proposal for stateful model checking powered by stream_data.
Existing solutions
I'll try to provide an overview of the current solutions. I'm going to use Proper since that's what I'm most familiar with. Proper and EQC have almost identical APIs but EQC provides extra features. Proper is a good baseline though. I'm by no means an expert on proper so I may get some of these details wrong. I'm sure @fishcakez can correct me though ;).
Let's say that we wanted to test getting and setting keys in Redis. To do that we can use a state machine test. This involves setting up a test and then implementing some callbacks. I've added documentation to each callback to try to provide context on how it works. If you want to run the test locally the full example is here: https://github.com/keathley/redix_model.
The way that proper works is by first generating a list of symbolic calls based on the state space specified by the `command` and `next_state` callbacks. After proper has generated the symbolic calls it executes each command and checks postconditions. If it finds an error it shrinks the generated commands list. The most effective means of doing this is by removing commands. This is one of the reasons we need our preconditions. It's likely that by removing certain commands we'll end up with a command that is invalid based on our current state. We use preconditions to avoid that.

Separating the generation and execution of commands means that shrinking is easier to control and somewhat more deterministic, from what I understand.
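For reference, the top level of such a Proper state machine test typically looks roughly like the following; this is sketched with PropCheck syntax and may differ in detail from the linked example:

```elixir
# Sketch of a typical top-level stateful property using PropCheck/PropEr;
# the model callbacks (command/1, next_state/3, ...) live in this module.
defmodule RedixModelTest do
  use ExUnit.Case
  use PropCheck
  use PropCheck.StateM

  property "redis GET/SET behave like the model" do
    forall cmds <- commands(__MODULE__) do
      {history, state, result} = run_commands(__MODULE__, cmds)

      (result == :ok)
      |> when_fail(
        IO.puts("""
        History: #{inspect(history)}
        State: #{inspect(state)}
        Result: #{inspect(result)}
        """)
      )
      |> aggregate(command_names(cmds))
    end
  end
end
```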
A new API?
I've only done model checking using the "EQC" API so I'm not sure that I have a novel way of approaching the problem from an API standpoint. ScalaCheck uses similar semantics but each command is a class and provides its own preconditions, postconditions, and next state functions. That division seems like it would help keep things tidy as a model grows. I think it would be ideal to use simple functions but it's been my experience that models can get quite large so it would be nice to have a more prescribed way to manage them.
I hope that this helps provide some context. I'm very excited to see what y'all think because it's been my experience that model checking is where property testing really shines.