Open OlivierNicole opened 4 months ago
This is excellent Olivier, thanks a bunch! :pray: I was hoping to pull off something like this initially, but couldn't quite get the type-encoding working...
As much as I love and agree with the benefits and in particular the added safety, some of the additional user-input required sticks out to me, e.g., the 'a cmd
vs packed_cmd
distinction. I think I see why you need this:
arb_cmd
so that those can unify in the generator, and 'a cmd
so that we can exhaustively match in postcond
and capture the typed result precisely in run
At the same time, as an end-user I probably don't want to care for such a cmd
distinction... :shrug:
QCheck's Test.make
is an example of a hook that sweeps an existential type parameter under the rug for 95% of end-users. I would love if we could pull off something similar here.
I'm not too concerned about breaking the API at this point, if we can arrive at something nice.
There are only few users outside this repo (lockfree
and ortac-qcheck-stm
) which we should be able to help migrate.
This makes we wonder, whether it is possible to pull something like this off if we permit ourselves an API redesign - even a radical one.
arb_gen : cmd arbitrary
because that allowed to capture both generation and printer in one (like QCheck). In a redesigned API, this is not set in stone. For QCheck2, the show
function is decoupled from the generator.show
-function, run
, and pre/postcond
to the individual command, similar to Hedgehog Haskell or Hedgehog Scala. Not sure that would be a good fit for OCaml though. Kotlin's propCheck decouples them more along the lines of STM
... :thinking: At the same time, as an end-user I probably don't want to care for such a
cmd
distinction... 🤷 QCheck'sTest.make
is an example of a hook that sweeps an existential type parameter under the rug for 95% of end-users. I would love if we could pull off something similar here.
I would also love to sweep the existential under the rug, but this seems to me to clash with the fact that the spec writer needs to write a command generator. Since, as you say, the typechecker needs to unify all possible command types there.
Perhaps we could hide it in a function similar to Test.make
, but what would the type of that function be?
In the latest commit, the existential type is hidden behind a module with a nicer interface. The new compromise is that the sure must instantiate a small functor—see the new src/atomic/stm_tests.ml
—and the distinction between 'r cmd
and Cmd.any
still manifests itself when writing the generator.
The result is maybe syntactically closer to the existing, and hopefully the user does not have to deal with the cryptic error messages that GADTs can give.
A good news is that the same tricks (not implemented in this PR, but I have it on the side) allow to get rid of the calls to Obj.magic
in Lin
:
https://github.com/ocaml-multicore/multicoretests/blob/c9930453f1ab8ee38075c48182fe070dc454e05d/lib/lin.ml#L422-L427
without changing the tests that use the high-level Lin
interface in any way. Tests that use Lin.Internal
do have to be adapted (in much the same way as in this PR), but given the unstable nature of that interface, I feel this will be less of an issue.
As I was writing an STM specification for Dynarrays, I was a bit frustrated to have to add a catch-all
assert false
at the end ofpostcond
. So I went into an over-engineering rabbit hole and found that it can be avoided by making thecmd
type of specifications a GADT, which allows to specify the result type for commands. The typechecker is then able to deduce the result type when pattern matching on commands.As an example, I have rewritten
src/atomic/stm_tests.ml
to match the newSpec
signature.This has a number of benefits and some drawbacks.
Benefits
run
function, results no longer need to be wrappend into theRes
constructor. However, it is still necessary to also return an instance of'a ty_show
; I have not found a simpler way to make the results printable.postcond
function, it is no longer necessary to match on patterns likeRes ((Result (Int, Exn), _), v)
for the typechecker to deduce the right type forv
: the command name suffices. It is also no longer necessary to add a catch-all_ -> assert false
at the end.run
are safer because the result type is checked.Drawbacks
Defining a GADT in the specification adds its own boilerplate: most functions now need explicit type annotations with universal quantifiers; the user must also define an existential type
packed_cmd
and wrap the command generators in it. Such STM tests are harder to write when one is not familiar with qcheck-stm.And of course, another big drawback is that it changes the signature of STM specifications.
For these reasons, I open this PR as a draft, mostly to share the idea.