Specifications can be terser through more type trickery

OlivierNicole commented 4 months ago

As I was writing an STM specification for Dynarrays, I was a bit frustrated to have to add a catch-all assert false at the end of postcond. So I went into an over-engineering rabbit hole and found that it can be avoided by making the cmd type of specifications a GADT, which allows to specify the result type for commands. The typechecker is then able to deduce the result type when pattern matching on commands.

As an example, I have rewritten src/atomic/stm_tests.ml to match the new Spec signature.

This has a number of benefits and some drawbacks.

Benefits

When writing the run function, results no longer need to be wrappend into the Res constructor. However, it is still necessary to also return an instance of 'a ty_show; I have not found a simpler way to make the results printable.
When writing the postcond function, it is no longer necessary to match on patterns like Res ((Result (Int, Exn), _), v) for the typechecker to deduce the right type for v: the command name suffices. It is also no longer necessary to add a catch-all _ -> assert false at the end.
Functions like run are safer because the result type is checked.

Drawbacks

Defining a GADT in the specification adds its own boilerplate: most functions now need explicit type annotations with universal quantifiers; the user must also define an existential type packed_cmd and wrap the command generators in it. Such STM tests are harder to write when one is not familiar with qcheck-stm.

And of course, another big drawback is that it changes the signature of STM specifications.

For these reasons, I open this PR as a draft, mostly to share the idea.

jmid commented 4 months ago

This is excellent Olivier, thanks a bunch! :pray: I was hoping to pull off something like this initially, but couldn't quite get the type-encoding working...

As much as I love and agree with the benefits and in particular the added safety, some of the additional user-input required sticks out to me, e.g., the 'a cmd vs packed_cmd distinction. I think I see why you need this:

a packed (existential) type for arb_cmd so that those can unify in the generator, and
a GADT 'a cmd so that we can exhaustively match in postcond and capture the typed result precisely in run

At the same time, as an end-user I probably don't want to care for such a cmd distinction... :shrug: QCheck's Test.make is an example of a hook that sweeps an existential type parameter under the rug for 95% of end-users. I would love if we could pull off something similar here.

I'm not too concerned about breaking the API at this point, if we can arrive at something nice. There are only few users outside this repo (lockfree and ortac-qcheck-stm) which we should be able to help migrate. This makes we wonder, whether it is possible to pull something like this off if we permit ourselves an API redesign - even a radical one.

Initially I went with arb_gen : cmd arbitrary because that allowed to capture both generation and printer in one (like QCheck). In a redesigned API, this is not set in stone. For QCheck2, the show function is decoupled from the generator.
A more radically different API could couple the generator, show-function, run, and pre/postcond to the individual command, similar to Hedgehog Haskell or Hedgehog Scala. Not sure that would be a good fit for OCaml though. Kotlin's propCheck decouples them more along the lines of STM... :thinking:

OlivierNicole commented 4 months ago

At the same time, as an end-user I probably don't want to care for such a cmd distinction... 🤷 QCheck's Test.make is an example of a hook that sweeps an existential type parameter under the rug for 95% of end-users. I would love if we could pull off something similar here.

I would also love to sweep the existential under the rug, but this seems to me to clash with the fact that the spec writer needs to write a command generator. Since, as you say, the typechecker needs to unify all possible command types there.

Perhaps we could hide it in a function similar to Test.make, but what would the type of that function be?

OlivierNicole commented 4 months ago

In the latest commit, the existential type is hidden behind a module with a nicer interface. The new compromise is that the sure must instantiate a small functor—see the new src/atomic/stm_tests.ml—and the distinction between 'r cmd and Cmd.any still manifests itself when writing the generator.

The result is maybe syntactically closer to the existing, and hopefully the user does not have to deal with the cryptic error messages that GADTs can give.

OlivierNicole commented 4 months ago

A good news is that the same tricks (not implemented in this PR, but I have it on the side) allow to get rid of the calls to Obj.magic in Lin: https://github.com/ocaml-multicore/multicoretests/blob/c9930453f1ab8ee38075c48182fe070dc454e05d/lib/lin.ml#L422-L427 without changing the tests that use the high-level Lin interface in any way. Tests that use Lin.Internal do have to be adapted (in much the same way as in this PR), but given the unstable nature of that interface, I feel this will be less of an issue.

ocaml-multicore / multicoretests

Specifications can be terser through more type trickery #456

Benefits

Drawbacks