Closed gasche closed 6 years ago
Yes, I (finally!) agree this is useful. Crowbar had bind for a while, then it got removed when we didn't see a use for it, but this is a compelling example.
I agree that bind
is more natural than join
here. I'm not convinced there's much point to join
.
I do think it's important to discourage use of bind
when map
suffices. There's an alternative fuzzing engine in the hype
branch (named after Hypothesis, from which it borrows some ideas), which makes heavy use of the static structure of generators, allowing it to mutate test cases by e.g. swapping subcases that are built with the same generator. This doesn't work with bind
, because of the lack of static structure, so usual monadic code like a >>= fun a -> b >>= fun b -> ...
will perform much worse than map [a; b] (fun a b -> ...)
.
The technique I tried to discourage use was choosing a longer name:
val depending_on : 'a gen -> ('a -> 'b gen) -> 'b gen
but some explanatory comments would be useful.
Regarding the name, what about dynamic_bind
?
I can try to write some documentation to explain clearly why this operator should be discouraged.
I will also review my implementation to see if having a Bind
combinator (instead of defining bind from a primitive Join
) makes things better (eg. in terms of having a nice printing behavior by default) or, on the contrary, more complex (the latter would be my guess).
There is also the question of whether such a Bind
should get an n-ary interface like Map
, but I would rather bet that the simple unary interface suffices for an uncommon operator.
@stedolan I updated the PR. I used dynamic_bind
, and wrote an overly-verbose documentation comment to discourage its use. Feel free to reword it (after a merge, or by writing a suggested rewording here). In the end I did use a Bind
constructor, as the choice of derived printer is actually more natural for Bind
than for Join
-- but the printing does not change.
I also restructured the example a bit, by adding a printer for generated type representations and a printer for generated values, as I found these two features very useful in my actual use-case, and I would like to encourage users to take inspiration.
Before, tweaking the PR to break the serializer or deserializer would print the following test failure (for example):
When given the input:
(#3 #3 #1 ?) => [[true; false; false]]
the test failed:
different
now it prints
When given the input:
(List(List(Bool))) => [[true; false; false]]
the test failed:
[[true; false; false]] != [[false; false; true]]
Thanks! (Apologies for the delay in merging)
I would like to test a library that provides type-directed combinators (
'a encoding
) for serialization and deserialization. The natural way to crowbar this is to build a generator of pairs at the type ('a gen * 'a encoding
), and then test that any generated value will pass serialization->deserialization unchanged.Dealing with such nested generators requires monadic structure, either with a
join : 'a gen gen -> 'a gen
function or a more traditionalbind
, which are currently absent from the Crowbar interface. This PR adds those two functions, plus an example demonstrating this use-case (testing type-directed combinators for serialization) in a toy case.Remarks:
bind
andreturn
when they should have usedmap
instead (which gives more freedom to the implementor by making it possible to statically explore the generator's shape). I could try to write a comment to encourage people not to use these new functions unless they are dealing with nested generators, or I could removebind
and expose onlyjoin
, which is more explicit. (I started with onlyjoin
, but it turns out that my example is nicer withbind
.)size
parameter in thegenerate
function: a comment explaining what it is doing at the top would have been nice (right now there is a comment in the middle of the function that looks (understandably) confused itself). I just usesize - 1
in my two recursive calls, which ensures termination but may not be the right thing either. (Maybe(size / 2)
in both cases?) Explaining reasons for choosing something could be nice. (For example, if there are several recursive calls, can we recommend that the sum of their parameters equal the initial size value?).(%a) => %a
(first the generated generator, then the generated value).minimize_depths
function by considering thatJoin
should be filtered out wheneverMap
is filtered out. One could even consider adding a fourth filtering category where Map is allowed, but Join is not.