generation of functions

c-cube commented 7 years ago

read more about JST's solution, and the original quickcheck paper
something along: to generate random int -> foo, generate a random decision tree on integers (based on intervals and, say, special cases like 0), then generate a random foo in each leaf

c-cube commented 7 years ago

cc @jmid might be of interest to you, too

jmid commented 7 years ago

Sorry for being a bit late with input. I looked a bit into this topic a year ago. I know of two relevant previous works on this topic. The first is the following paper Koen Claessen Shrinking and showing functions: (functional pearl) Haskell'12 http://dl.acm.org/citation.cfm?id=2364516

Koen's talk is available here: https://www.youtube.com/watch?v=CH8UQJiv9Q4

A second one is a talk by Ulf Norell on 'Generating Random Functions'. I know of no accompanying paper but there is a video presentation here: https://vimeo.com/143848099

c-cube commented 7 years ago

There is also the approach used by JST: https://blogs.janestreet.com/quickcheck-for-core/ (based, apparently, on building finite decision trees on the parameters). It looks quite nice to me.

c-cube commented 7 years ago

need some tests now.

c-cube commented 7 years ago

Hey @jmid I know your course has probably ended a while ago, but would you per chance have a bit of time to review this? I'd be glad to have your opinion on the API for functions…

jmid commented 7 years ago

Very cool contribution - thanks for this! I'm hard pressed to make the coming SAS paper deadline (8 days from now). I'll be happy to take a closer look after that. If you are still looking for example tests Ulf Norell has a cool and cute example of inferring game strategies (as functions) for the game of 'Nim' in the above linked video.

jmid commented 7 years ago

Ok, I've given the function generators a spin. I've uploaded the outcome here: https://github.com/jmid/qcheck-fun

If you have a look at the outcome you will probably agree with these observations:

shrinking does not work as well over functions (counterexamples are even greater with int generator)
the printed functions are partial (Claessen in Haskell and Norell's eqcfun in Erlang prints them with a final catch all case)
can one express Norell's prop_losing as a recursive invocation of QCheck_runner.run_tests?

This last one is a broader question: can QCheck safely invoke itself recursively? In this setting it is formulated as: no strategy from a pile of 19 matches in the game of Nim is winning ~ "for all strategies (functions), there exists another strategy (function, which we can find by an inner QCheck run), such that the former loses"

c-cube commented 7 years ago

Ah, interesting, thank you! However, look at 2085ae5c1b9a97264a6cdce193afc892750ed3dd (not released yet) which addresses the current issue. Hopefully the benchmarks you list can be improved by using the new API. Do you think it's possible? The combinators such as fun1 will be renamed…

Note: what's prop_losing? It doesn't seem to appear in the repo…

c-cube commented 7 years ago

Also, I'm not sure what qcheck calling itself would involve, sounds dangerous… Maybe a special combinator like check : 'a arbitrary -> ('a -> bool) -> bool could be used, but not sure how that would behave. Or, perhaps, a exists : ?tries:int -> ('a -> bool) -> 'a to be used inside a property.

c-cube commented 7 years ago

Actually, could you open another issue for recursive invocation? It seems what you want is actually qcheck-based synthesis (which can fail in its own ways), so it's clearly non trivial and deserves its own issue.

c-cube commented 7 years ago

I need to take a look at the Claessen's paper you linked. The API I added (inspired from JST's quickcheck) is quite sophisticated to use, so maybe something still based on a finite mapping a -> b (using a map or table underneath + a default value) would be better overall for the user? THe question is hwo to shrink, but I suppose it's related to Claessen's paper.

jmid commented 7 years ago

Ah, sorry for messing up: I mistakingly thought I was already using your latest additions. I will check it out and adjust the examples (I don't expect big changes based on the API of Observables.

prop_losing is defined (in Erlang) on Norell's slides, 15min into this video https://vimeo.com/143848099 In hinsight that comment is only understandable to someone (read: Jan) who has just re-watched Ulf's talk :-)

c-cube commented 7 years ago

Ok, ping me when you have an update! :-)

jmid commented 7 years ago

Hm. I checked out and built the latest branch. I can easily port the first two examples, by prefixing with Observable. in the generators and Fn.apply in the properties. However in the third example something seems to hang in the top-level of that build when I load:

(* An example (false) property *)
let prop_map_filter =
  let int_gen = int (* small_nat *) in 
  Test.make ~name:"map filter" ~count:1000
    (triple 
       (fun1 Observable.int int_gen)
       (fun1 Observable.int bool)
       (list int_gen))
     (fun (f,p,xs) -> List.map (Fn.apply f) (List.filter (Fn.apply p) xs)
       = List.filter (Fn.apply p) (List.map (Fn.apply f) xs))
 QCheck_runner.run_tests ~verbose:true [prop_map_filter];;

If I instead define int_gen as small_nat in line 3 it quickly returns with success :-o I admit I'm using the new Observable+Fn API here without understanding things under the hood.

c-cube commented 7 years ago

int_gen is probably far too big?

Does it shring properly now? I'm not totally satisfied with the API, it's more cumbersome to use; of course it's hard to provide ('a -> 'b) arbitrary with shrinking given that functions are opaque…

jmid commented 7 years ago

Before opening a new issue on recursive invocation, I would like to understand the API that Norell is using for his second Nim example. As I understand it, he defines a property-based test prop_winning which is parameterized by a number and a strategy. He then uses prop_winning inside is_winning (also parameterized) to approximate whether something is a winning strategy (survives 500 tests). Finally he uses is_winning inside prop_losing to test whether all strategies are losing for a given number - this also quantifies over strategies. This appears reminiscent of nested quantifiers in my mind (but I am no logician), hence the idea of recursive invocation.

All of these are defined approximately 15min into https://vimeo.com/143848099

I cannot find any API info on eqc:no_recheck. The counterexample construct returns the failed counterexample as a value according to this API doc: http://quviq.com/documentation/eqc/eqc.html#counterexample-1

@jlouis can you enlighten us a bit on what's going on?

jmid commented 7 years ago

@c-cube The test of this false property succeeds with int_gen defined as small_nat, so I never get a counterexample to observe the shrinking behaviour :-o

c-cube commented 7 years ago

Oh, annoying :/ I'm skimming through Claessen's paper right now, I suppose an approach based on lazily mapping each input to a random value is better in the end. I'll try another API soon. More advice is always welcome.

jmid commented 7 years ago

If I lower the test count to 2(!) with int_gen bound to int, I get back the command prompt when both tests succeed. If one fails it actually doesn't hang. One run returned after 82 seconds:

# #use "fun.ml";;
val prop_map_filter : QCheck.Test.t = QCheck.Test.Test <abstr>
 generated; error;  fail; pass / total -     time -- test name
[✗] (   1)    0 ;    1 ;    0 /    2 --    82.7s -- map filter

--- Failure --------------------------------------------------------------------

Test map filter failed (24008238 shrink steps):

((if (? <= -151763671) 0 0), (if (? <= 122329608) false true), [122329608])
================================================================================
failure (1 tests failed, 0 tests errored, ran 1 tests)
- : int = 1

24008238 shrink steps! No wonder it took over a minute :-o Are there any low hanging fruits for improving the shrinking approach?

This situation echoes another experience of mine. For a program generator I wrote, I ended up instrumenting it to print out . for each successful property and x for failure. That way a string of ......... reveals that (slow) progress is still being made, while a mixed string of x...x.....x....x says that the framework is progressively shrinking. With no printing (as for the shrinking phase above) it is hard to know as a user, whether you've hit an infinite loop or whether things are progressing under the hood.

Finally, I still don't understand why the version with small_nat succeeds.

jlouis commented 7 years ago

Claessen's work was the original stuff for this indeed. Ulf's work works in a dynamically typed language, but perhaps the Haskell approach is closer to OCaml's solution. It depends a bit on how you can get at the types I think.

As for the example: we are playing Nim. The N parameter is how many tokens are left in the nim game. A strategy is a function from (N : int) -> (M : move), where the N is the number of tokens left and M is the move you want to make. prop_winning(N, Strat) defines a property which holds true if Strat is indeed a winning strategy for a game with N tokens at the start. We are searching for a counterexample to that particular winning strategy. ?WHENFAIL that is, when the property fails, we print out the function we just lost to.

is_winning(N, Strat) uses the same N and Strat. It tries 500 times to find a counterexample to prop_winning and if it can't we deem that we have a winning strategy. The counterexample/1 just returns a failure or true, so it works as you expect. no_recheck is an (undocumented) option. If memory serves, EQC rechecks a failed property by re-running the counterexample. This avoids that, so you avoid printing out some loses to more than once.

finally, prop_losing(N) claims that at N tokens left, no strategy is the winning one. That is, there is always another strategy which can beat it. That is, in the order of strategies, no strategy is greater than everyone else.

Indeed, Ulf is just using the QuickCheck system inside the QuickCheck system. It isn't too different from having an inner induction proof running while processing an outer induction proof. I'm not sure you need to "tie the knot" on the two things at the type level since the inner use is separable from the outer (I may be wrong however, have not given it much thought and the addition of types might prove problematic as the type becomes recursive, and thus requires some iso-recursive handling).

c-cube commented 7 years ago

Ok, so I think I get the gist of Claessen's paper. I'm not very happy with all the ideas in it, but nevertheless, here is a new design draft for (simpler) generation of functions:

The main idea is to define a function a -> b as a pair ('a, 'b) partial_fun * 'b, where the partial_fun is basically a set of cases (a Map or Hashtbl internally), + a default case. During the testing, we actually work with the map/hashtbl directly, building it as we go by generating a random value for every input the function is called on; then, we pick one of the values for the default case. For printing, we just print the table with a _ -> x for the default case. For shrinking, we remove a binding from the table by chance, or try to shrink the default value (?).

Practically, for the user, it should look like this:


module Observable : sig
  type 'a t  (* basically, a printer + eq + hash *)
  val int : int t
  val bool : bool t
  val make : ?eq:'a eq -> ?hash:'a hash -> 'a Print.t -> 'a t
end

(* for direct extraction of the function by pattern matching *)
type ('a, 'b) fun_ = Fun : ('a, 'b) secret_fun_state * ('a -> 'b) -> ('a, 'b) fun_

val apply : ('a, 'b) fun_ -> 'a -> 'b

val fun1 : 'a Observable.t -> 'b arbitrary -> ('a, 'b) fun_ arbitrary

c-cube commented 7 years ago

@jmid that simple property taking 24M steps of shrinking hints that the current approach is bad. We definitely need something that looks at the input it is given instead of guessing. @jlouis thanks for the explanations. This way of using quickcheck is quite impressive, and I would rather have expected it in a (lazy) smallcheck setting. I'll think about a val find_example : ?count:int -> ('a -> bool) -> 'a arbitrary -> 'a option to be used inside the properties.

jlouis commented 7 years ago

What I think is happening, statistically, is that we are essentially invoking the law of large numbers (LNN), or in french: Loi des grands nombres. As the number of test cases grow large, we tend to hit every possible case. And computers are getting really good at generating insane amounts of numbers quickly. This somewhat challenges our minds as we would tend to expect you need exhaustive testing in order to be sure you have hit everything. But in practice, since you are often rerunning yuur test cases over and over, exhaustion is of less importance as every possible path is likely to be run at some point.

The obvious counterexample is something like

match sum [x; y; z; a; b; c] with
| 37 -> true
| _ -> false

where x,y,c,a,b,c are integers. It is not very likely to be hit in randomized methods. You need either smallcheck or concolic testing (where you say "dear SMT solver, you can't find a situation where x + y + z + a + b + c = 37!" and it then comes up with a counterexample to that claim).

On Tue, May 9, 2017 at 4:50 PM Simon Cruanes notifications@github.com wrote:

@jmid https://github.com/jmid that simple property taking 24M steps of shrinking hints that the current approach is bad. We definitely need something that looks at the input it is given instead of guessing. @jlouis https://github.com/jlouis thanks for the explanations. This way of using quickcheck is quite impressive, and I would rather have expected it in a (lazy) smallcheck setting. I'll think about a val find_example : ?count:int -> ('a -> bool) -> 'a arbitrary -> 'a option to be used inside the properties.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/c-cube/qcheck/issues/8#issuecomment-300189478, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAWH-9nqLiYsoWztDK0HrwQQfZR_1loks5r4H0zgaJpZM4LVcrP .

c-cube commented 7 years ago

Indeed, qcheck has been very helpful for me so far. For counterexamples that are hard to find, I wrote https://github.com/c-cube/smbc/ last summer (for pure logic in proof assistants, though, it's not ready for OCaml at all).

jmid commented 7 years ago

@c-cube Sounds good, this also agrees with my understanding of Claessen's work. Koen's talk is available here: https://www.youtube.com/watch?v=CH8UQJiv9Q4 which explains/motivates the translation back and forth between the internal representations (if I recall correctly).

c-cube commented 7 years ago

I still don't really get the point of the translations…

jlouis commented 7 years ago

https://hackage.haskell.org/package/QuickCheck-2.9.2/docs/src/Test-QuickCheck-Function.html contains the Haskell code. It somewhat explains why you might want translations (called Map in the Haskell code). The purpose is to provide a way to embed already existing function generators into more general worlds to quickly ramp up the types for which your functions hold.

For example: Integral is the type class where you can do whole number division with remainders. Suppose we want a function, Integral a => a -> b whose domain is of the Integral a class without specifying what it is in more detail (since the type only requires an integral type). We can do that because we can Trans between Integer and Integral and thus use the integers embedded in the Integral class.

Likewise, because of the Show / Read instance pairs, if we have any serialization we can use this to embed such functions inside an instance (based on String deep down).

The idea is to get instances implicitly "for free". I'm not sure You'd need that in OCaml, where this kind of implicitness tend to be handled by functor applications in order to build/construct a type specimen matching the thing you want. I'm not too versed in qcheck to understand all the details of your implementation however.

c-cube commented 7 years ago

@jmid I just pushed a new version, based on finite tables (for partial functions, that can be shrinked and printed). The examples contain a few test cases now. It's basically:

let fun1 =
  QCheck.Test.make ~count:100
    ~name:"FAIL_pred_map_commute"
    QCheck.(triple
        (small_list small_int)
        (fun1 Observable.int int)
        (fun1 Observable.int bool))
    (fun (l,QCheck.Fun (_,f), QCheck.Fun (_,p)) ->
       List.filter p (List.map f l) = List.map f (List.filter p l))

let fun2 =
  QCheck.Test.make ~count:100
    ~name:"FAIL_fun2_pred_strings"
    QCheck.(fun1 Observable.string bool)
    (fun (QCheck.Fun (_,p)) ->
       not (p "some random string") || p "some other string")

It is also possible to build n-ary functions using @-> and @@->.

jmid commented 7 years ago

@c-cube Ok, this works nicely, thanks! I've just pushed the changes needed to the examples here: https://github.com/jmid/qcheck-fun

A few observations:

shrinking works much better now

some things could potentially be shrunk even further, e.g., the 7414 here:

 Test map filter failed (40 shrink steps):

 ({_ -> 0}, {7414 -> false; _ -> true}, [7414])

shrinking may occasionally take very long. In one particular run after 10 Ctrl-C and 20min I just got my command prompt back with this output:
```
Test fold_left fold_right failed (6190 shrink steps):
```

({ -> {93 -> 0; -> 93}}, 0, [0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0])



Looking at the code, it seems to me that `shrink_rep` in line 796 keeps a `Fun_append`, and thereby shrinks at the leaves. Is this correct? Is it possible to cut down such trees (and preferably put such a strategy first in the list of shrinking candidates) for a more effective overall shrinking strategy? If I understand the Haskell code correctly, its shrinking stategy attempts to replace sub-parts of the partial functions with `Nil` (the partial function defined nowhere) and collapse combinations of `Nil` into another `Nil`.

c-cube commented 7 years ago

Thanks for the feedback.

concerning the 7414 that could be shrunk, the problem is a kind of race condition between shrinking the list, and shrinking the function. If the function is shrunk first, it will become a singleton (as is the case here) and then, trying to reduce 7414 to 7413 will fail; we would probably need to shrink both the function's key and the list element simultaneously, and I have no idea how to do it.
if you really want to kill the process, use ctrl-z then kill %1 and fg, works every time (unlike ctrl-c) ;-)
the code of shrink_rep calls Poly_tbl.shrink, which will try two things (line 730). First, it will try removing an entry of the table (replacing a sub-tree with Nil); second, it will try to shrink an entry without removing it. The performance issues might also be related to the fact I use a hashtable with a lot of copies (suboptimal) instead of using a persistent structure. I could require a compare function instead and use a map, but such functions are usually more annoying to write…

If you have small test cases that exercise the library (such as the fold_left/fold_right one?), please consider contributing them in a PR! It's always nice to test the testing libraries ;-)

jmid commented 7 years ago

Ok, thanks. For large tables (and for "nested"/2-argument functions) like fold_left fold_right one-by-one entry shrinking sounds like a likely cause. I can see the Haskell implementation falls back on their list shrinking for their tables. Since QCheck already has an effective list shrinker, can't we do something similar?

You are more than welcome to take the tests and incorporate them in QCheck's own testsuite. That will be more effective than me figuring out where negative tests fit in and fighting git to make it happen... ;-)

c-cube commented 7 years ago

Maybe I can add https://github.com/jmid/qcheck-fun as a submodule…

The list shrinker would do the same, I believe. But give me a second, I'll try another strategy!

c-cube commented 7 years ago

Perhaps I should try to build a uncurried table, actually — that might work with GADTs. This way, the shrinking is done by removing one tuple, or by shrinking an entry.

jmid commented 7 years ago

Oh, I actually looked at QCheck's list shrinker before, and thought it did something more advanced, but it doesn't seem to:

# Iter.find (fun _ -> true) (Shrink.list [1;2;3;4;5]);;
- : int list option = Some [2; 3; 4; 5]

In comparison, e.g., the Haskell shrinker (defined as shrinkList here: https://hackage.haskell.org/package/QuickCheck-2.9.2/docs/src/Test-QuickCheck-Arbitrary.html ) will repeatedly try to remove a number of elements before proceeding to per-entry shrinking (but this topic is getting closer to a separate issue).

c-cube commented 7 years ago

Please open an issue for the better Shrink.list :-) I wrote the simplest version I could think of, but I suppose something based on divide-and-conquer (removing half the elements, or a quarter, etc.) would be faster in many cases…

jlouis commented 7 years ago

I've often used a free construction: A list is a binary tree which is appended together. It shrinks to a subtree, either the left or right one. By tracking where we glued together the list in the tree, we essentially have a free construction. But for a generic variant, I'd definitely go with something simpler. You can also go with a dropping variant which successively tries to drop more and more aggressively: if we can drop one element, lets try dropping 2,3,5,8,13,... or something such. The goal is to quickly remove noise, but a few initial probe attempts are not going to be a problem I think: Erlang is bytecode-interpreted and slow. But shrinking is still plenty fast. You have at least one order of magnitude of efficiency in a naive OCaml implementation.

On Wed, May 10, 2017 at 1:17 PM Simon Cruanes notifications@github.com wrote:

Please open an issue for the better Shrink.list :-) I wrote the simplest version I could think of, but I suppose something based on divide-and-conquer (removing half the elements, or a quarter, etc.) would be faster in many cases…

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/c-cube/qcheck/issues/8#issuecomment-300451691, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAWH3msi5ASrmSAXMi_4Q7j4FjP4k38ks5r4Zy2gaJpZM4LVcrP .

c-cube commented 7 years ago

Thanks @jlouis . I will probably imitate the code from Haskell's quickcheck but using arrays (mapping from/to lists).

c-cube commented 7 years ago

@jmid other remark: perhaps using uncurried functions (with the pair, triple etc. generators) yield better shrinking? That would be an interesting test to motivate the move towards a single table indexed by tuples of arguments under the hood.

c-cube commented 7 years ago

It seems that the uncurried version of fold_left fold_right has no shrinking issue. I think it's worth checking if a flat table performs better.

jmid commented 7 years ago

Yes, I believe I wrote both versions (curried and uncurried) in the example tests (following Claessen) for the fold_left fold_right case at least.

c-cube commented 7 years ago

My bad, I didn't see it. Anyway it shrinks faster…

c-cube commented 7 years ago

@jmid I have updated functions again, it seems to shrink faster in some cases (but fold_left fold_right still sometimes takes time). Can you please review the API? I'm going to try a more aggressive shrinker in the meantime.

jmid commented 7 years ago

I tried rerunning the function tests with the newest version. My impression is the same as yours: sometimes the fold_left fold_right still take a long time.

Regarding the API, I was trying to wrap my head around the Tuple interface by formulating a version of the above test that would use it. It seems to me that only unit Tuple.obs and unit Tuple.t can be created as "base cases", meaning that any immediate application of (@::) and (@->) will be limited to types of the form ('a * unit) obs (of course more application can ramp this up, but the innermost seems to be limited to unit in the right component). In order to apply Tuple.observable in the above case I was aiming for an (int * int) obs though. Perhaps I am just misunderstanding the Tuple API.

c-cube commented 7 years ago

I wonder if the example is slow because it uses the list generator, which can produce very long lists. Because in a tuple, the first element (here, a function) is shrinked first, qcheck might try to shrink the (large) function first, fail; then shrink the list one step successfully; then try again with the shorter list. This can cause a lot of unsuccessful shrinking.

About tuples, yeah, the base case should be a pair, not an empty tuple. I'll fix it and come back to you.

edit: actually I remember, the current construction with unit is nice because GADTs are not ambiguous. It's a classic construction for heterogeneous lists (see the code of fun3 and the likes to see how it works).

c-cube commented 7 years ago

Also note that the user should generally use fun2, fun3, fun4; tuples are really exposed for completeness…

c-cube commented 7 years ago

@jmid if you don't have objections, I am going to consider this issue as solved soon :-)

jmid commented 7 years ago

You are welcome to close it. I just reran my tests with functions put last in the tuple. There were still occasional long runs but it generally seems faster than I've experienced previously.

In these situations (with functions or long lists of large integers) it would be welcome with some feedback to an interactive user, e.g., just printing Found counterexample, shrinking or something like that. I'll open a new issue with this request.

The advise of putting functions last in a tuple (shrunk in lexicographic order, left-to-right) may be valuable to other users, so it would be great to put it in the documentation.

c-cube / qcheck

generation of functions #8