vincent-hugot / qtest

Inline (Unit) Tests for OCaml
GNU General Public License v3.0
67 stars 8 forks source link

Better quickcheck #19

Closed c-cube closed 9 years ago

c-cube commented 9 years ago

First draft of how to improve the quickcheck part. In particular:

Other things I want to do:

Things I would like to do but probably won't:

c-cube commented 9 years ago

This is a draft for now, I can rebase it or add stuff before merging if you consider merging it.

c-cube commented 9 years ago

Update: -verbose is implemented, but allowing the user to set the random seed doesn't look that easy.

vincent-hugot commented 9 years ago

I do consider merging it.

I am however very swamped right now -- too bad it's not during the summer, I'd have had time -- so I don't have much time to do any meaningful quality control.

Given I'm not active I'd be very glad to merge your improvements without reservation (seeing you as the active maintainer for qtest) -- so long, of course, as it doen't turn qtest into something alien and unrecognisable (that's what forks are for). I won't have time to do more than taking a very cursory look at the code, so it's on you not to break things. My main concern is remaining useful and hassle-free for Batteries.

I'll merge without any second thought anything that:

The stuff you did with ==> breaks backward compat., but I like it. I think => is unused, so perhaps you could use => instead. Then they can be exchanged later, when the minor is incremented. (Or the usual implication discarded, because it's not terribly useful -- not sure why I used it in the first place.)

Has anything else changed in the user interface that change the semantics of existing tests ? If not, with the corresponding doc and the above change, that looks eminently mergeable.

(And thanks for the good work !).

c-cube commented 9 years ago

Excellent. I'm not done yet but I'm happy you are willing to merge non-alienating changes. I will try to get in touch with active batteries maintainers (@gasche for instance) to see how the changes affect them; maybe they will also have something to say about what they would like and what possibly broken features (==>, hand-written generators) they might use.

I really want not to fork, that would be of no use, so let's reach an agreement! :-)

vincent-hugot commented 9 years ago

No problem; I'm very happy to see someone breathing new life in qtest.

gasche commented 9 years ago

To my knowledge Batteries does not rely much on advanced quickcheck feature (we do have about a hundred random tests, but they most just have a generator on the left and a predicate on the right). There is exactly one use of ==>, to perform a case distinction:

(*$Q mid
  (Q.pair Q.int Q.int) (fun (a,b) -> \
    let m = mid a b in \
    a <= b ==> (a <= m && m <= b && abs ((m-a) - (b-m)) <= 1) && \
    b < a ==> (b <= m && m <= a && abs ((m-b) - (a-m)) <= 1))
*)

and this could easily be rewritten in another way.

I have always thought, however, that "inserting blobs of code for testing purposes" and "being a good {unit,random} test library" were separate purposes that could easily become separate small projects used together (just as we use ounit for the nice-report-generation-whatever part). In fact when I wrote random-generator I also had in mind that a QuickCheck library could be build on top of it, with a nice separation of concepts between "generating random structures" and "orchestrating a reasonably safe coverage from a random generator".

So my two cents would be: it may be time to have quickcheck.{ml,mli} become its own separate projects, that people that don't particularly care about extraction from comments could rally around and use. I wouldn't mind switching the Batteries random tests to that if it can provide buy-in and motivation -- and if it integrates nicely with iTeML, of course.

c-cube commented 9 years ago

The real advantage of how qtest does things, imho, is the automatic pairing of generators and printers. By just writing Q.(list (pair small_int bool)), you combine both the random generation and the printer. Without typeclasses, otherwise, you would need to repeat the structure of the types 2, 3 or more times (there is the small function for choosing small counter-examples, too — I have been combining this with the generators in this branch).

I think random-generator is a good thing to have, and indeed it is compatible with the new design of Quickcheck (see Quickcheck.make to convert a 'a generator into a 'a arbitrary object); however, the point of my new design is that randomly generating values is not sufficient for realistic random testing. qtest acknowledges that by pairing printers and generators, and I want to go further by adding shrinking and size functions.

Therefore, I propose that we keep qtest as is (with its own basic combinators for most cases) but still allowing complicated generators to be written independently (using random-generator for instance) and then lifted to 'a arbitrary values.

c-cube commented 9 years ago

Another point about splitting the Quickcheck module from qtest: there are already plenty of libraries on opam that provide random testing (qcheck, quickcheck, kaputt, as far as I know). Here, the whole benefit and specificity of qtest, imho, is that tests are very easy to write because they are in comments next to the tested code. So if anything, it would be better to remove Quickcheck from qtest and use an already separate random testing library instead. But I don't advocate it, it's better to interoperate through the Random.State.t -> 'a type, which can be built different ways, including the case of random-generator.

gasche commented 9 years ago

I think a good move would be to have a good printer-generator library, and then build a generate-and-print layer on top of the intersection of both libraries (which is not total, as generating is covariant and printing contravariant, so map will need coercions both ways).

gasche commented 9 years ago

Seen another way, what random-gen provides is a series of instance declarations for certain "random generation" type classes. We could/should have a library providing instances for "printing" type-classes, and surely this should be a separate classes. The fact that we have to manually define that pair those instances together is unfortunate, but it comes from the lack of implicit resolution and is orthogonal to the fact of actually implementing the domain-specific operations.

Also, the availability of structural types for asynchronous cooperation is not an excuse for abandoning the UNIX model, separation of concerns.

c-cube commented 9 years ago

In principle I agree printing and random generation are separate, but here simplicity is important. If I need to write (list (pair int int)) 3 times to have generation, printing, and, say, shrinking, I will use qtest a lot less because of the inconvenience.

If some day OCaml gets a typeclass-like mechanism then I will be happy to propose the same changes as you. Until then I'd rather have something lightweight to use.

gasche commented 9 years ago

Of course I'm not suggesting that the user write it three times. The redundancy can be factored in a library. It seems that we're talking past each other, and in any case this had gone off-topic. Yes, do improve iTeML, and yes I'm ready to update Batteries random-tests if need be.

c-cube commented 9 years ago

Indeed, let's do that :-)

Side note for @vincent-hugot : the last test, about even and odd in corner cases, takes forever to run because it calls the recursive functions even and odd on max_int and max_int + 3.

c-cube commented 9 years ago

@vincent-hugot I'm thinking of rewriting the documentation in ocamldoc, with the following benefits:

However it looks like it cannot generate a table of contents, so I'm not sure it's a good idea. Another thing I would like to do is the Quickcheck equivalent of (*$R (i.e., one test in the whole comment, without trailing '\'); this might be useful for Batteries too when random tests get heavier.

vincent-hugot commented 9 years ago

I'm not too hot for rewriting the doc from one format to another. I might want it in PDF also one day. LaTeX is fine.

, one test in the whole comment

Was going to comment, but I see it's already done in a sensible way. Goody.

Have not read the rest of discussion yet; busy.

gasche commented 9 years ago

Note that you can ask ocamldoc to generate LaTeX files from .mli. Have a quicheck.odocl file with just Quicheck, and then:

ocamlbuild -docflag sepfiles quicheck.docdir/doc.tex

will generate in _build/quickcheck.docdir an ocamldoc.sty file with some LaTeX styles, and a Quicheck.tex file ready for inclusion.

It is also possible to request HTML output from ocamldoc and integrate them with the hevea output, this is how the OCaml Manual is built.

c-cube commented 9 years ago

It is not committed, since you don't want it, but I tried and converted the documentation into asciidoc (a markup language) as an experiment.

The source is here and the result is here. Note that github knows how to display the file properly.

Back to the patch itself: I'm almost done, I think — shrinking, raw random tests, cleanup of Quickcheck, etc. I might still try to display the random seed to be able to define the seed to repeat a test.

c-cube commented 9 years ago

Speaking of shrinking: currently if both shrink and small functions are provided (in the 'a arbitrary record), then I combine the normal behavior (generate count items and choose, among the counter-examples, the smallest one w.r.t. small) with shrinking (each counterexample is shrinked before any comparison with other counterexamples using small). I think it has the potential of finding small counterexamples in many cases, but it might be slow. What is your opinion on this?

edit: I can add a max_fail : int optional parameter in case there are many counter-examples and they take time to shrink.

c-cube commented 9 years ago

It's becoming easier to define one's own generators. For instance, in https://github.com/c-cube/ocaml-containers/blob/98511a5b8d92f9b4d8ff882654a48f826a51c412/src/sexp/CCSexpM.ml#L335 I define generation, printing and shrinking of S-expressions to check that Sexp.to_string s |> Sexp.of_string = Ok s`. Shrinking works in practice in this case :-)

vincent-hugot commented 9 years ago

I tried and converted the documentation into asciidoc (a markup language) as an experiment.

Nice. A net gain is the free-of-charge syntactic colouring. The point about it being

simpler to compile and modify for a regular OCaml user

is also a good one. Especially since Github allows users to edit files directly and propose changes, wiki-like. Very convenient. It would be good to have everything in one place...

The only thing really missing from the asciidoc on github is the TOC... In the worst case it could be constructed manually (or through a script) in adoc, rather than generating and hosting HTML separately. At some point the page may become biiig, but that's not a showstopper.

Overall, I withdraw my objection regarding the migration of the documentation from LaTeX. If you're willing, go for it.

You'd have to merge with the contents of the current README, of course adding yourself as contributor and updating the 'history of the project' section; increment version from 2.0 to 2.1 in the title (and wherever else it may appear); also remove "VH and the Batteries team" as author (always thought that sounded like a band's name).

c-cube commented 9 years ago

Ok, let's go for asciidoc. There is apparently a tweak for displaying the toc on github.

I can edit the readme and documentation right now.

edit: https://github.com/c-cube/iTeML/blob/af67006dd37ff4fc256589743763aeb731764480/qtest/doc/qtest.adoc the documentation rendered by github, with Toc!

c-cube commented 9 years ago

I think this branch is more ready to be merged now. I've migrated many of my (random) tests to this branch in containers and it works well, I can define my own arbitrary instances manually in some cases without too much hassle.

vincent-hugot commented 9 years ago

I was thinking of merging the readme and the doc, rather than having them remain separate. Maybe I'll do that later. (after the merge)

I think this branch is more ready to be merged now.

Okay. I'll do some tests on my machine before merging -- I should probably have time this WE at the latest. If my computer doesn't explode, I'll merge.

c-cube commented 9 years ago

I can merge the doc and the readme, no problem. Should I also remove the tex documentation?

vincent-hugot commented 9 years ago

Nah, I'll do the last adjustments to the doc and remove obsolete files after the merge.

c-cube commented 9 years ago

Last item on my todo list was this: display the random seed to be able to replay failing tests.

vincent-hugot commented 9 years ago
./testfoo.sh
                                                                                                                                                                          ~/Desktop/iTeML/qtest/Σ 19:22 vhugot@Shiva-linux c-cube-better_quickcheck:♜⚡♦ ⌘
Target file: `footest.ml'. Extraction : `foo.ml' 
Warning: likely qtest syntax error: `(* $T bar' at foo.ml:122. Done.
+ ocamlfind ocamlc -c -warn-error +26 -package oUnit,QTest2Lib -o footest.cmo footest.ml
File "foo.ml", line 165, characters 11-15:
Error: Unbound value lift
Hint: Did you mean list?

It's possible I've misconfigured something on my machine (I've just reinstalled opam etc); what is the expected output on your system ?

c-cube commented 9 years ago

Ok, rebase done. The tests should compile, but still fail (or rather fail to terminate) on the small_int_corners case (recursion of depth 2^62 to compute even/odd...).

c-cube commented 9 years ago

I can do an interactive rebase if you wish to obtain a cleaner history.

vincent-hugot commented 9 years ago

interactive rebase

Thanks, but there's no need.

vincent-hugot commented 9 years ago
vincent-hugot commented 9 years ago

and the counter example is no longer displayed by default ?

Er, sorry, strike that, it does not display it on master either.

It's been so long I don't even remember what the output was supposed to look like before...

c-cube commented 9 years ago

Ok, I will update the doc this evening.

vincent-hugot commented 9 years ago

Thanks. And no hurry, I'll probably not make a second pass until tomorrow evening.

c-cube commented 9 years ago

Counter-examples are printed, for me, by default, btw.

vincent-hugot commented 9 years ago

Indeed. Nevermind that; I misremembered the semantics of simple tests and wanted to see an "expected: X but got: Y". This bug was purely in my brain.

c-cube commented 9 years ago

I will document new flags (-verbose, -list, -seed) in the doc, but they are part of the runtime so qtest -help should not be about them. On the other hand, footest.native -help prints runtime flags. I will also document shrinking.

vincent-hugot commented 9 years ago

Yes; the existence of runtime options, at least, should be documented. It's a nice new feature.

The convention in qtest proper is (what I believe to be) the GNU standard of --long-option; it would be best if the runtime followed the same, so -v | --verbose, etc, vs single-dash.

vincent-hugot commented 9 years ago
(*$Q even & ~count:max_int
  (Q.small_int_corners ()) (fun n-> odd (abs n+3) = even (abs n))
*)

will only test 300 cases. Setting max_gen bypasses that, but it's undocumented. This is a significant behaviour change for random tests. To me, count should remain the number of randomly generated values for which the test is actually executed in the end. Other parameters might control how hard the generator works to generate each such value (eg. how many candidates it generates to find one that satisfies some precondition). At least that's how I intuitively expected things to work. Currently it seems that it works a bit differently.

Could you either tweak things so that count behaves that way, or explain why you prefer the current behaviour ?

c-cube commented 9 years ago

if ==> was undocumented before, then the change is retrocompatible (without ==>, count = max_gen anyway). Otherwise, ok, I will document both the combinator and the option max_gen.

To me count means intuitively "how many tests do I want", and max_gen is "how much random generated terms am I ready to pay to get those tests"

vincent-hugot commented 9 years ago

I'd rather have max_gen be a multiple of count by some user-settable constant, but that's borderline bike-shedding and I can always tweak that later. The same goes for the remaining nitpicks I have for the doc.

I'll merge now. At some point I'll issue a v2.1 OPAM package.

Thanks again for those very neat additions. :+1:

c-cube commented 9 years ago

Yay! \o/ Thanks for your time! :-)