Closed c-cube closed 9 years ago
This is a draft for now, I can rebase it or add stuff before merging if you consider merging it.
Update: -verbose
is implemented, but allowing the user to set the random seed doesn't look that easy.
I do consider merging it.
I am however very swamped right now -- too bad it's not during the summer, I'd have had time -- so I don't have much time to do any meaningful quality control.
Given I'm not active I'd be very glad to merge your improvements without reservation (seeing you as the active maintainer for qtest) -- so long, of course, as it doen't turn qtest into something alien and unrecognisable (that's what forks are for). I won't have time to do more than taking a very cursory look at the code, so it's on you not to break things. My main concern is remaining useful and hassle-free for Batteries.
I'll merge without any second thought anything that:
The stuff you did with ==> breaks backward compat., but I like it. I think => is unused, so perhaps you could use => instead. Then they can be exchanged later, when the minor is incremented. (Or the usual implication discarded, because it's not terribly useful -- not sure why I used it in the first place.)
Has anything else changed in the user interface that change the semantics of existing tests ? If not, with the corresponding doc and the above change, that looks eminently mergeable.
(And thanks for the good work !).
Excellent. I'm not done yet but I'm happy you are willing to merge non-alienating changes. I will try to get in touch with active batteries maintainers (@gasche for instance) to see how the changes affect them; maybe they will also have something to say about what they would like and what possibly broken features (==>
, hand-written generators) they might use.
I really want not to fork, that would be of no use, so let's reach an agreement! :-)
No problem; I'm very happy to see someone breathing new life in qtest.
To my knowledge Batteries does not rely much on advanced quickcheck feature (we do have about a hundred random tests, but they most just have a generator on the left and a predicate on the right). There is exactly one use of ==>
, to perform a case distinction:
(*$Q mid
(Q.pair Q.int Q.int) (fun (a,b) -> \
let m = mid a b in \
a <= b ==> (a <= m && m <= b && abs ((m-a) - (b-m)) <= 1) && \
b < a ==> (b <= m && m <= a && abs ((m-b) - (a-m)) <= 1))
*)
and this could easily be rewritten in another way.
I have always thought, however, that "inserting blobs of code for testing purposes" and "being a good {unit,random} test library" were separate purposes that could easily become separate small projects used together (just as we use ounit
for the nice-report-generation-whatever part). In fact when I wrote random-generator I also had in mind that a QuickCheck library could be build on top of it, with a nice separation of concepts between "generating random structures" and "orchestrating a reasonably safe coverage from a random generator".
So my two cents would be: it may be time to have quickcheck.{ml,mli}
become its own separate projects, that people that don't particularly care about extraction from comments could rally around and use. I wouldn't mind switching the Batteries random tests to that if it can provide buy-in and motivation -- and if it integrates nicely with iTeML, of course.
The real advantage of how qtest
does things, imho, is the automatic pairing of generators and printers. By just writing Q.(list (pair small_int bool))
, you combine both the random generation and the printer. Without typeclasses, otherwise, you would need to repeat the structure of the types 2, 3 or more times (there is the small
function for choosing small counter-examples, too — I have been combining this with the generators in this branch).
I think random-generator
is a good thing to have, and indeed it is compatible with the new design of Quickcheck
(see Quickcheck.make
to convert a 'a generator
into a 'a arbitrary
object); however, the point of my new design is that randomly generating values is not sufficient for realistic random testing. qtest
acknowledges that by pairing printers and generators, and I want to go further by adding shrinking and size functions.
Therefore, I propose that we keep qtest
as is (with its own basic combinators for most cases) but still allowing complicated generators to be written independently (using random-generator
for instance) and then lifted to 'a arbitrary
values.
Another point about splitting the Quickcheck
module from qtest
: there are already plenty of libraries on opam that provide random testing (qcheck, quickcheck, kaputt, as far as I know). Here, the whole benefit and specificity of qtest, imho, is that tests are very easy to write because they are in comments next to the tested code. So if anything, it would be better to remove Quickcheck
from qtest
and use an already separate random testing library instead. But I don't advocate it, it's better to interoperate through the Random.State.t -> 'a
type, which can be built different ways, including the case of random-generator
.
I think a good move would be to have a good printer-generator library, and then build a generate-and-print layer on top of the intersection of both libraries (which is not total, as generating is covariant and printing contravariant, so map
will need coercions both ways).
Seen another way, what random-gen
provides is a series of instance declarations for certain "random generation" type classes. We could/should have a library providing instances for "printing" type-classes, and surely this should be a separate classes. The fact that we have to manually define that pair those instances together is unfortunate, but it comes from the lack of implicit resolution and is orthogonal to the fact of actually implementing the domain-specific operations.
Also, the availability of structural types for asynchronous cooperation is not an excuse for abandoning the UNIX model, separation of concerns.
In principle I agree printing and random generation are separate, but here simplicity is important. If I need to write (list (pair int int))
3 times to have generation, printing, and, say, shrinking, I will use qtest
a lot less because of the inconvenience.
If some day OCaml gets a typeclass-like mechanism then I will be happy to propose the same changes as you. Until then I'd rather have something lightweight to use.
Of course I'm not suggesting that the user write it three times. The redundancy can be factored in a library. It seems that we're talking past each other, and in any case this had gone off-topic. Yes, do improve iTeML, and yes I'm ready to update Batteries random-tests if need be.
Indeed, let's do that :-)
Side note for @vincent-hugot : the last test, about even
and odd
in corner cases, takes forever to run because it calls the recursive functions even
and odd
on max_int
and max_int + 3
.
@vincent-hugot I'm thinking of rewriting the documentation in ocamldoc, with the following benefits:
Quickcheck
in particular)However it looks like it cannot generate a table of contents, so I'm not sure it's a good idea. Another thing I would like to do is the Quickcheck
equivalent of (*$R
(i.e., one test in the whole comment, without trailing '\'); this might be useful for Batteries too when random tests get heavier.
I'm not too hot for rewriting the doc from one format to another. I might want it in PDF also one day. LaTeX is fine.
, one test in the whole comment
Was going to comment, but I see it's already done in a sensible way. Goody.
Have not read the rest of discussion yet; busy.
Note that you can ask ocamldoc to generate LaTeX files from .mli
. Have a quicheck.odocl
file with just Quicheck
, and then:
ocamlbuild -docflag sepfiles quicheck.docdir/doc.tex
will generate in _build/quickcheck.docdir
an ocamldoc.sty
file with some LaTeX styles, and a Quicheck.tex
file ready for inclusion.
It is also possible to request HTML output from ocamldoc and integrate them with the hevea output, this is how the OCaml Manual is built.
It is not committed, since you don't want it, but I tried and converted the documentation into asciidoc (a markup language) as an experiment.
The source is here and the result is here. Note that github knows how to display the file properly.
Back to the patch itself: I'm almost done, I think — shrinking, raw random tests, cleanup of Quickcheck
, etc. I might still try to display the random seed to be able to define the seed to repeat a test.
Speaking of shrinking: currently if both shrink
and small
functions are provided (in the 'a arbitrary
record), then I combine the normal behavior (generate count
items and choose, among the counter-examples, the smallest one w.r.t. small
) with shrinking (each counterexample is shrinked before any comparison with other counterexamples using small
). I think it has the potential of finding small counterexamples in many cases, but it might be slow. What is your opinion on this?
edit: I can add a max_fail : int
optional parameter in case there are many counter-examples and they take time to shrink.
It's becoming easier to define one's own generators. For instance, in https://github.com/c-cube/ocaml-containers/blob/98511a5b8d92f9b4d8ff882654a48f826a51c412/src/sexp/CCSexpM.ml#L335 I define generation, printing and shrinking of S-expressions to check that Sexp.to_string s |> Sexp.of_string =
Ok s`. Shrinking works in practice in this case :-)
I tried and converted the documentation into asciidoc (a markup language) as an experiment.
Nice. A net gain is the free-of-charge syntactic colouring. The point about it being
simpler to compile and modify for a regular OCaml user
is also a good one. Especially since Github allows users to edit files directly and propose changes, wiki-like. Very convenient. It would be good to have everything in one place...
The only thing really missing from the asciidoc on github is the TOC... In the worst case it could be constructed manually (or through a script) in adoc, rather than generating and hosting HTML separately. At some point the page may become biiig, but that's not a showstopper.
Overall, I withdraw my objection regarding the migration of the documentation from LaTeX. If you're willing, go for it.
You'd have to merge with the contents of the current README, of course adding yourself as contributor and updating the 'history of the project' section; increment version from 2.0 to 2.1 in the title (and wherever else it may appear); also remove "VH and the Batteries team" as author (always thought that sounded like a band's name).
Ok, let's go for asciidoc. There is apparently a tweak for displaying the toc on github.
I can edit the readme and documentation right now.
edit: https://github.com/c-cube/iTeML/blob/af67006dd37ff4fc256589743763aeb731764480/qtest/doc/qtest.adoc the documentation rendered by github, with Toc!
I think this branch is more ready to be merged now. I've migrated many of my (random) tests to this branch in containers and it works well, I can define my own arbitrary instances manually in some cases without too much hassle.
I was thinking of merging the readme and the doc, rather than having them remain separate. Maybe I'll do that later. (after the merge)
I think this branch is more ready to be merged now.
Okay. I'll do some tests on my machine before merging -- I should probably have time this WE at the latest. If my computer doesn't explode, I'll merge.
I can merge the doc and the readme, no problem. Should I also remove the tex documentation?
Nah, I'll do the last adjustments to the doc and remove obsolete files after the merge.
Last item on my todo list was this: display the random seed to be able to replay failing tests.
./testfoo.sh
~/Desktop/iTeML/qtest/Σ 19:22 vhugot@Shiva-linux c-cube-better_quickcheck:♜⚡♦ ⌘
Target file: `footest.ml'. Extraction : `foo.ml'
Warning: likely qtest syntax error: `(* $T bar' at foo.ml:122. Done.
+ ocamlfind ocamlc -c -warn-error +26 -package oUnit,QTest2Lib -o footest.cmo footest.ml
File "foo.ml", line 165, characters 11-15:
Error: Unbound value lift
Hint: Did you mean list?
It's possible I've misconfigured something on my machine (I've just reinstalled opam etc); what is the expected output on your system ?
Ok, rebase done. The tests should compile, but still fail (or rather fail to terminate) on the small_int_corners
case (recursion of depth 2^62 to compute even/odd...).
I can do an interactive rebase if you wish to obtain a cleaner history.
interactive rebase
Thanks, but there's no need.
and the counter example is no longer displayed by default ?
Er, sorry, strike that, it does not display it on master either.
It's been so long I don't even remember what the output was supposed to look like before...
Ok, I will update the doc this evening.
Thanks. And no hurry, I'll probably not make a second pass until tomorrow evening.
Counter-examples are printed, for me, by default, btw.
Indeed. Nevermind that; I misremembered the semantics of simple tests and wanted to see an "expected: X but got: Y". This bug was purely in my brain.
I will document new flags (-verbose
, -list
, -seed
) in the doc, but they are part of the runtime so qtest -help
should not be about them. On the other hand, footest.native -help
prints runtime flags. I will also document shrinking.
Yes; the existence of runtime options, at least, should be documented. It's a nice new feature.
The convention in qtest proper is (what I believe to be) the GNU standard of --long-option; it would be best if the runtime followed the same, so -v | --verbose
, etc, vs single-dash.
count
and max_gen
interfere with one another:(*$Q even & ~count:max_int
(Q.small_int_corners ()) (fun n-> odd (abs n+3) = even (abs n))
*)
will only test 300 cases. Setting max_gen bypasses that, but it's undocumented. This is a significant behaviour change for random tests. To me, count should remain the number of randomly generated values for which the test is actually executed in the end. Other parameters might control how hard the generator works to generate each such value (eg. how many candidates it generates to find one that satisfies some precondition). At least that's how I intuitively expected things to work. Currently it seems that it works a bit differently.
Could you either tweak things so that count behaves that way, or explain why you prefer the current behaviour ?
if ==>
was undocumented before, then the change is retrocompatible (without ==>
, count = max_gen
anyway). Otherwise, ok, I will document both the combinator and the option max_gen
.
To me count
means intuitively "how many tests do I want", and max_gen
is "how much random generated terms am I ready to pay to get those tests"
I'd rather have max_gen
be a multiple of count
by some user-settable constant, but that's borderline bike-shedding and I can always tweak that later. The same goes for the remaining nitpicks I have for the doc.
I'll merge now. At some point I'll issue a v2.1 OPAM package.
Thanks again for those very neat additions. :+1:
Yay! \o/ Thanks for your time! :-)
First draft of how to improve the quickcheck part. In particular:
Gen
combinators)small
to the extended generator intsancelaws
andlaws_exn
if available==>
different in that a failed precondition will generate a new test case (up to a limit)Other things I want to do:
-verbose
mode that prints statistics on random tests (number of failed preconditions, etc.)Things I would like to do but probably won't: