braibant / articheck

14 stars 2 forks source link

Discuss the relation with Quickcheck #7

Closed braibant closed 10 years ago

braibant commented 10 years ago

Comment on the fact that what we generate is more or less a term of the AST of well-typed applications, which boils down to generation of GADTs, which is hard.

braibant commented 10 years ago

More about this: the "boils down to GADT generation" part is more tricky than expected.

Consider the (dummy) signature:

obs: t -> int
f: t -> v
g: v -> v
h: v -> t
i : t

It should be represented as a GADT with one sub-type per return type of functions (i.e. one for v, one for t, one for int) and with one constructor per function that returns this type. In particular, the type for int has two constructors, one for obs, one for user-defined integers.

Generating a term of this GADT would require the user to write it first...

gasche commented 10 years ago

Thinking about it more, I think one important difference with Quickcheck is that it piggybacks on Haskell's type classes to perform proof search (type class elaboration is prolog-style proof search). You example above could be expressed with the following Arbitrary instances:

instance Arbitrary Int where random () = Random.int 1000.
instance Arbitrary (t -> v) where random () = f
instance Arbitrary (v -> v) where random () = g
instance Arbitrary (v -> t) where random () = h
instance Arbitrary t where random () = i

instance (Arbitrary (a -> b), Arbitrary a) => Arbitrary b
  where random () = (random () :: a -> b) (random () :: a)

The interesting point is that the last instance cannot be / should not used be in Haskell, because (reasonable) Haskell derived instances are good at producing elements at complex types from elements at simple types, but not conversely -- allowing this in general quickly leads to termination and ambiguity issues for instance search.

On the other hand, we are implementing our own term search, instead of relying on built-in notions of term search of our host language, which means more work, but also more freedom to do things that are unrealistic in the general case of term search for program inference.

braibant commented 10 years ago

Here is a quote from the seminal QuickCheck paper:

The major limitation of Quickcheck is that there is no
measurement of test coverage: it is up to the user to in-
vestigate the distribution of test data and decide whether
suffciently many tests have been run. Although we provide
ways to collect this information, we cannot compel the pro-
grammer to use them. A programmer who does not risks
gaining a false sense of security from a large number of in-
adequate tests. Perhaps we could define adequacy measures
just on the generated test data, and thus warn the user at 
least in this kind of situation