nick8325 / quickcheck

Automatic testing of Haskell programs.
Other
724 stars 121 forks source link

Extract Arbitrary to a dedicated library #209

Open nikita-volkov opened 6 years ago

nikita-volkov commented 6 years ago

Arbitrary is useful on its own outside of the scope of QuickCheck. However the fact that it comes bundled with a testing library makes package authors way less inclined to provide the instance in their types. As the result it both hurts the usability of QuickCheck and fails to solve other problems.

Lysxia commented 6 years ago

158 seems relevant.

nikita-volkov commented 6 years ago

While it is relevant, the issue has been closed with resolution that the deps of QuickCheck have been reduced. What I'm focusing on here is that Arbitrary is useful enough as an isolated abstraction and does not necessarily have to be used only for testing purposes.

nick8325 commented 6 years ago

What is it that currently stops such packages from depending on QuickCheck? Is it simply a cultural thing? AFAICS any arbitrary package would have the same dependencies as QuickCheck, and QuickCheck itself only takes a few seconds to compile.

Do you have any particular uses of Arbitrary in mind?

leftaroundabout commented 6 years ago

@nick8325 it's true that QuickCheck has very modest dependencies, but I still wouldn't call it a really minimalistic package. It may be a bit silly to not want to depend on it, OTOH I'd say build times do matter and it makes sense to avoid dependencies that aren't really needed even if it only saves a few seconds.

An arbitrary package that basically only defines the Arbitrary class and the very most basic instances (Int, Integer, Double, Bool, [], Maybe, Char I'd say) would for sure be much lighter yet, and it would depend only on tf-random.

phadej commented 6 years ago

It saves you a second, but adds hours of work for @nick8325 (EDIT managing two packages is more work, note: arbitrary would need to package Gen and have dependency on tf-random). Is it fair trade-off?

nikita-volkov commented 6 years ago

@phadej Yes, evolving packages takes time. Are you saying we should stop doing that?

@nick8325 If it's the lack of time that stops us from implementing it, can we at least confirm that if someone decides to fork "QuickCheck", turning it into "arbitrary", and provides a PR to "QuickCheck" that migrates it to that library, that such PR will get merged? I'm asking because it might as well be me, if I find the time.

phadej commented 6 years ago

I personally don't see a benefit of splitting QuickCheck even further.

As a maintainer of these I have no problem providing instances there (given it's already heavy on dependencies already).

I'm not a maintainer of QuickCheck so take this with a grain of salt: @leftaroundabout or @nikita-volkov or anyone else are free to make a stack or cabal new build project to demonstrate that new arbitrary is really noticeably faster to compile than current QuickCheck, and that arbitrary + QuickCheck isn't significantly slower to compile either.

For the refresher: this is the dependency graph of QuickCheck on GHC-8.4.1, without boot libs (bundled with GHC):

deps

% time cabal new-build
Resolving dependencies...
Build profile: -w ghc-8.2.2 -O1
In order, the following will be built (use -v for more details):
 - QuickCheck-2.11.3 (lib) (first run)
Configuring library for QuickCheck-2.11.3..
Preprocessing library for QuickCheck-2.11.3..
Building library for QuickCheck-2.11.3..
...
cabal new-build  25,78s user 1,14s system 97% cpu 27,470 total

I don't have per-module stats, but Test.QuickCheck.Arbitrary is one of the slowest modules to compile.

nikita-volkov commented 6 years ago

You're missing my point:

Arbitrary is useful on its own outside of the scope of QuickCheck. However the fact that it comes bundled with a testing library makes package authors way less inclined to provide the instance in their types. As the result it both hurts the usability of QuickCheck and fails to solve other problems.

Packages are meant to have a purpose (or a subject). The subject of "QuickCheck" is property-testing. The subject of "arbitrary" would be a general interface for random values. IOW, not only are the subjects different, but the latter is also more general.

If we were to base the decision of package contents based on the build times only, then your argument would also be just as applicable to say having HashSet be distributed as part of "QuickCheck" as well.

phadej commented 6 years ago

What's is the scope outside of QuickCheck = property based testing?

Generating random values? We have http://hackage.haskell.org/package/random-1.1/docs/System-Random.html#t:Random (you may argue that that Random class is not the best possible, but that debate doesn't belong here).

What's is the intended use for Arbitrary, which isn't testing?

leftaroundabout commented 6 years ago

One application I could see outside testing is showing examples of how a function behaves, for automated documentation generation etc..

joshuatshaffer commented 6 years ago

I do not think that it is wise to extract Arbitrary it self into a library. Arbitrary is concerned with both random generation and shrinking. Shrinking does not seem like it would be useful outside of testing. However, it is a good idea to extract Gen into a dedicated library. I think this is what @nikita-volkov meant.

Having a powerful random generator library that also happens to be compatible with QuickCheck could be very valuable. This would allow developers to use the same EDSL to write both general purpose random generators and QuickCheck test case generators. This would be helpful to users and help grow QuickCheck's user base.

For example: Bob is writing a video game with a randomly generated map. Bob uses a library to make a random map generator. After a while Bob decides that he needs to test his code.

prop_theGoalIsReachable =
    forAll mapGen theGoalIsReachable

If the random map generator Bob made happened to be the same Gen that QuickCheck uses it would be fairly easy to write this property. All that is needed is to define the function theGoalIsReachable :: GameMap -> Bool. This would encourage Bob to use QuickCheck and it would make his job a lot easier.

On the other hand, if QuickCheck's Gen was not in a separate library, no one would have the foresight to use it. At first Bob was only writing a random generator. Why would he use a testing library for that? Now that Bob is testing he must find a way of wrapping the map generator in a QuickCheck generator, which could be a royal pain.

Soupstraw commented 3 years ago

I also really like the idea of separating out the Gen class. What I really don't like about the Random class in the base library is that the user has to carry around the generator state manually. Here is the signature of the base library random function:

random :: RandomGen g => g -> (a, g) 

I really like how Gen has generalized this to a monad. In addition to that, Gen comes with some really neat combinators.

leftaroundabout commented 3 years ago

@Soupstraw that's not really an argument, because there are other dedicated libraries that offer random monads. Check out random-fu.

brandonchinn178 commented 3 years ago

Continuing off @joshuatshaffer's comment, I think breaking out Gen and related helpers (e.g. suchThat) would be an excellent addition to the Haskell ecosystem as a whole. IMO Gen has more powerful primitives than random and it would be great to expose Gen as a first-class way of generating random things.

It would also be nice to break out Arbitrary into this separate package as well, because then you get a whole lot of generators/combinators for free (you want a random list of even integers? arbitrary 'suchThat' ((== 0) . ('mod' 2))). I think the comment about shrinking not being relevant outside of testing is important, though, to which I'd suggest breaking up Arbitrary(arbitrary, shrink) into Arbitrary(arbitrary) (to break out with Gen) and Shrinkable(shrink) (to keep in QuickCheck) with Arbitrary as a superclass of Shrinkable.

leftaroundabout commented 3 years ago

@brandonchinn178 I don't think it's a good idea to use arbitrary `suchThat` ... for generating random values in general code. This is inefficient and you don't really have any control about what distribution the random numbers come in. Fine for testing, but for most other applications I could think of this sounds problematic. Again believe you should rather consider a more powerful random package like random-fu

brandonchinn178 commented 3 years ago

The arbitrary `suchThat` ... snippet was just an example, it's not relevant to my core argument. In @joshuatshaffer's genMap example, Bob could implement genMap with all of the powerful combinators that use Gen/Arbitrary.