typelevel / scalacheck

Property-based testing for Scala
http://www.scalacheck.org
BSD 3-Clause "New" or "Revised" License
1.94k stars 404 forks source link

trees in scalacheck #305

Open mosesn opened 7 years ago

mosesn commented 7 years ago

Trees are a pain to generate with scalacheck because scalacheck makes them too big. This means that folks who want to experiment with large trees have to build an ad-hoc depth-limiting function into their tree generators. Scalacheck should provide tooling so that we don't have to constantly reimplement this.

A possible solution (and perhaps to illustrate my complaint better):

sealed trait Tree
case object Leaf extends Tree
case class Branch(left: Tree, right: Tree) extends Tree

def tree2[T](maxDepth: Int, g: Gen[(T, T) => T], leaves: Gen[T]): Gen[T] = {
  def go(depth: Int): Gen[T] = {
    if (maxDepth) == 0 leaves
    else {
      Gen.oneOf(
        for {
          fn <- g
          next = go(maxDepth - 1)
          left <- next
          right <- next
        } yield fn(left, right),
        leaves
      )
    }
  }
  recurse(maxDepth)
}

tree2(50, Gen.const((left, right) => Branch(left, right)), Gen.const(Leaf))
alexarchambault commented 7 years ago

FYI, scalacheck-shapeless allows to (almost) explicitly limit the size of the generated trees, since its version 1.1.4, like

import org.scalacheck._, Shapeless._

sealed trait Tree
case object Leaf extends Tree
case class Branch(left: Tree, right: Tree) extends Tree

implicit val treeRecursive = derive.Recursive[Tree](Gen.const(Leaf)) // argument is the generator used (at the leaves) if the generated tree gets too big

The size from the Gen.Parameters used during the generation then makes the tree generation stop after some depth.

Currently, a small glitch in the case class generation makes this not put a hard limit on the depth of the generated trees, but it's about to be fixed. (https://github.com/alexarchambault/scalacheck-shapeless/pull/52 fixes that)

mosesn commented 7 years ago

Very cool! Would love to be able to do it without adding a dependency on shapeless though.

alexarchambault commented 7 years ago

@mosesn Just do like scalacheck-shapeless does: use Gen.sized / Gen.resize, making sure the Tree arbitrary makes the size decrease.

mosesn commented 7 years ago

@alexarchambault how do I make the size decrease?

xuwei-k commented 7 years ago

in scalaz (Rose Tree)

mosesn commented 7 years ago

@alexarchambault I don't think using Gen.resize like you do in scalacheck-shapeless doesn't seem like it would make ad-hoc trees simpler.

Clearly this is a common use case, since there are at least two in finagle, there's one in scalaz, and one in scalacheck-shapeless. Maybe it's time for one of the generic implementations to graduate to scalacheck? What do you think, @rickynils @non?

oscar-stripe commented 7 years ago

actually, @tixxit even took a stab at generating DAGs in a generic way.

mosesn commented 7 years ago

@oscar-stripe is that open source? might it get into scalacheck?

mosesn commented 7 years ago

@rickynils have you had a chance to take a look at this? what do you think?

tixxit commented 7 years ago

@mosesn It is not open source (currently just a prototype in a gist somewhere). I'll take a stab at cleaning it up and "releasing" it this weekend though!

mosesn commented 7 years ago

@tixxit oh, cool! have you had a chance to open source it yet?

kevinmeredith commented 7 years ago

On the topic of generating test data for recursive data structures, this answer clarified how to use Gen.sized and Gen.resized:

http://stackoverflow.com/a/42855840/409976.

harpocrates commented 2 years ago

I ran into this and couldn't get scalacheck-shapeless to generate instances (likely due to the fact that the recursion was indirect, the ASTs were wide, and there was some mutual recursion).

Thinking about the problem, I realized that a big part of the issue is that the applicative combinators (mostly thinking of resultOf) pass through the same size to the generators for the subtrees. This tends not to matter too much if your types aren't too big, but it becomes a showstopper in the presence of recursion. Borrowing from the ideas of the SO question above, I came up with these additional combinators:

/** Split the current generator size into the specified number of sub-groups.
   *
   * The sum of the sizes should equal 1 less than the initial generator size. The length of the
   * list returned is equal to the requested number of groups.
   *
   * @param n how many sub-groups to split into?
   * @return size of sub-groups
   */
private[this] def partitionReducedSize(n: Int): Gen[Seq[Int]] =
   for {
     size <- Gen.size
     decrementedSize = size - 1
     if decrementedSize >= 0
     groupSize = decrementedSize / n
     remainder = decrementedSize % n
     groups = List.tabulate(n)(i => if (i < remainder) 1 + groupSize else groupSize)
     shuffledGroups <- Gen.pick(n, groups)
   } yield shuffledGroups.toList

def reducedResultOf[T1: Arbitrary, R](f: T1 => R): Gen[R] =
  for {
    Seq(s1) <- partitionSize(1)
    t1 <- Gen.resize(s1, arbitrary[T1])
  } yield f(t1)

def reducedResultOf[T1: Arbitrary, T2: Arbitrary, R](f: (T1, T2) => R): Gen[R] =
  for {
    Seq(s1, s2) <- partitionSize(2)
    t1 <- Gen.resize(s1, arbitrary[T1])
    t2 <- Gen.resize(s2, arbitrary[T2])
  } yield f(t1, t2)

def reducedResultOf[T1: Arbitrary, T2: Arbitrary, T3: Arbitrary, R](f: (T1, T2, T3) => R): Gen[R] =
  for {
    Seq(s1, s2, s3) <- partitionSize(3)
    t1 <- Gen.resize(s1, arbitrary[T1])
    t2 <- Gen.resize(s2, arbitrary[T2])
    t3 <- Gen.resize(s3, arbitrary[T3])
  } yield f(t1, t2, t3)

// and so on

With this, I was able to just use Gen.lzy + Gen.oneOf for sealed class/sealed trait and reducedResultOf for case class. The "size" of the generator bounds the total number of nodes in the generated AST. The only other gotcha I worked around is to add a biased variant of Gen.oneOf that short-circuits to a base case Gen if the size is too small (since otherwise the tree generator risks failing due to too many skipped test cases).

I do wish that utilities like these came with ScalaCheck (OTOH, I see there isn't much for this in Haskell's QuickCheck either...). If there is interest I can open a PR to upstream these.