typelevel / scalacheck

Property-based testing for Scala
http://www.scalacheck.org
BSD 3-Clause "New" or "Revised" License

Generating a list of size N where each element is distinct. #352

Open wjlow opened 7 years ago

wjlow commented 7 years ago

I'm proposing something like:

def sizedDistinctListOf[T](size: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]]

In my own codebase, I've got something similar where T is constrained by the Eq typeclass from Cats, which lets me define what distinctness means. As far as I understand, ScalaCheck has no dependency on Cats, so the signature above takes a plain equality function instead.

For instance, you may have a case class Person(name: String, age: Int) and for a specific scenario, you might want to generate List[Person] where each element contains a different name and age pair, whereas for some other cases, you might want to generate List[Person] where each element has a different age.
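A minimal best-effort sketch of how such a function could behave, assuming a plain `(T, T) => Boolean` for equality. The names `DistinctGenSketch`, `distinctBy`, and `genPerson`, and the oversampling factor of 2, are illustrative assumptions; none of this is ScalaCheck API or the code from any PR.

```scala
import org.scalacheck.Gen

object DistinctGenSketch {
  // Hypothetical helper (not ScalaCheck API): keeps the first occurrence of
  // each element, treating two elements as duplicates when eq returns true.
  def distinctBy[T](xs: List[T])(eq: (T, T) => Boolean): List[T] =
    xs.foldLeft(List.empty[T]) { (acc, x) =>
      if (acc.exists(eq(_, x))) acc else x :: acc
    }.reverse

  // Best effort: oversample by an arbitrary factor of 2, drop duplicates,
  // and keep at most `size` elements. The result may be shorter than `size`.
  def sizedDistinctListOf[T](size: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]] =
    Gen.listOfN(size * 2, gen).map(xs => distinctBy(xs)(eq).take(size))

  final case class Person(name: String, age: Int)

  // Assumed generator for the Person example.
  val genPerson: Gen[Person] =
    for {
      name <- Gen.alphaStr
      age  <- Gen.choose(0, 120)
    } yield Person(name, age)

  // Distinct by age only.
  val distinctByAge: Gen[List[Person]] =
    sizedDistinctListOf[Person](5, genPerson, (a, b) => a.age == b.age)

  // Distinct by the full (name, age) pair.
  val distinctByNameAndAge: Gen[List[Person]] =
    sizedDistinctListOf[Person](5, genPerson, (a, b) => a == b)
}
```

With this shape, the caller changes what "distinct" means by swapping the equality function, without writing a new Gen.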

Happy to submit a PR if people find this useful.

ashawley commented 7 years ago

How about?

val genPersonSameName: Gen[Person] = arbitrary[Person].map(_.copy(name = "Same Name"))
val genPersonSameAge: Gen[Person]  = arbitrary[Person].map(_.copy(age  = 42))

Then you can generate distinct lists with Set and containerOfN:

val genPersonDiffName: Gen[List[Person]] = containerOfN[Set, Person](n, genPersonSameAge).map(_.toList)
val genPersonDiffAge: Gen[List[Person]]  = containerOfN[Set, Person](n, genPersonSameName).map(_.toList)

I don't know enough about cats to know if Eq cooperates well with scala collections.

ashawley commented 7 years ago

Looks like there is an effort to make cats work with scala collections, including Set:

https://github.com/non/alleycats

And also an initiative to build cats extensions to Scalacheck:

https://github.com/non/cats-check

wjlow commented 7 years ago

Thanks for the suggestion @ashawley. I feel like this approach isn't as obvious though. Can we provide a simple function that generates a collection of T provided a way to define distinctiveness?

Your approach works and I like how it works by composing Gens. However, I feel like it's easier to provide a function that determines equality, (A, A) => Boolean, as opposed to having to write a Gen for it instead.

Not sure if this makes sense.

morgen-peschke commented 6 years ago

I've found this sort of thing useful in the past, and have opened #394 with a potential implementation.

ashawley commented 6 years ago

It could be useful to have a simpler syntax for this kind of thing. Unfortunately, it is difficult to reliably generate distinct sets of values. This is why it was relaxed in #89:

https://github.com/rickynils/scalacheck/blob/0a2f1c5/src/main/scala/org/scalacheck/Gen.scala#L609

The property tests for containerOfN for Set and mapOfN are relaxed to be only _.size <= n rather than _.size == n:

https://github.com/rickynils/scalacheck/blob/0a2f1c5/jvm/src/test/scala/org/scalacheck/GenSpecification.scala#L172-L178
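A small sketch of why only the `<=` form can hold, assuming a deliberately narrow element generator. The names `SetSizeSketch`, `genSmallInts`, and `sizeAtMostN` are illustrative and do not come from GenSpecification.

```scala
import org.scalacheck.{Gen, Prop}

object SetSizeSketch {
  val n = 10

  // Only four possible values, so a Set built from them
  // can never reach n = 10 elements.
  val genSmallInts: Gen[Int] = Gen.choose(0, 3)

  // containerOfN draws n elements; duplicates collapse inside the Set.
  val genSet: Gen[Set[Int]] = Gen.containerOfN[Set, Int](n, genSmallInts)

  // The relaxed property: the size is bounded by n, not equal to it.
  val sizeAtMostN: Prop = Prop.forAll(genSet)(_.size <= n)
}
```

Here `_.size == n` could not pass, because duplicates drawn from the four-value range are merged by the Set.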

morgen-peschke commented 6 years ago

@ashawley - that's definitely an important point.

I think the main point of this feature request isn't to try to guarantee a particular size (though a best effort there is always appreciated), as much as it is to have some way of customizing what "distinct" means for a particular data set.

Best-effort or fail-fast semantics are nice alternatives to have, as they can be useful when tweaking our Gens.
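One possible reading of the fail-fast alternative, sketched under the assumption that "fail" means the generator produces no value instead of a shorter list. The names `FailFastSketch`, `allDistinct`, and `exactlyDistinct` are hypothetical and not part of ScalaCheck.

```scala
import org.scalacheck.Gen

object FailFastSketch {
  // Hypothetical helper: true when no two elements are equal under eq.
  def allDistinct[T](xs: List[T])(eq: (T, T) => Boolean): Boolean =
    xs match {
      case Nil    => true
      case h :: t => !t.exists(eq(h, _)) && allDistinct(t)(eq)
    }

  // Fail fast: generate exactly `size` elements and reject the whole list
  // if it contains a duplicate. A rejected list makes the generator yield
  // no value, which a property run counts as a discarded case.
  def exactlyDistinct[T](size: Int, gen: Gen[T], eq: (T, T) => Boolean): Gen[List[T]] =
    Gen.listOfN(size, gen).suchThat(xs => allDistinct(xs)(eq))
}
```

The trade-off is that a generator with too few distinct values produces many discarded cases, which surfaces the problem early instead of silently returning short lists.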

SethTisue commented 3 years ago

There is a PR at #394 that made it almost all the way through review, but the author doesn't currently have time to finish it off. Perhaps another volunteer would like to pick it up?