Size parameter for float / double instances doesn’t behave the same as in other fixed size types

cartazio commented 4 years ago

```haskell
-- | Generates a fractional number. The number can be positive or negative
-- and its maximum absolute value depends on the size parameter.
arbitrarySizedFractional :: Fractional a => Gen a
arbitrarySizedFractional =
  sized $ \n ->
    let n' = toInteger n in
      do b <- chooseInteger (1, precision)
         a <- chooseInteger ((-n') * b, n' * b)
         return (fromRational (a % b))
  where
    precision = 9999999999999 :: Integer
```

is used for Float and Double. I'm ignoring for now that the big Rational arithmetic going on here is a performance issue.

In other types like Int and Word, the size parameter essentially determines the number of distinct values that can be generated. I noticed this because some QuickCheck tests started failing for lists of Float / Double that were passing for lists of Int and Word. In my case this was because my properties require inputs that have at least one duplicate value in a list, which happens plenty for Int and Word as defined in the QuickCheck Gen instances, but never for Float and Double.

One approach that would avoid breaking any users who benefit from the current behavior, while making the generator more robust, would be to have the Gen instance use the current behavior with probability 1/2 and generate integral points within plus/minus the size range with probability 1/2.
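A minimal sketch of that 50/50 idea, assuming only the public `Test.QuickCheck` API (`oneof`, `sized`, `choose`, `arbitrarySizedFractional`). The name `mixedFractional` is hypothetical and not part of QuickCheck; this is an illustration of the proposal, not a suggested patch.

```haskell
import Test.QuickCheck

-- Hypothetical generator (not part of QuickCheck): picks one of two
-- strategies with equal probability.
mixedFractional :: Fractional a => Gen a
mixedFractional = oneof
  [ -- Current behavior: a rational a/b with a huge denominator range,
    -- so repeated values are extremely unlikely.
    arbitrarySizedFractional
  , -- Integral points in [-size, size], so the number of distinct
    -- values is bounded by the size parameter, as for Int and Word.
    sized $ \n ->
      let n' = toInteger n
      in fromInteger <$> choose (negate n', n')
  ]
```

With a generator like this, small sizes would produce lists of Double in which duplicates and zero show up regularly, while half of the values would still come from the existing distribution.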

This is also ignoring some opportunities to make the generator way more performant on float and double.

Since they might be interested: cc @Bodigrim @hvr @chessai @Shimuuar

This issue came up in the course of cleaning up the vector test suite and finding some issues with 1-3 tests :)

cartazio commented 4 years ago

To clarify: with the current Float / Double generator, you have essentially zero probability of ever generating duplicate values. Or of generating zero!

cartazio commented 4 years ago

Which means that in lots of contexts tests will kinda pass by accident if the bad cases require a zero or duplicate values.

phadej commented 4 years ago

Why would you use Double instead of Int for tests where duplicates are expected to trigger corner cases?

I do agree that Double and Float ought to have special instances. They are too different from Rational.

Lysxia commented 4 years ago

> Why would you use Double instead of Int for tests where duplicates are expected to trigger corner cases?

One use case is to test monomorphic functions that only operate on Double. Maybe you can often generalize to avoid the situation, but it doesn't seem outlandish that you might still end up in such a pinch.

phadej commented 4 years ago

@Lysxia and vector did that?

Lysxia commented 4 years ago

It was acknowledged that tests with both Int and Double were or are now present, so it didn't matter for vector, so I took your question instead to mean "why would anyone care that Double ever generates duplicates", and that's what I was answering. Sorry if I misunderstood the intention behind your question.

cartazio commented 4 years ago

In vector I was tightening up some code to have more precise properties, to make sure that my specification of `maxIndexBy` was correct.

The property being roughly `ls /= [] && uniq ls /= ls ==> V.maxIndexBy cmp (V.fromList ls) == maxIndexBy cmp ls`, where the right-hand `maxIndexBy` is a reference implementation on lists.

After doing this I saw that the test suite was failing to generate any lists of Float / Double with duplicate entries, while the Int / Word tests did!
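One way to observe this difference directly is to label generated lists by whether they contain a duplicate. The sketch below assumes only `classify` from `Test.QuickCheck` and `nub` from `Data.List`; the names `hasDuplicate` and `prop_duplicateRate` are made up for illustration, and `nub` stands in for the `uniq` used in the property above.

```haskell
import Data.List (nub)
import Test.QuickCheck

-- True when the list contains at least one repeated element.
-- `nub` removes repeats, so the result differs from the input
-- exactly when a repeat exists.
hasDuplicate :: Eq a => [a] -> Bool
hasDuplicate xs = nub xs /= xs

-- Always passes; it only labels each test case so that the QuickCheck
-- report shows what fraction of generated lists had a duplicate.
prop_duplicateRate :: Eq a => [a] -> Property
prop_duplicateRate xs = classify (hasDuplicate xs) "has duplicate" True
```

Running `quickCheck (prop_duplicateRate :: [Int] -> Property)` and then the same property at `[Double]` should make the gap visible in the reported percentages, which is a quick check on whether a precondition like `uniq ls /= ls` is ever satisfied.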

I absolutely agree that for normal polymorphic code Double isn’t especially important, EXCEPT to catch bugs in rewrite rules, which is what happened when adding it to the vector test suite.

Zooming out, I do think it’s important to make sure that the floating point generator includes a nontrivial probability of standard weird floating point values like plus / minus zero, plus some of the extreme small and big numbers?
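As a rough illustration of that suggestion, here is a sketch using `frequency` and `elements` from `Test.QuickCheck`. The name `doubleWithEdgeCases`, the 8:1:1 weights, and the particular edge values are all assumptions made for the example, not anything QuickCheck defines.

```haskell
import Test.QuickCheck

-- Hypothetical generator: mostly the stock Double generator, with a
-- nontrivial chance of signed zeros and extreme magnitudes.
-- The weights are arbitrary and chosen only for illustration.
doubleWithEdgeCases :: Gen Double
doubleWithEdgeCases = frequency
  [ (8, arbitrary)                                   -- existing behavior
  , (1, elements [0, -0.0])                          -- plus and minus zero
  , (1, elements [minPositive, -minPositive, maxFinite, -maxFinite])
  ]
  where
    minPositive = 5.0e-324               -- smallest positive subnormal Double
    maxFinite   = 1.7976931348623157e308 -- largest finite Double
```

It could be used in a property via `forAll doubleWithEdgeCases`, or wrapped in a newtype with its own `Arbitrary` instance so existing users of the `Double` instance are unaffected.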

nick8325 / quickcheck

Size parameter for float / double instances doesn’t behave the same as in other fixed size types #295