Closed phadej closed 4 years ago
This is WIP. I noticed that arbitrarySizedBoundedIntegral is not uniform, and I have to think about how to make generation of Word64 etc. fast while taking that into account too.
(Coincidentally, because of the non-uniform generation, test-quickcheck-generators runs faster without this change.)
EDIT: so now this only changes how Char is generated.
arbitraryWord64 is not exported; it generates a uniform Word64. I think we should export it (and remember to add an @since annotation).
ping
Those are nice speedups!
I had a go at making a wrapper around bitmaskWithRejection64, so that this can be used as a replacement for choose. I ended up with this:
    chooseInt :: (Int, Int) -> Gen Int
    chooseInt (lo, hi) = do
      w <- chooseUpTo (fromIntegral hi - fromIntegral lo)
      return (fromIntegral (w + fromIntegral lo))

    chooseUpTo :: Word64 -> Gen Word64
    chooseUpTo n =
      MkGen $ \(QCGen g) _ ->
        -- We need to special-case n == maxBound, but may as well do it for all powers of two
        if n .&. (n+1) == 0 && n /= 0 then
          fst (nextWord64 g) .&. n
        else
          fst (bitmaskWithRejection64 (n+1) g)
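The arithmetic in chooseInt relies on Word64 wrapping modulo 2^64. The following is only a sketch of that step in isolation; the names width and fromOffset are hypothetical helpers introduced here for illustration, and a 64-bit Int is assumed.

```haskell
import Data.Word (Word64)

-- Number of steps from lo to hi, computed modulo 2^64.
-- Assumes Int is 64 bits wide, so negative bounds wrap consistently.
width :: Int -> Int -> Word64
width lo hi = fromIntegral hi - fromIntegral lo

-- Map an offset in [0, width lo hi] back into [lo, hi].
fromOffset :: Int -> Word64 -> Int
fromOffset lo w = fromIntegral (w + fromIntegral lo)

main :: IO ()
main = do
  print (width (-5) 10)
  print (width minBound maxBound == maxBound)
  print (fromOffset (-5) 15)
```

Because both conversions wrap the same way, the full range (minBound, maxBound) yields a width of maxBound :: Word64, which is the case the comment in chooseUpTo special-cases.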
When I use the previous definitions of arbitrary{Unicode,ASCII}Char but with chooseInt instead of choose, it seems to be a tiny bit faster than your patch. So it seems like a good idea to add something like this to QuickCheck.
However, there is a whole load of code out there that currently calls choose. It would be nice if it could get the benefits of this speed-up too, but I can't think of a good way to do it (without breaking lots of existing code). Any ideas?
@nick8325 I see. Looks like I have a bug in bitmaskWithRejection64' which affects performance.
EDIT: no, I don't. At least not the one I was thinking of.
@nick8325 is your chooseUpTo faster than bitmaskWithRejection64' (with a prime, which picks from a closed interval)?
If so, I'd rather use that in splitmix.
I suspect that in the case of arbitraryASCIIChar the resulting GHC Core should be very close, though bitmaskWithRejection64' would do an extra > comparison (to reject overflows, which is never true).
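To make the point about the extra comparison concrete, here is a minimal pure sketch of bitmask-with-rejection. It is not the splitmix implementation: the function name maskedReject is hypothetical, and a plain list of words stands in for a real generator.

```haskell
import Data.Bits (complement, countLeadingZeros, shiftR, zeroBits, (.&.), (.|.))
import Data.Word (Word64)

-- Pick a value in the closed interval [0, n] from a stream of uniform words.
-- The list is an assumption for illustration; a real generator would be
-- threaded through instead.
maskedReject :: Word64 -> [Word64] -> Maybe Word64
maskedReject n ws =
  case filter (<= n) (map (.&. mask) ws) of
    (x:_) -> Just x
    []    -> Nothing
  where
    -- Smallest all-ones mask that covers n.
    mask = complement zeroBits `shiftR` countLeadingZeros (n .|. 1)

main :: IO ()
main = do
  print (maskedReject 5 [7, 14, 3])
  print (maskedReject 127 [200])
```

With n = 127 (the ASCII case) the mask equals n, so every masked word already lies in range and the `<= n` test never rejects anything; with n = 5 the mask is 7 and some words are discarded.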
About choose: there seems to be some work to make random not cripple underlying generators in https://github.com/idontgetoutmuch/random/pull/1, so I wouldn't try to do anything like that in QuickCheck.
Something that would benefit from a chooseInt-like function is arbitrarySizedBoundedIntegral, since generating bare numbers is common. I didn't manage to get it right when I tried, though. (There could be a variant of arbitrarySizedBoundedIntegral for when we know that the range is exactly 2 ^ n.)
Ah, I hadn't noticed bitmaskWithRejection64' before (I was looking at an old version of splitmix, apparently). Things are nice and fast when I just use that directly.
But there is a performance bug in the unprimed bitmaskWithRejection64: when you call it with a range that is a power of two, the mask is 1 bit longer than what's actually needed. For example, bitmaskWithRejection64 1024 uses mask = 2047 when mask = 1023 would suffice. So half the words generated would be rejected. This could certainly explain the performance difference.
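The mask sizes above can be reproduced with a small sketch. It assumes the mask is the smallest all-ones value covering its argument, computed via countLeadingZeros; maskFor is a hypothetical helper, not a splitmix function.

```haskell
import Data.Bits (complement, countLeadingZeros, shiftR, zeroBits, (.|.))
import Data.Word (Word64)

-- Smallest all-ones mask that covers n (assumed construction).
maskFor :: Word64 -> Word64
maskFor n = complement zeroBits `shiftR` countLeadingZeros (n .|. 1)

main :: IO ()
main = do
  print (maskFor 1024)        -- mask derived from the range itself: 2047
  print (maskFor (1024 - 1))  -- mask derived from the largest value: 1023
```

Deriving the mask from range - 1 (the largest value that can be returned) rather than from range removes the extra bit for power-of-two ranges.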
I also tried implementing chooseInteger :: (Integer, Integer) -> Gen Integer, which uses bitmaskWithRejection64 if the arguments are within the range of a Word64 and falls back to choose otherwise. This is slightly slower than chooseInt, but not by much (rough figures from a benchmark I ran: 15s using chooseInt, 16s using chooseInteger, 30s using choose).
I think a simple way we could do this is to add chooseInt and chooseInteger, and just go through QuickCheck and replace each call to choose with one of them where possible. That should also handle arbitrarySizedBoundedIntegral. Hopefully, this will even become unnecessary once the improvements to random get released.
About handling power-of-two ranges specially: I originally marked chooseUpTo INLINE for that reason (if the range is a power-of-two constant then the inlined code simplifies to a call to nextWord64 and a bitwise AND), but it didn't seem to help performance. If we statically know that the range is a power of two then we can potentially save a comparison and a call to countLeadingZeros, but perhaps this isn't very significant. We can measure, of course.
I just pushed a patch that adds chooseInt, chooseInteger, chooseBoundedIntegral and chooseEnum functions, which have the same API as choose but use bitmaskWithRejection64' if possible (that is, if the lower and upper bounds of the range fit inside an Int64). I replaced all calls to choose inside QuickCheck with one of these functions.
On my machine, the time taken to generate 100000 random strings goes down from 56s to 11s! Also, the tests in the 'examples' directory run twice as fast, perhaps because oneof now uses chooseInt instead of choose.
The definition of arbitrarySizedBoundedIntegral still goes through Integer, so there may be room for speedups here. Also, generating a uniform Word64 will currently invoke choose, because it doesn't fit inside Int64.
Could you perhaps try your benchmarks with the new patch and report back?
Oh, looks like I broke something. Hold on...
Int64 and Word64 are exactly the same bitwise. I don't understand this:
"Also, generating a uniform Word64 will currently invoke choose, because it doesn't fit inside Int64."
chooseBoundedIntegral (lo, hi) checks that both lo and hi are in the range of an Int64 (and otherwise reverts to calling choose). So chooseBoundedIntegral (minBound, maxBound) :: Gen Word64 will fail that check because maxBound :: Word64 is outside the range of an Int64.
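A range check of the kind described could look roughly like the sketch below. This is not QuickCheck's actual code; inInt64Range is a hypothetical name, and the check is done on Integer so that it cannot itself overflow.

```haskell
import Data.Int (Int64)
import Data.Word (Word64)

-- True when x can be represented as an Int64 without wrapping.
inInt64Range :: Integral a => a -> Bool
inInt64Range x = lo <= n && n <= hi
  where
    n  = toInteger x
    lo = toInteger (minBound :: Int64)
    hi = toInteger (maxBound :: Int64)

main :: IO ()
main = do
  print (inInt64Range (maxBound :: Word64))
  print (inInt64Range (minBound :: Word64))
  print (inInt64Range (maxBound :: Int64))
```

Although Int64 and Word64 share a bit width, maxBound :: Word64 is 2^64 - 1 as a number, which is larger than maxBound :: Int64, so a value-based check like this one rejects it.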
Can obviously be fixed, just needs a bit more programming...
@nick8325 I see. This is exactly why I didn't change it myself. Thinking about overflows and underflows is tricky, as the type system doesn't help there.
It seems that arbitrarySizedBoundedIntegral is indeed still slow. I think it can be fixed, but I'll need to think a bit more...
(But arbitraryBoundedIntegral on Word64 is fast now, at least...)
OK, now arbitrarySizedBoundedIntegral is decently fast too!
@nick8325 great!
In the test suite which prompted this patch (Char-heavy), the total running time went down from roughly 7.5s to 3.5s using current master 9f678a938325f2ce113c46b1123bf6c51fd825ad of QuickCheck. This is very nice!
The few distributions (with label) I printed out look similar.
Do you want me to do anything additional?
No, that's fine! I'll make a new release of QuickCheck soon.
SplitMix generates uniform Word64 values; by using this knowledge we can generate numbers more efficiently.
bitmaskWithRejection64' n behaves as choose (0, n), but is more efficient. In a small test suite generating lists of characters, the speedup is noticeable (max-size is 1000):
speeds up to: