Open Boarders opened 5 years ago
I only glanced at code. I can't see anything obviously wrong but I suspect that problem is lack of specialization. You could try to add INLINE
/SPECIALIZE
pragmas to the worker functions in Mutable.Shuffle
Here's a simplistic benchmark: BenchShuffle.zip.
Switch the random number generator on line 40 of BenchShuffle.hs
, and switch to the corresponding branch of @Boarders' perfect-vector-shuffle library in the cabal.project
file, to compare MWC with System.Random. The results I have obtained are as follows:
random-1.1
shuffle/10^6
| lower bound | estimate | upper bound
OLS regression | 1.31 s | 1.32 s | 1.34 s
R² goodness-of-fit | 1.000 | 1.000 | 1.000
Mean execution time | 1.30 s | 1.31 s | 1.31 s
Standard deviation | 6.10 ms | 7.48 ms | 8.56 ms
mwc-random-0.14.0.0
shuffle/10^6
| lower bound | estimate | upper bound
OLS regression | 1.94 s | 1.97 s | 1.98 s
R² goodness-of-fit | 1.000 | 1.000 | 1.000
Mean execution time | 1.97 s | 1.98 s | 1.98 s
Standard deviation | 2.76 ms | 4.49 ms | 5.48 ms
and for comparison tf-random-0.5:
shuffle/10^6
| lower bound | estimate | upper bound
OLS regression | 970 ms | 1.05 s | 1.08 s
R² goodness-of-fit | 0.999 | 0.999 | 1.000
Mean execution time | 1.04 s | 1.05 s | 1.06 s
Standard deviation | 10.6 ms | 12.7 ms | 14.0 ms
@Shimuuar : Adding INLINE, SPECIALISE, INLINEABLE and the ghc-options -fexpose-all-unfoldings and -fspecialize-aggressively seems to make no difference and it is still significantly slower than System.Random without any of those added (it also does not seem to affect the performance if I add those pragmas there though).
I tried to reproduce issue and wasn't able to build reproducer. Is this still actual?
I wrote a tiny utility for performing a Fisher–Yates shuffle here. At first I wrote this using System.Random and StdGen but was informed this was supposedly extremely slow. I then updated to use this library here. Surprisingly this led to worse performance on every experiment that I ran. Is this the wrong use case for this library (e.g. generating a million Ints between from i to 1000000 as i goes from 1 to 999999)? Or am I doing something else wrong? I did make sure to only generate a single generator but I am not very informed about the ins-and-outs of RNG and so this might simply be an inappropriate use case.