Open idontgetoutmuch opened 9 years ago
So this might not be a problem just with Categorical.
{-# OPTIONS_GHC -Wall #-}
{-# OPTIONS_GHC -fno-warn-name-shadowing #-}
{-# OPTIONS_GHC -fno-warn-type-defaults #-}
import Data.Random.Source.PureMT
import Data.Random
import Data.Random.Distribution.Categorical
import Data.Random.Distribution.T
import Control.Monad.State
testUniform :: MonadRandom m => Int -> m [Double]
testUniform n = replicateM (fromIntegral n) (sample stdUniform)
n :: Int
n = 10^7
us :: [Double]
us = evalState (testUniform n) (pureMT $ fromIntegral arbSeed)
mean :: Double
mean = sum us / fromIntegral n
main :: IO ()
main = do
print mean
~/Dropbox/Private/Multinomial $ ./Multinomial +RTS -s
0.4999607889729769
5,122,925,704 bytes allocated in the heap
1,213,648 bytes copied during GC
44,312 bytes maximum residency (2 sample(s))
21,224 bytes maximum slop
1 MB total memory in use (0 MB lost due to fragmentation)
Tot time (elapsed) Avg pause Max pause
Gen 0 10015 colls, 0 par 0.03s 0.03s 0.0000s 0.0000s
Gen 1 2 colls, 0 par 0.00s 0.00s 0.0001s 0.0002s
INIT time 0.00s ( 0.00s elapsed)
MUT time 1.15s ( 1.16s elapsed)
GC time 0.03s ( 0.03s elapsed)
EXIT time 0.00s ( 0.00s elapsed)
Total time 1.18s ( 1.20s elapsed)
%GC time 2.3% (2.8% elapsed)
Alloc rate 4,452,445,320 bytes per MUT second
Productivity 97.7% of total user, 96.1% of total elapsed
Here's the matlab equivalent.
num_particles = 10^7;
tic();
s = sum(rand(num_particles,1));
toc()
octave> num_particles = 10^7;
octave> tic();
octave> s = sum(rand(num_particles,1));
octave> toc()
Elapsed time is 0.149504 seconds.
According to http://octave.sourceforge.net/octave/function/rand.html, octave uses the Mersenne Twister with a period of 2^19937-1. I thought that was what random-fu also used. So my next step will be to look at the random number generator and see if that is causing the 10x slowdown.
I've spent a bit of time investigating mwc-random see: https://github.com/bos/mwc-random/issues/35 and I can generate uniform samples about x10 faster but with what appears to be a space leak.
The same in Octave