haskell / mwc-random

A very fast Haskell library for generating high quality pseudo-random numbers.
http://hackage.haskell.org/package/mwc-random
BSD 2-Clause "Simplified" License

Binomial Sampler #90

Closed idontgetoutmuch closed 3 months ago

idontgetoutmuch commented 5 months ago

I am in the middle of implementing the algorithm of Kachitvichyanukul, V. and Schmeiser, B. W. (1988), but the inverse method (in the current PR) could be a first step. What do you think?

  1. I tried just sampling from Bernoulli n times but this is very inefficient.
  2. Using the saved calculation of q^n for the inverse method (sketched below) doesn't seem to give a performance improvement.
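For concreteness, a minimal sketch of what the inverse-CDF (BINV) method in point 2 looks like (my sketch, assuming the StatefulGen interface, not the PR's code; the saved calculation of q^n refers to caching p0 below across calls):

import System.Random.Stateful (StatefulGen, uniformDoublePositive01M)

-- Inverse-CDF (BINV) binomial sampler: draw u ~ U(0,1], then walk the CDF
-- using the PMF recurrence
--   P(X = k + 1) = P(X = k) * ((n - k) / (k + 1)) * (p / q),
-- starting from P(X = 0) = q^n.
binomialInv :: StatefulGen g m => Int -> Double -> g -> m Int
binomialInv n p g = do
  u <- uniformDoublePositive01M g
  let q  = 1 - p
      s  = p / q
      p0 = q ^ n
      go k prob cdf
        | u <= cdf || k >= n = k
        | otherwise          = go (k + 1) prob' (cdf + prob')
        where prob' = prob * s * fromIntegral (n - k) / fromIntegral (k + 1)
  return (go 0 p0 p0)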
Shimuuar commented 5 months ago

That would be a great addition! There's an implementation using the condensed-tables algorithm, but it requires costly setup. Also there's an inverse incomplete beta in math-functions which is somewhat tested and may have better performance.
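For reference, the identity behind that suggestion: the binomial CDF is a regularized incomplete beta, P(X <= k) = I_{1-p}(n - k, k + 1). A minimal sketch using math-functions (an illustration, not existing library code):

import Numeric.SpecFunctions (incompleteBeta)

-- Binomial CDF via the regularized incomplete beta function:
--   P(X <= k) = I_{1-p}(n - k, k + 1)
-- where incompleteBeta a b x computes the regularized I_x(a, b).
binomialCDF :: Int -> Double -> Int -> Double
binomialCDF n p k
  | k < 0     = 0
  | k >= n    = 1
  | otherwise = incompleteBeta (fromIntegral (n - k))
                               (fromIntegral (k + 1))
                               (1 - p)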

P.S. I'll check whether CI still works.

idontgetoutmuch commented 5 months ago

P.S. I'll check whether CI still works.

It does not :-(

idontgetoutmuch commented 5 months ago

I've added some benchmarks but I get

Name                          Mean
mwc/CT/gen/binomial 0.5 4     5.79E-09
mwc/CT/gen/binomial 0.1 10    5.78E-09
mwc/CT/gen/binomial 0.6 10    5.78E-09
mwc/CT/gen/binomial 0.8 10    5.78E-09
mwc/CT/gen/binomial 0.4 100   5.79E-09
mwc/CT/gen/binomial 0.4 1400  5.79E-09
mwc/CT/gen/binomiak 0.5 4     1.06E-07
mwc/CT/gen/binomiak 0.1 10    1.03E-07
mwc/CT/gen/binomiak 0.6 10    1.23E-07
mwc/CT/gen/binomiak 0.8 10    1.15E-07
mwc/CT/gen/binomiak 0.4 100   3.10E-07
mwc/CT/gen/binomiak 0.4 1400  2.19E-07

So it would seem that my method is up to ~x50 slower. But in https://github.com/idontgetoutmuch/mwc-random/blob/master/System/Random/MWC/Binomial.hs I can run

ghc -O2 System/Random/MWC/Binomial.hs -main-is mainTab -o mainTab

./mainTab +RTS -s
559.95944
 202,045,775,928 bytes allocated in the heap
      88,204,248 bytes copied during GC
       4,872,176 bytes maximum residency (2 sample(s))
         104,464 bytes maximum slop
              22 MiB total memory in use (1 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     32758 colls,     0 par    0.092s   0.131s     0.0000s    0.0035s
  Gen  1         2 colls,     0 par    0.002s   0.007s     0.0034s    0.0062s

  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    6.676s  (  6.724s elapsed)
  GC      time    0.094s  (  0.138s elapsed)
  EXIT    time    0.000s  (  0.007s elapsed)
  Total   time    6.771s  (  6.871s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    30,262,537,823 bytes per MUT second

  Productivity  98.6% of total user, 97.9% of total elapsed

So 6.8 seconds via the table method vs

ghc System/Random/MWC/Binomial.hs -O2 -main-is mainTPE -o mainTPE

./mainTPE +RTS -s
559.97745
      98,831,560 bytes allocated in the heap
      33,920,168 bytes copied during GC
       7,810,184 bytes maximum residency (4 sample(s))
          29,400 bytes maximum slop
              23 MiB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0        21 colls,     0 par    0.012s   0.013s     0.0006s    0.0031s
  Gen  1         4 colls,     0 par    0.006s   0.009s     0.0023s    0.0036s

  INIT    time    0.000s  (  0.002s elapsed)
  MUT     time    0.036s  (  0.037s elapsed)
  GC      time    0.018s  (  0.022s elapsed)
  EXIT    time    0.000s  (  0.001s elapsed)
  Total   time    0.054s  (  0.061s elapsed)

  %GC     time       0.0%  (0.0% elapsed)

  Alloc rate    2,735,746,000 bytes per MUT second

  Productivity  66.8% of total user, 60.2% of total elapsed

So 0.05 seconds for the TPE (triangle parallelogram exponential) method, making it about x100 faster.

I don't know what the benchmarking actually measures, but maybe you @Shimuuar can explain what is going on?

Shimuuar commented 5 months ago

It does not :-(

I fixed CI and added PAPI-based benchmarks. It turns out they are quite useful for making a lot of measurements quickly. I'm going to look into what the current benchmarks actually measure.

idontgetoutmuch commented 5 months ago

I ran some benchmarks. If I use

liftM sum $ replicateM 1 $ genFromTable (tableBinomial n p) mwc

and

liftM sum $ replicateM 1 $ binomial n p mwc

then the TPE method is faster, but if I use

genFromTable (tableBinomial n p) mwc

and

binomial n p mwc

then the table method is faster. I am guessing the table is calculated once in this case and cached somehow, but not when replicateM is used. So which method is really faster, or does it depend on your application? (See the sketch after the table.)

Name               Mean (ps)    Name               Mean (ps)  Ratio

100 repetitions
binomial 0.5 4       78262646   binomiak 0.5 4      45466943   1.721308732
binomial 0.1 10     336691796   binomiak 0.1 10     45204858   7.44813303
binomial 0.6 10     438881835   binomiak 0.6 10     47054516   9.327092749
binomial 0.8 10     419028320   binomiak 0.8 10     46269506   9.056252297
binomial 0.4 100   1625602343   binomiak 0.4 100   134359570  12.0988951
binomial 0.4 1400  7465168750   binomiak 0.4 1400  102613574  72.75030446

10 repetitions
binomial 0.5 4        7834100   binomiak 0.5 4       4526589   1.73068507
binomial 0.1 10      33740417   binomiak 0.1 10      4501223   7.495833244
binomial 0.6 10      44393627   binomiak 0.6 10      4695062   9.455386745
binomial 0.8 10      42145361   binomiak 0.8 10      4608663   9.144812932
binomial 0.4 100    162304394   binomiak 0.4 100    13418182  12.09585576
binomial 0.4 1400   739081640   binomiak 0.4 1400   10178796  72.60992754

One repetition
binomial 0.5 4         785509   binomiak 0.5 4        453731   1.731221803
binomial 0.1 10       3362857   binomiak 0.1 10       451447   7.449062681
binomial 0.6 10       4452917   binomiak 0.6 10       474554   9.383372598
binomial 0.8 10       4206065   binomiak 0.8 10       465411   9.037313256
binomial 0.4 100     16207208   binomiak 0.4 100     1339769  12.09701672
binomial 0.4 1400    73531445   binomiak 0.4 1400    1022201  71.93442875

No repetitions
binomial 0.5 4           7444   binomiak 0.5 4        448100   0.016612363
binomial 0.1 10          7527   binomiak 0.1 10       443362   0.016977098
binomial 0.6 10          7644   binomiak 0.6 10       461722   0.016555416
binomial 0.8 10          7587   binomiak 0.8 10       454549   0.01669127
binomial 0.4 100         8396   binomiak 0.4 100     1335464   0.006286953
binomial 0.4 1400        9986   binomiak 0.4 1400    1015023   0.009838201

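That caching guess matches how condensed tables are meant to be used: build the table once and share it across draws, so the setup cost is amortized. A minimal sketch under that assumption, using the same API as the snippets above:

import Control.Monad (replicateM)
import System.Random.MWC (createSystemRandom)
import System.Random.MWC.CondensedTable (genFromTable, tableBinomial)

main :: IO ()
main = do
  g <- createSystemRandom
  let table = tableBinomial 1400 0.4              -- expensive: built once, shared
  xs <- replicateM 100000 (genFromTable table g)  -- cheap: each draw reuses it
  print (sum xs)
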
idontgetoutmuch commented 5 months ago

I tried running these (which is why I needed a faster binomial sampler):

-- Naive: sample p from Beta(a, b), then sum n Bernoulli draws.
betaBinomialNaive :: StatefulGen g m =>
                Double -> Double -> Int -> g -> m Int
betaBinomialNaive a b n g = do
  p <- beta a b g
  xs <- fmap (fmap fromEnum) $ replicateM n $ bernoulli p g
  return $ sum xs

-- Sample p from Beta(a, b), then one binomial draw via the TPE sampler.
betaBinomialTPE :: StatefulGen g m =>
                Double -> Double -> Int -> g -> m Int
betaBinomialTPE a b n g = do
  p <- beta a b g
  binomial n p g

-- Sample p from Beta(a, b), then draw from a condensed table; the table
-- has to be rebuilt for every draw because p changes.
betaBinomialTable :: StatefulGen g m =>
                  Double -> Double -> Int -> g -> m Int
betaBinomialTable a b n g = do
  p <- beta a b g
  genFromTable (tableBinomial n p) g

The TPE method is about x100 faster, so I am still baffled as to what the benchmarks are measuring.

Shimuuar commented 5 months ago

I've looked into the benchmarks and have no idea what they measure for the condensed-table benchmarks either. They should measure either sample-generation time or table-setup time (which is expensive); instead they measure some mix of the two.

There's no other way: I'll have to rewrite them.

Shimuuar commented 5 months ago

Yes. The current benchmarks in the CT/gen group measure the time for setting up the table. Which is of course wrong-wrong-wrong.
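A sketch of the split those benchmarks need (assuming a tasty-bench-style harness; the names here are illustrative, not the actual mwc-bench code): time tableBinomial itself for setup, and keep the table outside the timed action when measuring sampling.

import Test.Tasty.Bench (Benchmark, bench, bgroup, whnf, whnfIO)
import System.Random.MWC (GenIO)
import System.Random.MWC.CondensedTable (genFromTable, tableBinomial)

binomialBenches :: GenIO -> Benchmark
binomialBenches g = bgroup "binomial"
  [ -- table setup only (whnf forces just the outer constructor; a real
    -- benchmark would want to force the table's vectors too)
    bench "table setup"    $ whnf (tableBinomial 1400) 0.4
    -- sampling only: the table is built once, outside the timed action
  , bench "table sampling" $ whnfIO (genFromTable table g)
  ]
  where
    table = tableBinomial 1400 0.4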

idontgetoutmuch commented 4 months ago

I am putting my scribbles here in case I need them. I am cleaning up the PR so it can be merged without "squeezing"; I can add that later and measure the performance improvement. I am following the GSL implementation: https://git.savannah.gnu.org/cgit/gsl.git/tree/randist/binomial_tpe.c.


-- The Rust author also comments
--
--  * It is possible for BINV to get stuck, so we break if x >
-- BINV_MAX_X and try again.
--
-- * It would be safer to set BINV_MAX_X to self.n, but it is
-- extremely unlikely to be relevant.
--
-- * When n*p < 10, so is n*p*q which is the variance, so a result >
-- 110 would be 100 / sqrt(10) = 31 standard deviations away.

  -- if uPrev > rPrev

          -- if xPrev > bInvMaxX
          -- then foo s a undefined

bInvMaxX :: Int
bInvMaxX = 110

idontgetoutmuch commented 4 months ago

@Shimuuar this can be further improved but can be merged - maybe I should add some tests (not that the other distributions have any). I didn't want to do too much more until you had bottomed out the benchmarking. Are tables really faster? They are according to the benchmarks but not when I use that method in a program.

Shimuuar commented 4 months ago

I'm going to fix benchmarks tomorrow.

Tables have a rather peculiar performance characteristic: sampling from a table should be fast, AFAIR on the order of generating a single Word32, but creating the table can be very expensive. That cost also depends on the distribution: more possible outcomes result in a table that's costlier to set up, easily 1e2-1e4 times the cost of a single sample. So it's a rather poor algorithm if the setup isn't amortized by sampling a lot, especially if the distribution's parameters are random variates themselves.

P.S. I haven't looked at the PR in detail yet.

idontgetoutmuch commented 4 months ago

@Shimuuar ready for review :-)

Shimuuar commented 4 months ago

Great! I'm still trying to fix benchmarks.

Shimuuar commented 4 months ago

Regarding performance of condensed tables: yes, sampling is fast. Below is the sampling time for the Poisson distribution, for different values of the parameter and different GHC versions, in CPU cycles.

[figure: condensed-table Poisson sampling time in CPU cycles]

And here is uniform for comparison:

[figure: uniform sampling time in CPU cycles]

In contrast, creation of the table is expensive:

[figure: table-creation cost in CPU cycles]

idontgetoutmuch commented 4 months ago

Now that's proper impenetrable numerical code :) It will take some time for me to understand what's going on.

I hope I am not "teaching grandmothers to suck eggs", but it is a rejection-sampling method with an empirical proposal distribution (which I believe they prove has the correct support, in a reference that apparently can only be obtained by paying a lot of money). They divide the region under the majorizing function into a triangular segment, two parallelogram segments, and two tail segments covered by the exponential distribution. I might try to draw the picture and put it in the docs. The triangular region lies below the target distribution, so if you land there you always accept. Otherwise you decide whether you are in the parallelogram or exponential regions and do an accept/reject step (depending on whether you are above or below the target - sorry for being obvious here).

Further, they do something they call squeezing (to improve the acceptance rate?) which I haven't investigated yet but which will be a future PR. HTH.
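The accept/reject skeleton this boils down to, as a minimal sketch (with hypothetical draw/target/envelope arguments, not the PR's actual code):

import System.Random.Stateful (StatefulGen, uniformRM)

-- Generic envelope rejection sampling: draw a candidate x from the
-- majorizing envelope, then accept with probability target(x)/envelope(x).
-- In the TPE scheme the envelope is piecewise (triangle, parallelograms,
-- exponential tails) and candidates in the triangular region are accepted
-- immediately, because there the envelope lies below the target.
rejectionSample :: StatefulGen g m
                => (g -> m Double)    -- sample a candidate from the envelope
                -> (Double -> Double) -- target density (unnormalised)
                -> (Double -> Double) -- majorizing envelope density
                -> g -> m Double
rejectionSample draw target envelope g = go
  where
    go = do
      x <- draw g
      u <- uniformRM (0, 1) g
      if u * envelope x <= target x
        then return x
        else go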

idontgetoutmuch commented 4 months ago

[figure: lineChart3 — the majorizing/minorizing functions and the four acceptance regions]

idontgetoutmuch commented 4 months ago

I hope that is now clear: if you are in area "One" then you are below the target, so you always accept. For areas "Two", "Three" and "Four" you have to decide whether you are above or below, and then reject or not. See the original description, which I think has an error in the majorizing and minorizing functions; luckily it is benign except when you try to recreate the diagram.

Shimuuar commented 4 months ago

Sorry, I was terribly busy lately and only got to skim the paper and PR.

My plan is to write a better test for the sampler: use a likelihood-ratio check that we sample from the correct distribution, and if everything goes well, just merge the PR and do further documentation improvements/micro-optimizations separately.
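For what it's worth, a minimal sketch of such a likelihood-ratio statistic (my illustration, not the code that ended up in the binomial-test branch):

-- Likelihood-ratio (G-test) statistic over (observed, expected) bin counts.
-- Under the null hypothesis that the sampler follows the target
-- distribution, G is asymptotically chi-squared distributed with
-- (number of bins - 1) degrees of freedom.
gStatistic :: [(Double, Double)] -> Double
gStatistic bins = 2 * sum [ o * log (o / e) | (o, e) <- bins, o > 0 ]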

idontgetoutmuch commented 4 months ago

Should this be in the repo to recreate the figure? Probably not, but it would be nice to put it somewhere other than in a PR comment.

{-# LANGUAGE ScopedTypeVariables #-}

{-# OPTIONS_GHC -Wall              #-}
{-# OPTIONS_GHC -Wno-type-defaults #-}

-- | Recreates the figure above: the (unnormalised) binomial target PMF
-- together with the majorizing and minorizing functions and the four
-- regions of the TPE construction.
module RecreateFigure(lineChart) where

import System.FilePath ()
import Data.Colour
import Diagrams.Backend.SVG.CmdLine
import Diagrams.Prelude hiding ( sample, render, Vector, offset )
import System.Environment

majorizingFn :: Int -> Double -> Double -> Double
majorizingFn i pp x
  | x <= xL - 0.5 = c * exp (-lambdaL * (xL - x - 0.5))
  | x <= xR - 0.5 = (1 + c) - abs (bigM - x) / p1
  | otherwise     = c * exp (-lambdaR * (x + 0.5 - xR))
  where
    (xL, xR, _, c, p1, bigM, fm, q, r) = xLRMc i pp
    a = (fm - xL) / (fm - xL * r)
    lambdaL = a * (1 + a / 2)
    b = (xR - fm) / (xR * q)
    lambdaR = b * (1 + b / 2)

minorizingFn :: Int -> Double -> Double -> Double
minorizingFn i pp x
  | x <= xL - 0.5 = 0.0
  | x <= xR - 0.5 = 1.0 - abs (bigM - x) / p1
  | otherwise     = 0.0
  where
    (xL, xR, _, _, p1, bigM, _, _, _) = xLRMc i pp

xLRMc :: Int -> Double ->
         (Double, Double, Double, Double, Double, Double, Double, Double, Double)
xLRMc i pp = (xL, xR, xM, c, p1, bigM, fm, q, r)
  where
    m :: Double
    m = fromIntegral i
    r = min pp (1 - pp)
    q = 1 - r
    fm = m * r + r
    bigM :: Double
    bigM = fromIntegral $ floor fm
    p1 = (fromIntegral (floor (2.195 * sqrt (m * pp * q) - 4.6 * q) :: Int)) + 0.5
    xM = bigM + 0.5
    xL = xM - p1
    xR = xM + p1
    c  = 0.134 + 20.5 / (15.3 + bigM)

n :: Int
n = 200

p :: Double
p = 0.25

m1, sd :: Double
m1 = (fromIntegral n) * p + p
sd = sqrt $ (fromIntegral n) * p * (1- p)

lb, ub :: Double
lb = m1 - 3 * sd
ub = m1 + 3 * sd

tick, skala :: Double
tick = 0.01
skala = 20.0

majors :: [(Double, Double)]
majors = [(x, majorizingFn n p x) | x <- [lb, lb + tick .. ub]]

minors :: [(Double, Double)]
minors = [(x, minorizingFn n p x) | x <- [lb, lb + tick .. ub]]

-- Binomial PMF via the recurrence
--   P(X = x) = P(X = x - 1) * ((t - x + 1) / x) * (q / (1 - q)),
-- starting from P(X = 0) = (1 - q)^t.
integralBinomialPDF :: (Integral a, Fractional b) => a -> b -> a -> b
integralBinomialPDF t q 0 = (1 - q)^t
integralBinomialPDF t q x = integralBinomialPDF t q (x - 1) * a * b
  where
    a = fromIntegral (t - x + 1) / fromIntegral x
    b = q / (1 - q)

binPdfs :: [Double]
binPdfs = map (/ v) $ map (integralBinomialPDF n p) $ map floor $ map (+ 0.5) [lb :: Double, lb + tick .. ub]

v :: Double
v = integralBinomialPDF n p bigM
  where
    bigM = floor $ (fromIntegral n) * p + p

majorVs :: [P2 Double]
majorVs = map p2 majors

binPdfVs :: [P2 Double]
binPdfVs = map p2 $ zip (map fst majors) binPdfs

minorVs :: [P2 Double]
minorVs = map p2 minors

lineChart :: String -> IO ()
lineChart fn = do
  withArgs ["-h 800", "-w 800", "-o" ++ fn ++ ".svg"] (mainWith exampleQ)

exampleQ :: Diagram B
exampleQ = strutX 2 ||| mconcat
  [ mconcat cs # fc purple # lw none

  -- x-axis
  , strokeLocT xAxisT  # lc black # lwL 0.01 # scaleY skala
  , moveTo (p2 (xM - offset , - 2.0)) $ text "Number of Successes"
  , strokeLocT brokenXAxisT # lc black # lwL 0.05

  -- y-axis
  , strokeLocT yAxisT  # lc black # lwL 0.01 # scaleY skala
  , moveTo (p2 (-sd - yAxixOffset - 3.0 , skala * 1.0)) $ text "Probability Density (Unnormalised)" # rotateBy (1/4)

  -- Key
  , let majKey = fromVertices $ map p2 [ (xM - offset - 10.0, skala * 2.0 - 2.0)
                                       , (xM - offset, skala * 2.0 - 2.0)
                                       ]
    in strokeLocT majKey # lc blue # lwL 0.1 ===
       moveTo (p2 (xM - offset + 8, skala * 2.0 - 4.0)) (text "Majorizing Function" # fc blue)
  , let minKey = fromVertices $ map p2 [ (xM - offset - 10.0, skala * 2.0 - 4.0)
                                       , (xM - offset, skala * 2.0 - 4.0)
                                       ]
    in strokeLocT minKey # lc green # lwL 0.1 ===
       moveTo (p2 (xM - offset + 8, skala * 2.0 - 6.0)) (text "Minorizing Function" # fc green)
  , let tgtKey = fromVertices $ map p2 [ (xM - offset - 10.0, skala * 2.0 - 6.0)
                                       , (xM - offset, skala * 2.0 - 6.0)
                                       ]
    in strokeLocT tgtKey # lc red # lwL 0.1 ===
       moveTo (p2 (xM - offset + 8, skala * 2.0 - 8.0)) (text "Target PDF" # fc red)
  , moveTo (p2 (xM - offset - 3 * sd, skala * 2.0 - 10.0)) (text ("n = " ++ show n)  # fc black)
  , moveTo (p2 (xM - offset - 3 * sd, skala * 2.0 - 11.0)) (text ("p = " ++ show p)  # fc black)

  -- Areas
  , area1, area2, area3, area4
  , moveTo (p2 (xM - 0.5 - offset, 0.5 * skala * xMinM)) $ text "One"
  , moveTo (p2 (xM - 0.5 - offset, skala * (0.5 * (xMajM - xMinM) + xMinM))) $ text "Two"
  , moveTo (p2 (lb + (xL - lb) / 2 - offset, 0.5 * skala * majorizingFn n p (lb + (xL - lb) / 2))) $ text "Three"
  , moveTo (p2 (ub + (xR - ub) / 2 - offset, 0.5 * skala * majorizingFn n p (ub + (xR - ub) / 2))) $ text "Four"

  , strokeLocT majorT  # lc blue  # lwL 0.01 # scaleY skala
  , strokeLocT binPdfT # lc red   # lwL 0.01 # scaleY skala
  , strokeLocT minorT  # lc green # lwL 0.01 # scaleY skala
  , strokeLocT verticalLT # lc purple # lwL 0.01
  , strokeLocT verticalRT # lc purple # lwL 0.01

  , moveTo (p2 (xL - 0.5 - offset, - 1.0)) $ text $ show (xL - 0.5)
  , moveTo (p2 (xR - 0.5 - offset, - 1.0)) $ text $ show (xR - 0.5)
  , moveTo (p2 (0.0, - 1.0)) $ text $ show $ (fromIntegral (round (lb * 100))) / 100
  , moveTo (p2 (ub - offset, - 1.0)) $ text $ show $ (fromIntegral (round (ub * 100))) / 100
  , let y = 1.5 in moveTo (p2 (-sd - yAxixOffset - 1.0, skala * y)) $ text $ show $ (fromIntegral (round (y * 100))) / 100
  , let y = 1.0 in moveTo (p2 (-sd - yAxixOffset - 1.0, skala * y)) $ text $ show $ (fromIntegral (round (y * 100))) / 100
  , let y = 0.5 in moveTo (p2 (-sd - yAxixOffset - 1.0, skala * y)) $ text $ show $ (fromIntegral (round (y * 100))) / 100
  ]
  where
    xAxisT :: Located (Trail V2 Double)
    xAxisT = (fromVertices $ map p2 [(m1 - 4 * sd, 0), (m1 + 4 * sd, 0)]) `at` (-sd ^& 0)

    brokenXAxisT :: Located (Trail V2 Double)
    brokenXAxisT = fromVertices $ map p2 [ (m1 - 4 * sd - offset - 0.0,  0.0)
                                         , (m1 - 4 * sd - offset - 0.1,  0.3)
                                         , (m1 - 4 * sd - offset - 0.2,  0.0)
                                         , (m1 - 4 * sd - offset - 0.3, -0.3)
                                         , (m1 - 4 * sd - offset - 0.4,  0.0)
                                         , (m1 - 4 * sd - offset - 2.0,  0.0)
                                         ]

    yAxixOffset = 2.0
    yAxisT :: Located (Trail V2 Double)
    yAxisT = fromVertices $ map p2 [(-sd - yAxixOffset , 0.0), (-sd - yAxixOffset , 2.0)]

    binPdfT :: Located (Trail V2 Double)
    binPdfT = fromVertices binPdfVs `at` (0 ^& (minimum binPdfs))

    majorT :: Located (Trail V2 Double)
    majorT = fromVertices majorVs `at` (0 ^& (minimum (map snd majors)))

    minorT :: Located (Trail V2 Double)
    minorT = fromVertices minorVs `at` (0 ^& (minimum (map snd minors)))

    verticalLT :: Located (Trail V2 Double)
    verticalLT = fromVertices $ map p2 [(xL - 0.5 - offset, 0.0)
                                      , (xL - 0.5 - offset, skala * xMajL)
                                      ]

    verticalRT :: Located (Trail V2 Double)
    verticalRT = fromVertices $ map p2 [ (xR - 0.5 - offset, 0.0)
                                       , (xR - 0.5 - offset, skala * xMajR)
                                       ]

    pt1 = (xL - 0.5, 0.0)
    pt2 = (xR - 0.5, 0.0)
    pt3  = (xM - 0.5, skala * xMinM)

    area1 :: Diagram B
    area1 = t # fc red # opacity 0.1
      where
        t :: Diagram B
        t = strokeLocT $ mapLoc Trail u
        u :: Located (Trail' Loop V2 Double)
        u = ((fromVertices $ map p2 [pt1, pt2, pt3]) # closeLine) `at` ((xL - offset - 0.5) ^& 0)

    area2 :: Diagram B
    area2 = t # fc blue # opacity 0.1
      where
        t :: Diagram B
        t = strokeLocT $ mapLoc Trail u
        u :: Located (Trail' Loop V2 Double)
        u = (fromVertices ([w1] ++ vs ++ [wn, w2]) # closeLine) `at` ((xL - offset - 0.5) ^& 0)
        vs = fmap (scaleY 20.0) $ filter (\x -> fst (unp2 x) <= xR - 0.5) $ filter (\x -> fst (unp2 x) > xL - 0.5) majorVs
        w1 = p2 (fst (unp2 (head vs)), 0.0)
        w2 = p2 (xM - 0.5, skala * xMinM)
        wn = p2 (fst (unp2 (last vs)), 0.0)

    area3 :: Diagram B
    area3 = t # fc yellow # opacity 0.1
      where
        t :: Diagram B
        t = strokeLocT $ mapLoc Trail u
        u :: Located (Trail' Loop V2 Double)
        u = (fromVertices ([w1] ++ vs ++ [wn]) # closeLine) `at` (0 ^& 0)
        vs = fmap (scaleY 20.0) $ filter (\x -> fst (unp2 x) <= xL - 0.5) majorVs
        w1 = p2 (fst (unp2 (head vs)), 0.0)
        wn = p2 (fst (unp2 (last vs)), 0.0)

    area4 :: Diagram B
    area4 = t # fc yellow # opacity 0.1
      where
        t :: Diagram B
        t = strokeLocT $ mapLoc Trail u
        u :: Located (Trail' Loop V2 Double)
        u = (fromVertices ([w1] ++ vs ++ [wn]) # closeLine) `at` ((xR - offset - 0.5) ^& 0)
        vs = fmap (scaleY 20.0) $ filter (\x -> fst (unp2 x) > xR - 0.5) majorVs
        w1 = p2 (fst (unp2 (head vs)), 0.0)
        wn = p2 (fst (unp2 (last vs)), 0.0)

    cs = map mkCircle $ map p2 $ [ (xL - 0.5 - offset, 0.0)
                                 , (xL - 0.5 - offset, skala * xMajL)
                                 , (xM - 0.5 - offset, skala * xMinM)
                                 , (xM - 0.5 - offset, skala * xMajM)
                                 , (xR - 0.5 - offset, 0.0)
                                 , (xR - 0.5 - offset, skala * xMajR)
                                 , (0.0, 0.0)
                                 , (ub - offset, 0.0)
                                 ]

    (xL, xR, xM, _, _, _, _, _, _) = xLRMc n p

    xMinM = minorizingFn n p (xM - 0.5)
    xMajL = majorizingFn n p (xL - 0.5)
    xMajR = majorizingFn n p (xR - 0.5)
    xMajM = majorizingFn n p (xM - 0.5)
    offset = minimum (map fst majors)

    mkCircle q = circle 0.1 # moveTo q
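
If it does end up in the repo, a tiny driver like this (my addition, with a hypothetical file layout) makes it compilable standalone:

-- Main.hs: compile with
--   ghc -O2 Main.hs RecreateFigure.hs -o recreate-figure
module Main (main) where

import RecreateFigure (lineChart)

main :: IO ()
main = lineChart "tpe-regions"   -- writes tpe-regions.svg
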
Shimuuar commented 4 months ago

I wrote a property test (branch binomial-test in this repo) to check that binomial really generates samples from the binomial distribution. It passes for n <= 10 but starts failing for larger n. I haven't looked into whether the generator or the test is to blame.

idontgetoutmuch commented 4 months ago

That rather suggests the implementation: below this threshold the sampler prefers the well-tested BINV / inverse-CDF path, so failures that appear only for larger n point at the new TPE branch:

    -- Threshold for preferring the BINV algorithm / inverse cdf
    -- logic. The paper suggests 10, Ranlib uses 30, R uses 30, Rust uses
    -- 10 and GSL uses 14.
    bInvThreshold :: Double
    bInvThreshold = 10

I will go and take a look.

idontgetoutmuch commented 4 months ago

I did a chi-squared test by hand for the values that QuickCheck gave and agree that we can reject the null hypothesis :-(
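For the record, a sketch of such a by-hand check (my illustration; choose comes from math-functions):

import Numeric.SpecFunctions (choose)

-- Binomial PMF via math-functions' binomial coefficient.
binomialPMF :: Int -> Double -> Int -> Double
binomialPMF n p k = choose n k * p ^ k * (1 - p) ^ (n - k)

-- Pearson chi-squared statistic for observed counts of each outcome k;
-- compare against the chi-squared distribution with (bins - 1) degrees
-- of freedom.
chiSquaredStat :: Int -> Double -> [(Int, Int)] -> Double
chiSquaredStat n p counts = sum
  [ (fromIntegral o - e) ** 2 / e
  | (k, o) <- counts
  , let e = total * binomialPMF n p k
  , e > 0 ]
  where total = fromIntegral (sum (map snd counts))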

Shimuuar commented 4 months ago

Yes, it's quite visibly off for N=27, p=0.4. Black dots are the sampler's output and blue is the expectation:

[figure: sampled vs expected binomial probabilities, N=27, p=0.4]

For N=8, p=0.4 the agreement is perfect.

Shimuuar commented 4 months ago

Oh, and another plot which could be useful for you: the ratio of the sampler's probabilities to the expectation, for the same parameters as above.

[figure: ratio of sampled to expected probabilities, N=27, p=0.4]

idontgetoutmuch commented 4 months ago

I just tried a reference implementation in C and the chi-squared checks out, so I must have transcribed something incorrectly. I will look at my code tomorrow.

idontgetoutmuch commented 3 months ago

[nix-shell:~/mwc-random]$ cabal bench --benchmark-options="--pattern beta"
Build profile: -w ghc-9.4.8 -O1
In order, the following will be built (use -v for more details):
 - mwc-random-0.15.1.0 (bench:mwc-bench) (first run)
Preprocessing benchmark 'mwc-bench' for mwc-random-0.15.1.0..
Building benchmark 'mwc-bench' for mwc-random-0.15.1.0..
Running 1 benchmarks...
Benchmark mwc-bench: RUNNING...
All
  mwc
    D
      beta binomial 10:        OK (1.55s)
        1.50 ms ± 108 μs
      beta binomial 100:       OK (2.13s)
        2.08 ms ± 128 μs
      beta binomial table 10:  OK (1.54s)
        49.4 ms ± 4.2 ms
      beta binomial table 100: OK (2.95s)
        196  ms ±  15 ms

All 4 tests passed (8.16s)
Benchmark mwc-bench: FINISH

idontgetoutmuch commented 3 months ago

@Shimuuar I think it is now ready to merge, but I have 2 questions:

  1. I added some benchmarks comparing the table method for the beta-binomial against the new method (actually a very old method, from the 1980s). Maybe you want these in a different group or something?
  2. I seem to have added your prop tests to the PR. This seems OK, but would you like me to do anything about it?
Shimuuar commented 3 months ago

Thanks!

  1. At this point the benchmarks don't matter. It seems there are several opportunities for micro-optimizations, so it's likely I'll need to tweak the benchmarks anyway.
  2. Nothing. It's fine.

Also, I forgot about the program that draws the diagram. If you want to add it to the repository, just drop it into the (nonexistent) docs directory with a short comment explaining what it does.

idontgetoutmuch commented 3 months ago

@Shimuuar I don't understand why we get these bounds warnings, but of course with the suggested bounds some of the builds fail.

Shimuuar commented 3 months ago

Thank you!

This PR can be merged. I'm going to make a few documentation tweaks and do some optimization work, but that can be done separately.