tweag / monad-bayes

A library for probabilistic programming in Haskell.
MIT License
408 stars 62 forks source link

Prim monad #159

Closed idontgetoutmuch closed 2 years ago

netlify[bot] commented 2 years ago

Deploy Preview for monad-bayes canceled.

Name Link
Latest commit 3d796cccb5e60ac96205311987cb4d00c635fd71
Latest deploy log https://app.netlify.com/sites/monad-bayes/deploys/62dff2e39f48c9000903049c
idontgetoutmuch commented 2 years ago
  • explanation in the docs of how to use the new functions would be great (I'm also happy to have a go at that, but it would be nice to do it as part of this PR). Included in that would be examples of using different RNGs, and a comment on what to use for what.

There is some way of getting code in the documentation to run. Do you know how to do this?

reubenharry commented 2 years ago

@idontgetoutmuch One thing it would be nice to retain is sampleIO (which I might subsequently rename to sampler, which uses IO to create a random seed. I pushed an update with the following, lmk if it looks right to you:

sampleIO :: Sampler (IOGenM StdGen) IO a -> IO a
sampleIO x = initStdGen >>= newIOGenM >>= sampleWith x
reubenharry commented 2 years ago

@idontgetoutmuch I'm seeing a pretty large slowdown in the benchmarks for the PrimMonad branch. I think we should definitely debug that before proceeding with this branch.

idontgetoutmuch commented 2 years ago

@idontgetoutmuch I'm seeing a pretty large slowdown in the benchmarks for the PrimMonad branch. I think we should definitely debug that before proceeding with this branch.

Huh - well that is very irksome - it really should speed things up

reubenharry commented 2 years ago

Yeah, I imagine there's just a bug, or perhaps something needs to be specialized. I don't see a huge difference between the raw samplers (just timing the sampling of 100000 samples), so it's perhaps a function of changes in the benchmarking code, or elsewhere.

reubenharry commented 2 years ago

If I do


import Text.Printf
import System.CPUTime

time :: IO t -> IO t
time a = do
    start <- getCPUTime
    v <- a
    end   <- getCPUTime
    let diff = (fromIntegral (end - start)) / (10^12)
    printf "Computation time: %0.3f sec\n" (diff :: Double)
    return v

time $ sampleIOfixed $ void $ replicateM 100000 random

I get

Computation time: 0.134 sec

which is comparable to the main branch.

idontgetoutmuch commented 2 years ago

Yeah, I imagine there's just a bug, or perhaps something needs to be specialized. I don't see a huge difference between the raw samplers (just timing the sampling of 100000 samples), so it's perhaps a function of changes in the benchmarking code, or elsewhere.

I think it could be the way I set up benchmarking. I am just about to benchmark the linear regression example you came up with just using +RTS -s.

idontgetoutmuch commented 2 years ago
git branch
* master

  INIT    time    0.000s  (  0.002s elapsed)
  MUT     time    0.975s  (  0.985s elapsed)
  GC      time    0.173s  (  0.187s elapsed)
  EXIT    time    0.000s  (  0.004s elapsed)
  Total   time    1.148s  (  1.179s elapsed)

git branch
* PrimMonad

  INIT    time    0.000s  (  0.003s elapsed)
  MUT     time    0.975s  (  0.988s elapsed)
  GC      time    0.173s  (  0.190s elapsed)
  EXIT    time    0.000s  (  0.012s elapsed)
  Total   time    1.148s  (  1.193s elapsed)
idontgetoutmuch commented 2 years ago

So I think it must be something to do with the ST monad which I think is used in the performance tests. I won't be able to look further until at least Friday 22 July.

{-# LANGUAGE BlockArguments      #-}
{-# OPTIONS_GHC -Wall            #-}

module Main(main) where

import Control.Monad.Reader

import Control.Monad.Bayes.Class
import Control.Monad.Bayes.Sampler
import Control.Monad.Bayes.Traced
import Control.Monad.Bayes.Weighted

paramPriorRegression :: MonadSample m => m (Double, Double, Double)
paramPriorRegression = do
    slope <- normal 0 2
    intercept <- normal 0 2
    noise <- gamma 1 1 -- 4 4
    return (slope, intercept, noise)

regression :: (MonadInfer m) => [Double] -> [Double] -> m (Double, Double, Double)
regression xs ys = do
    (slope, intercept, noise) <- paramPriorRegression
    _ <- forM (zip xs ys) \(x, y) -> factor $ normalPdf (slope * x + intercept) (sqrt noise) y
    return (slope, intercept, noise)

syntheticData :: MonadSample m => Int -> m [(Double, Double)]
syntheticData n = do
  let xs = map fromIntegral [1 .. n]
      ys = map (3*) xs
  eps <- replicateM n $ normal 0 0.1
  let zs = zipWith (+) ys eps
  return $ zip xs zs

main :: IO ()
main = do
  xys <- sampleIO (syntheticData 10)
  let (xs, ys) = unzip xys
  print xs
  print ys
  putStrLn "\n"

  mhRunsRegression <- sampleIO $ prior $ mh 100000 $ regression xs ys
  let ss = map (\(x, _, _) -> x) mhRunsRegression
  let is = map (\(_, y, _) -> y) mhRunsRegression
  let ns = map (\(_, _, z) -> z) mhRunsRegression
  print $ (/ 50000.0) $ Prelude.sum $ drop 50000 ss
  print $ (/ 50000.0) $ Prelude.sum $ drop 50000 is
  print $ (/ 50000.0) $ Prelude.sum $ drop 50000 ns
executable perftestmain
  main-is:            PerfTestMain.hs
  hs-source-dirs:     .
  other-modules:
  build-depends:
      base,
      monad-bayes,
      mtl,
      mwc-random,
      random