haskell-numerics / random-fu

A suite of Haskell libraries for representing, manipulating, and sampling random variables
42 stars 21 forks source link

Migration to 0.3 could use more documentation #83

Open ncfavier opened 2 years ago

ncfavier commented 2 years ago

I'm trying to port some code (https://github.com/lambdabot/lambdabot/pull/204) that previously used sample v in an IO context to random-fu 0.3 in order to compile on GHC 9. This doesn't work anymore because the MonadRandom IO instance is gone, so after digging around in the source code I eventually figured out I needed something like do g <- MWC.createSystemRandom; sampleFrom g v.

I think this could be made clearer in the documentation as it seems like a common usecase?

Also, I'm not sure what the difference would be between using MWC.createSystemRandom and newIOGenM =<< newStdGen (assuming this is a silly "get a random quote" application and not anything needing Serious Random Numbers™).

idontgetoutmuch commented 2 years ago

Hi @ncfavier - I am sorry about that - perhaps you would consider creating some documentation?

Also, I'm not sure what the difference would be between using MWC.createSystemRandom and newIOGenM =<< newStdGen (assuming this is a silly "get a random quote" application and not anything needing Serious Random Numbers™).

They use different PRNG algorithms (multiply with carry and splitmix). Both are "serious" pseudo random number generators in the sense that they will pass a lot of tests for randomness (of course they cannot by definition pass all tests for randomness). They are not suitable for cryptographic applications if that is what you mean by "serious".

freckletonj commented 2 years ago

Seconded, the new version killed a lot of code for me, and it's been rather painful to ascertain how to properly migrate.

idontgetoutmuch commented 2 years ago

@freckletonj I am very sorry about that. I had no idea that random-fu was so widely used. Please let me know if I can help. I can take a look at anything that needs converting. But you could always stay on 0.2 until your code is converted?

cc: @ncfavier here is a small example of how I use 0.3 with random 1.2:

{-# OPTIONS_GHC -Wall              #-}
{-# OPTIONS_GHC -Wno-type-defaults #-}

import           System.Random.Stateful
import           Data.Maybe (catMaybes)
import           Data.List (unfoldr)
import           Control.Monad.Reader
import qualified Data.Random as R
import           System.Random.MWC as MWC

resampleStratified :: ( StatefulGen g m
                      , MonadReader g m
                      , R.Distribution R.Uniform a
                      , Fractional a
                      , Ord a)
                   => [a] -> m [Int]
resampleStratified weights = do
  let bigN = length weights
  dithers <- replicateM bigN (R.sample $ R.uniform 0.0 1.0)
  let positions = map (/ (fromIntegral bigN)) $
                  zipWith (+) dithers (map fromIntegral [0 .. bigN - 1])
      cumulativeSum = scanl (+) 0.0 weights
      coalg (i, j) | i < bigN =
                       if (positions!!i) < (cumulativeSum!!j)
                       then Just (Just j, (i + 1, j))
                       else Just (Nothing, (i, j + 1))
                   | otherwise =
                       Nothing
  return $ catMaybes $ unfoldr coalg (0, 0)

main :: IO ()
main = do
  setStdGen (mkStdGen 42)
  g <- newStdGen
  stdGen <- newIOGenM g
  is <- runReaderT (resampleStratified [0.5 :: Double, 0.5]) stdGen
  print is
  monadicGen <- MWC.create
  js <- runReaderT (resampleStratified [0.2 :: Double, 0.8]) monadicGen
  print js
ncfavier commented 2 years ago

For information, there are 43 packages on Hackage that depend on random-fu.

Maybe we could add a function sampleIO v = do g <- newIOGenM =<< newStdGen; sampleFrom g v to restore the old behaviour?

freckletonj commented 2 years ago

Ah, and running an rvar is now:

(runReaderT (sampleReaderRVarT someRVar) mwc) :: IO Double

The docs are a little tough for me since:

But these 2 examples you gave, plus the above of how to run an RVar are helpful.

One question though is, it seems perhaps a misfeature to require functions to specify whether they're a Reader or State, when it'd be nice for the caller to choose. But I guess then the intent is to keep library-type functions all in RVar or Distrubution for as long as possible?

Anyways, I do love this library and it stands out to me as being one of the first libraries to look at if you need to deal with randomness. Thank you!

idontgetoutmuch commented 2 years ago

For information, there are 43 packages on Hackage that depend on random-fu.

Maybe we could add a function sampleIO v = do g <- newIOGenM =<< newStdGen; sampleFrom g v to restore the old behaviour?

I wasn't totally cavalier when I updated random-fu to use the latest random and I did look at the reverse dependencies and concluded that it would not have an impact. I can see that I was wrong and I will endeavour to avoid this in the future. I can look at each (reverse) dependency and see if I can help.

freckletonj commented 2 years ago

@idontgetoutmuch sorry to cause a fuss! I really appreciate this library and your work on it! I'd like to say that my pains were less from the update, and more from the documentation being misleading or absent, and not much clarity around what changed between versions. So, I hope you freely iterate this however you see fit, and, reverse-dependencies be damned!

sullyj3 commented 2 years ago

(assuming this is a silly "get a random quote" application and not anything needing Serious Random Numbers™)

I think this is a case where the 80-20 rule can be judiciously applied - pulling numbers out of my ass here, but I'd assume 80% of the time people want to use randomness, they don't care about determinism, and they're operating in either IO or MonadIO. They don't really want to bother with the state of the randomGen, or the details of how it gets threaded between computations, or extra readerTs, or what have you. If that's the case, it makes sense to support that use case with simple functions, few imports, and low complexity. The api changes going from 0.2 to 0.3 unfortunately seem to have complicated this use case somewhat.

I understand that maintaining this library happens on a volunteer basis, and I don't want to be too ungrateful. I also appreciate the work that you've put into this.

idontgetoutmuch commented 2 years ago

Hi @sullyj3 - thanks for the feedback and I fully agree with what you say about making the library easy to use. I have however what I think is a better plan to improve things. random-fu itself has long needed a complete re-write even without changing the interface.

I have however discovered that the maintainer / author of mwc-random has made all the sampling functions independent of the RNG. And monad-bayes provides a sampleIO which uses these to allow the user to write things like

syntheticData :: MonadSample m => Int -> m [(Double, Double)]
syntheticData n = do
  let xs = map fromIntegral [1 .. n]
      ys = map (3 *) xs
  eps <- replicateM n $ normal 0 0.1
  let zs = zipWith (+) ys eps
  return $ zip XS zs

and then "run" them via sampleIO (syntheticData 10) for example which uses random to provide the RNG (which is better than MWC).

My plan is to migrate any distributions that random-fu has but mwc-random does not and then write a migration document for random-fu users.

Even better is that monad-bayes allows you to do inference but that's a whole different story :-)

Also there is actually a team of use actively working on / using monad-bayes; in contrast I only work on random-fu when I have the time.

HTH

sullyj3 commented 2 years ago

Ok, that's encouraging. Thanks for laying that out for me. I look forward to seeing your plan come to fruition!

spearman commented 1 year ago

Is it possible anymore to use random-fu as a standalone library? It seems like you have to bring in MTL and System.Random

before:

import qualified Data.Random as Random
import qualified Data.Random.Distribution.Bernoulli as Bernoulli

main :: IO ()
main = do
  x <- Random.sample $ Bernoulli.Bernoulli (0.5 :: Double) :: IO Double
  print x

after:

import qualified System.Random.Stateful as Random.Stateful
import qualified Data.Random as Random
import qualified Data.Random.Distribution.Bernoulli as Bernoulli
import qualified Control.Monad.Reader as Reader

main :: IO ()
main = do
  stdgen <- Random.Stateful.newIOGenM =<< Random.Stateful.newStdGen
  x <- Reader.runReaderT (Random.sample $ Bernoulli.Bernoulli (0.5 :: Double)) stdgen
        :: IO Double
  print x

Is this the most minimal example with random-fu 0.3? I am also not sure if the examples are equivalent.

idontgetoutmuch commented 1 year ago

Is it possible anymore to use random-fu as a standalone library? It seems like you have to bring in MTL and System.Random

before:

import qualified Data.Random as Random
import qualified Data.Random.Distribution.Bernoulli as Bernoulli

main :: IO ()
main = do
  x <- Random.sample $ Bernoulli.Bernoulli (0.5 :: Double) :: IO Double
  print x

after:

import qualified System.Random.Stateful as Random.Stateful
import qualified Data.Random as Random
import qualified Data.Random.Distribution.Bernoulli as Bernoulli
import qualified Control.Monad.Reader as Reader

main :: IO ()
main = do
  stdgen <- Random.Stateful.newIOGenM =<< Random.Stateful.newStdGen
  x <- Reader.runReaderT (Random.sample $ Bernoulli.Bernoulli (0.5 :: Double)) stdgen
        :: IO Double
  print x

Is this the most minimal example with random-fu 0.3? I am also not sure if the examples are equivalent.

They look equivalent to me.

It might be possible to create and export helper functions from random-fu so that one doesn't have to import mtl and random. Maybe you can try and then create a PR?

I am going to use your example in the documentation. I hope you don't mind.

spearman commented 1 year ago

You can use the example in the documentation. I'm not sure that I will have time to submit a PR in the near future, if anyone else wants to that would be great.