haskell / random

Random number library

Add an API to serialise/deserialise to/from disk #123

Open adinapoli opened 2 years ago

adinapoli commented 2 years ago

Up until random-1.1, a bare-bones (and potentially brittle) way to serialise and deserialise a StdGen was to use show and read. However, random-1.2.0 removed the Read instance (for good reasons), so this is no longer possible. As a consequence, writing something like a Serialize instance for StdGen is not possible at the moment.

Technically speaking, one can work around this by importing the .Internal module and using seedSMGen' and unseedSMGen on the underlying SMGen, but it feels wrong to use the Internal module and to rely on the concrete implementation of StdGen.

In a nutshell, it would be nice to have two functions as part of the API similar to the following:

toSeedGamma :: StdGen -> (Word64, Word64)
fromSeedGamma :: (Word64, Word64) -> StdGen
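
For reference, the workaround described above can be sketched roughly as follows. This is only an illustration, assuming random-1.2 (where System.Random.Internal exports the StdGen constructor) together with the splitmix package; toSeedGamma and fromSeedGamma are the names proposed in this issue, not functions that exist in the library.

```haskell
import Data.Word (Word64)
import System.Random.Internal (StdGen (..))
import qualified System.Random.SplitMix as SM

-- | Extract the SplitMix seed and gamma from a 'StdGen'.
-- Relies on the internal representation of 'StdGen' (an 'SM.SMGen' wrapper).
toSeedGamma :: StdGen -> (Word64, Word64)
toSeedGamma (StdGen smGen) = SM.unseedSMGen smGen

-- | Rebuild a 'StdGen' from a previously extracted seed and gamma.
-- Note that SplitMix forces the gamma to be odd when seeding.
fromSeedGamma :: (Word64, Word64) -> StdGen
fromSeedGamma = StdGen . SM.seedSMGen'
```

Because it pattern matches on the StdGen constructor, this sketch breaks as soon as the internal representation changes, which is exactly the concern raised here.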

Thanks!

lehins commented 2 years ago

This is a reasonable request and in fact I wanted it myself for some time. However, I think it would be better to provide a general interface for this, in case other PRNGs need such functionality. Maybe something along the lines of:

class RandomGen g where
  data Seed g :: Type
  toSeed :: g -> Seed g
  fromSeed :: Seed g -> g
  ...

This would allow the StdGen implementation to be:

instance RandomGen StdGen where
  data Seed StdGen = StdGenSeed !Word64 !Word64
  toSeed (StdGen smGen) = uncurry StdGenSeed $ SM.unseedSMGen smGen
  fromSeed (StdGenSeed seed gamma) = StdGen $ SM.seedSMGen seed gamma
  ...

Thoughts?

adinapoli commented 2 years ago

Yes, I think that something along those lines should work, thanks!

adamgundry commented 2 years ago

What do you actually gain from adding this to RandomGen, i.e. what is the conceptual difference between g and Seed g? I suppose it would make it possible to write code that is polymorphic in e.g. a Serialize (Seed g) constraint for some Serialize class, but you could just as well use Serialize g directly.

EDIT: I suppose it could be useful if instead of an associated data family, toSeed/fromSeed actually converted to some kind of primitive representation (e.g. some flavour of byte string).
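
To illustrate what a "primitive representation" could look like for the seed and gamma pair discussed earlier, here is a minimal sketch that packs two Word64 values into a 16-byte ShortByteString. The function names and the big-endian layout are assumptions made for this example; nothing here is part of the random API.

```haskell
import Data.Bits (shiftL, shiftR, (.|.))
import qualified Data.ByteString.Short as SBS
import Data.Word (Word64, Word8)

-- | Big-endian encoding of a single 'Word64' as 8 bytes (hypothetical helper).
word64ToBytes :: Word64 -> [Word8]
word64ToBytes w = [fromIntegral (w `shiftR` s) | s <- [56, 48 .. 0]]

-- | Inverse of 'word64ToBytes' for a list of 8 bytes.
bytesToWord64 :: [Word8] -> Word64
bytesToWord64 = foldl (\acc b -> (acc `shiftL` 8) .|. fromIntegral b) 0

-- | Pack a (seed, gamma) pair into exactly 16 bytes.
encodeSeedGamma :: (Word64, Word64) -> SBS.ShortByteString
encodeSeedGamma (seed, gamma) =
  SBS.pack (word64ToBytes seed ++ word64ToBytes gamma)

-- | Unpack a (seed, gamma) pair, rejecting input of the wrong length.
decodeSeedGamma :: SBS.ShortByteString -> Maybe (Word64, Word64)
decodeSeedGamma sbs
  | SBS.length sbs /= 16 = Nothing
  | otherwise =
      let (hi, lo) = splitAt 8 (SBS.unpack sbs)
       in Just (bytesToWord64 hi, bytesToWord64 lo)
```

A byte-level form like this is something a serialization library can store without knowing anything about the generator, but the length check only happens at runtime.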

lehins commented 2 years ago

@adamgundry You are right, the suggested interface is not powerful enough to be useful.

It would be nice to have a general ability to initialize any PRNG from a seed, say if we were to provide some way to draw entropy from the system in the form of a ByteString. For that we would need information from g on how many bytes it needs.

Another thing we'd like, as this ticket suggests, is the ability to serialize any g to a file and back. However, asking for just a string of bytes is not sufficient in my book; we need some type safety. How about this as an addition to the interface:

newtype BytesN (n :: Nat) = BytesN ShortByteString

toBytesN :: forall n. KnownNat n => ShortByteString -> Maybe (BytesN n)
fromBytesN :: BytesN n -> ShortByteString

class RandomGen g where
  type SeedSize g :: Nat
  toSeed :: g -> BytesN (SeedSize g)
  fromSeed :: BytesN (SeedSize g) -> g
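
As a rough sketch of how the BytesN part of this proposal could be implemented, the smart constructor below compares the runtime length of the input against the type-level size. It assumes only base and bytestring; the deriving clause and the use of natVal are choices made for this example, not something taken from the proposal or from the library.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString.Short as SBS
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- | A byte string whose length is tracked in its type.
newtype BytesN (n :: Nat) = BytesN SBS.ShortByteString
  deriving (Eq, Show)

-- | Accept the input only if it is exactly @n@ bytes long.
toBytesN :: forall n. KnownNat n => SBS.ShortByteString -> Maybe (BytesN n)
toBytesN sbs
  | toInteger (SBS.length sbs) == natVal (Proxy :: Proxy n) = Just (BytesN sbs)
  | otherwise = Nothing

-- | Forget the length information.
fromBytesN :: BytesN n -> SBS.ShortByteString
fromBytesN (BytesN sbs) = sbs
```

With this in place, a generator that needs 16 bytes would work with BytesN 16, and a value of the wrong length is rejected once, at the toBytesN boundary.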

This would allow serialization libraries to provide instances for any PRNG regardless of the underlying representation. But most importantly, it would allow us to make it opt-in and off by default, which keeps it backwards compatible:

class RandomGen g where
  type SeedSize g :: Nat
  type SeedSize g = TypeError (ShowType g :<>: Text " doesn't support saving seeds")
  toSeed :: g -> BytesN (SeedSize g)
  toSeed _ = error "Impossible: Not supported"
  fromSeed :: BytesN (SeedSize g) -> g
  fromSeed _ = error "Impossible: Not supported"

which would produce a type error on toSeed/fromSeed instead of some ugly runtime error.

lehins commented 7 months ago

For anyone interested in this functionality, there is an implementation in #162 that still lacks some tests but already works quite nicely. Here is an example function that shows the power of the new SeedGen interface:

https://github.com/haskell/random/blob/5fb946b05849f7f9c3a2c2ef9e92b48d07bd0590/src/System/Random/Seed.hs#L245-L251