haskell / random

Random number library
Other
53 stars 50 forks source link

Instances for `Ratio` #75

Open lehins opened 4 years ago

lehins commented 4 years ago

There is currently no instances for Ratio data type that can give us random values

I can see a reasonable Uniform instance for integral types, which excludes the zero denominator:

instance (Integral a, Uniform a) => Uniform (Ratio a) where
  uniformM g = do
    n <- uniformM g
    let notZero = do
          d <- uniformM g
          if d == 0 then notZero else pure d
    (n %) <$> notZero

An efficient version for UniformRange needs some thought.

Note that neither Uniform, nor UniformRange would be able to generate Rational. This means that we can attempt and cook up such instance for Random

Shimuuar commented 4 years ago

It isn't uniform. Let take for sake of simpliciti Ratio Word3. In this case we will generate with equal probability each value in the list: liftA2 (%) [0..7] [1..7]. They cover range [0, 7] but only 5 values are in [3.5 .. 7] range.

lehins commented 4 years ago

@Shimuuar It is uniform as far as the Ratio Word3 is concerned. It is just as uniform as selecting uniformly from data RGB = Red | Green | Blue data type with equal probability. We generate all possible values that a data type can have with uniform probability.

I understand that with respect to the actual rational numbers over the infinite space, this is not uniform, but the type is not infinite. That means we need to decide. Is Uniform about math and real values, or it is about the uniformity of types? I am fine with either choice, by the way, because we do still have Random class that allows us to deal with this issue.

Shimuuar commented 4 years ago

When it come to numbers uniform usually understood as distributed on number line with uniform density. For Ints & Words we work on uniform grid so notions of uniform density and generate every inhabitant with equal probability coincide. For floats we try to approximate density and abandon every inhabitant with equal probability.

But this instance is not uniform in either sense! It generates 1 7 times out of 56, and 7 only 1 of 56.

Bodigrim commented 4 years ago

The problem is that Ratio Word3 has not 8*(8-1)=56, but only 36 inhabitants: 0 and 1 appear 7 times of 56, 1/2 and 2 - thrice; 1/3, 2/3, 3/2 and 3 - twice. It is certainly non-Uniform inhabitants-wise.

lehins commented 4 years ago

I see what you mean. So my suggested implementation is bogus. That's fine. Question is can we come up with a sensible implementation(s) for the Ratio type?

Bodigrim commented 4 years ago

For Ratio Word one can generate n and d; if gcd n d == 1 then return n % d, otherwise reject them and generate new ones. But things may become more complicated for Ratio Int because of signs and minBound /= negate maxBound, need to think more about it.

In my experience Ratio Int / Ratio Word are of limited utility, because it is too easy to get an overflow in denominator.

Shimuuar commented 4 years ago

Also I think that approach "generate every inhabitant with equal probability" is just a giant footgun. I don't think anyone would expect such semantics.

lehins commented 4 years ago

I don't think anyone would expect such semantics.

2 % 4 and 1 % 2 are equal inhabitants as far as Ratio is concerned, because values are normalized. So here we are really talking about generating all valid inhabitants with equal probability.

@Shimuuar What would you expect if someone asked you: Generate a random rational number for me please?

lehins commented 4 years ago

@Bodigrim I remember helping out a friend with this and he came up with this validation for Ratio: https://github.com/NorfairKing/validity/blob/cfe18e509a7c3298a15751c5889254cf0650cbc2/validity/src/Data/Validity.hs#L500-L508

CC @NorfairKing

NorfairKing commented 4 years ago

@lehins For the record, this is how genvalidity generates Ratios: https://github.com/NorfairKing/validity/blob/cfe18e509a7c3298a15751c5889254cf0650cbc2/genvalidity/src/Data/GenValidity.hs#L718-L728

    genValid = (do
      n <- genValid
      d <- (genValid `suchThat` (> 0))
      pure $ n :% d) `suchThat` isValid

You can probably generate the denominator more easily if you assume Num and use abs.

Shimuuar commented 4 years ago

@Shimuuar What would you expect if someone asked you: Generate a random rational number for me please?

Well I certainly don't expect such questions. But it's impossible to speak about generating random numbers without specifying distribution first. So we either know what we're talking about or person asking such question deserve some ridicule

But then if we have Uniform why nor UniformRange? How UniformRange should work then? It it uses same approach: "generate every inhabitant with equal probability" why then it works differently from Float/Double. They're just a different subset of rationals after all.

qnikst commented 4 years ago

FWIW if I'd ask a library to generate random Ratio in the place where I don't care much about distribution, I would expect that it has a uniform distribution over requested range, and I would consider non-uniform distribution as a bug.

lehins commented 4 years ago

Well I certainly don't expect such questions.

Lol. I am asking you one right now ;) Basically how would you solve it? We need uniform distribution for Ratio data type

Shimuuar commented 4 years ago

"Don't try to be clever just use doubles". It's question about building unform distributions between two really big numbers (0, maxBound :: Word64) for example. It's very hard problem since it requiring carefully weighting numbers on very uneven grid. I'm not sure it's even possible without excessive lookup tables

lehins commented 4 years ago

"Don't try to be clever just use doubles"

lol. I am already submitting a PR to ghc to remove Ratio from base :D

It is a hard problem, but I don't think it is an unsolvable one. We definitely don't have to decide on it now, so let's think about it and see what we can come up with.

Shimuuar commented 4 years ago

lol. I am already submitting a PR to ghc to remove Ratio from base :D

I'm quite serious BTW. Generating doubles, casting them to Rationals and working with them in case you need to work with them is quite viable approach. Another one is to generate uniformM % maxBound but then one have to convert to Ratio Integer since basically any operation could overflow denominator.

Bodigrim commented 4 years ago

Ratio a, when a has finite precision, is a very dangerous and unlawful beast. For example, (-11 % 12 :: Ratio Int8) > 13 % 12. I would rather discourage people from using such types, and I bear no interest in providing utilities to work with them.

On the other hand, if we decide to give a green light to new instances of Random, we can define instance Random Rational just applying toRational to Doubles in [0, 1].

lehins commented 4 years ago

Ratio a, when a has finite precision, is a very dangerous and unlawful beast.

Sounds reasonable.

On the other hand, if we decide to give a green light to new instances of Random, we can define instance Random Rational just applying toRational to Doubles in [0, 1].

I like it.

NorfairKing commented 4 years ago

On the other hand, if we decide to give a green light to new instances of Random, we can define instance Random Rational just applying toRational to Doubles in [0, 1].

This would be inappropriate in testing, but there's no reason why random generators should be used for testing per-se. As long as this is documented well, I don't really mind, but we might as well just have some functions then, instead of an instance.

Shimuuar commented 4 years ago

On the other hand, if we decide to give a green light to new instances of Random, we can define instance Random Rational just applying toRational to Doubles in [0, 1].

That sounds as reasonable choice for Random instance. BTW it's not necessary to involve Doubles we can just use:

n <- (fromIntegral :: Word64 -> Integer) <$> unform
return $ n % (2^64-1)