`fromListMay` and its relatives might theoretically error due to rounding errors

Flupp commented 3 months ago

fromListMay takes Rationals as input but uses Double for drawing a random number. This may theoretically result in errors. For example consider drawing weighted from [('a', w)] where w = let i = 55 in (2^i - 1) / 2^i. The (single) weight w is nearly one and when converted to Double it is rounded up to exactly one. The Double value is used as an upper bound for drawing a random number. So, theoretically, exactly one could be drawn. For choosing from the list, the drawn number is converted back to Rational. The code ends up comparing w < 1, which is True. Because of that, head will be applied to an empty list.

Note: The problem also occurs with less crazy weights, e.g., 1 / 5: let x = 1 / 5 in x < toRational (fromRational x :: Double) is True.

You can easily show that the error might actually occur by replacing the random draw by returning the upper bound:

fromListMay' :: (MonadRandom m) => [(a, Rational)] -> m (Maybe a)
fromListMay' xs = do
  let s    = fromRational (sum (map snd xs)) :: Double
      cums = scanl1 (\ ~(_,q) ~(y,s') -> (y, s'+q)) xs
  case s <= 0 of
    True -> return Nothing
    _    -> do
      p <- liftM toRational $ return s  -- <- hard-wired draw result
      return . Just . fst . head . dropWhile ((< p) . snd) $ cums

-- >>> evalRand (fromListMay' [('a', let i = 55 in (2^i - 1) / 2^i)]) (mkStdGen 0)
-- Prelude.head: empty list

-- >>> evalRand (fromListMay' [('a', 1 / 5)]) (mkStdGen 0)
-- Prelude.head: empty list

-- >>> evalRand (fromListMay' [('a', 1 / 10), ('b', 1 / 10)]) (mkStdGen 0)
-- Prelude.head: empty list

-- Just to show that it might also work:
-- >>> evalRand (fromListMay' [('a', 1)]) (mkStdGen 0)
-- Just 'a'

Even worse: Considering the “Floating point number caveats” section in the random package, even values out of the bounds might be drawn.

All in all, the above problem might cause a program using fromListMay or its friends to crash literally randomly.

Unfortunately I have no clean idea to resolve the problem. The only idea I have is to redraw in case of a value out of bounds being drawn. Then, however, I fear there are edge cases that might result in endless loops.

byorgey commented 3 months ago

Thanks for the report! Indeed, I'm not sure why those functions have an interface in terms of Rational but then convert to Double internally... seems like a recipe for trouble.

Another option could be to change those functions' types to use Double instead of Rational which would at least be more honest about the kind of precision available. But that would be a breaking change, and it wouldn't completely solve the issue.

Flupp commented 3 months ago

I have two further ideas:

One could simply clamp the drawn number to the Rational bounds. Of course, this slightly skews the distribution, but for the pure rounding error problem, this is probably practically irrelevant; it’s basically yet another rounding error. :D For the second problem, where the drawn number might be out of the Double bounds, the skewing might be larger, however, there the distribution is probably questionable in the first place anyways.
Instead of using tricky bounds for getRandomR, one could normalize the Rationals by dividing each of them by the sum. Considering the mentioned floating point caveats (and after a quick look in the source code), I think getRandomR should behave benign when simply using (0, 1) as bounds. (Unfortunately, AFAICS, there is no equivalent for uniformDouble01M in MonadRandom.)

Regarding switching to Double in the first place: Besides being an interface change, which is inherently unfavorable, I agree that changing the interface type to Double would probably be cleanest. Note that it is then important that the summing for s and the summing for cums happens in the same order, otherwise, due to intermediate rounding errors, one might again end up with different values for s and the last element of cums. However, I guess this is probably the case although the order of summing by sum is not defined in its documentation.

Flupp commented 2 months ago

… yet another idea:

Do not draw a Double in the first place. Internally, getRandomR uses uniformDouble01M, which in turn uses uniformWord64, which (AFAICS) is also what is used when drawing a Word64 using getRandom. Hence, one could immediately draw this Word64 and use this to create the Rational:
```
 let s    = sum (map snd xs)
```
```
     w <- getRandom
     let p = s * toRational (w :: Word64) / toRational (maxBound :: Word64)
```
This would also guarantee staying withing the bounds without the quirks related to getRandomR for Double.

Of course, still, the interface suggests more precision (in fact, arbitrary precision) while internally only 64 bits are used. Hence, a value with a weight smaller than 1 / (maxBound :: Word64) might never be drawn by fromListMay. But this is virtually impossible anyways.

byorgey commented 2 months ago

@Flupp what do you think of https://github.com/byorgey/MonadRandom/commit/daa98f636bb677aa2550ab6321f10ab69c7a6066 ?

Flupp commented 2 months ago

Looks good to me.

Note: I do not know about your versioning scheme, but note that, given some internal PRNG state, this might change the result of fromListMay' and functions based on that. The successor PRNG state does probably not change since we draw the same Word64 like the previous solution did internally, but I am not 100%.

byorgey commented 2 months ago

Hmm, that's a good point. I will try to do a few spot checks to see whether it drastically changes the behavior of fromListMay' or not. In theory it should not change much; assuming that is correct, in this case I think it's worth just doing a minor version bump (avoiding all the ecosystem churn attendant on a major bump) even though technically a major version bump would be required if the output of the function has changed for some inputs.

byorgey commented 2 months ago

Took me a minute to track this down, but it turns out that random Doubles are chosen by calculating u / maxBound where u is a uniform Word64, then subtracting from 1! So at first, the values generated by fromListMay and friends were completely different than the old ones (even though the ending generator state was still the same). But once I figured that out it was an easy fix; the new fromListMay now generates exactly the same values and updated generator state as the previous versions, for all the cases I tried. In theory, the new implementation of fromListMay and friends could still generate a different output than before in certain cases, but the probability of hitting such a case seems vanishingly small.

Flupp commented 2 months ago

Yeah, sorry, I forgot to mention the “subtract from 1” thing. I already stumbled upon this myself, and this actually has its own issues (but only when using floating point; see haskell/random#166).

Anyways, I am not sure if doing this change in a minor version is a good idea. There might be people relying on generating some fixed data from a fixed seed for the PRNG. Problems might be especially subtle or might remain unnoticed for a long time if a different behavior only occurs in some very rare cases.

In any case, I suggest a release note about this.

Flupp commented 3 weeks ago

I learned that there is some discussion about the versioning.

Let me propose another option: Doing two different changes:

A minor version for just fixing the potential crash without changing any other behavior.
A major version for the proper fix.

I think the first could easily be achieved by simply detecting the error case and returning the last element then (which was just skipped because of rounding issues):

      p <- liftM toRational $ getRandomR (0, s)
      return . Just . fst $ case dropWhile ((< p) . snd) cums of
        x : _ -> x
        [] -> last xs

(Code untested; it’s just a sketch.)

For the second one, you do not have to consider backwards compatibility then. You don’t even need to do the “subtracting from 1“ (see https://github.com/byorgey/MonadRandom/issues/53#issuecomment-2294862625); this might then even help users to notice that actually something changed, so they cannot accidentally assume that the distribution did not change, because it changes only ever so slightly when doing the “subtracting from 1“. Also, you can delay this change to collect some more breaking changes before bumping a major version.

byorgey commented 2 weeks ago

Released MonadRandom-0.6.1 to Hackage with this fix.

byorgey / MonadRandom

`fromListMay` and its relatives might theoretically error due to rounding errors #53