python / cpython

The Python programming language
https://www.python.org

random.randrange() biased output #68162

Closed 0d49e77c-02f3-4595-8d98-ee8fd7001a64 closed 9 years ago

0d49e77c-02f3-4595-8d98-ee8fd7001a64 commented 9 years ago
BPO 23974
Nosy @smontanaro, @rhettinger, @mdickinson, @tiran, @serhiy-storchaka
Files
  • issue23974.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    ```python
    assignee = None
    closed_at =
    created_at =
    labels = ['type-bug', 'library']
    title = 'random.randrange() biased output'
    updated_at =
    user = 'https://bugs.python.org/gurnec'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'gurnec'
    assignee = 'none'
    closed = True
    closed_date =
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation =
    creator = 'gurnec'
    dependencies = []
    files = ['39845']
    hgrepos = []
    issue_num = 23974
    keywords = ['patch']
    message_count = 11.0
    messages = ['241261', '241266', '241268', '241315', '241322', '241696', '241863', '241866', '246091', '246130', '246179']
    nosy_count = 6.0
    nosy_names = ['skip.montanaro', 'rhettinger', 'mark.dickinson', 'christian.heimes', 'serhiy.storchaka', 'gurnec']
    pr_nums = []
    priority = 'normal'
    resolution = 'wont fix'
    stage = 'needs patch'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue23974'
    versions = ['Python 2.7']
    ```

    0d49e77c-02f3-4595-8d98-ee8fd7001a64 commented 9 years ago

    Due to an optimization in random.randrange() present only in Python 2, the output becomes noticeably biased as the stop-start range approaches 2^53. This bug also affects random.SystemRandom.

    For example, counting the number of even ints in a set of 10^6 random ints each in the range [0, 2^52 * 2/3) produces normal results; about half are even:

    >>> sum(randrange(2**52 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
    0.499932

    Change the range to [0, 2^53 * 2/3), and you get a degenerate case where evens are two times more likely to occur than odds:

    >>> sum(randrange(2**53 * 2//3) % 2 for i in xrange(1000000)) / 1000000.0
    0.333339

    The issue occurs in three places inside randrange(), here's one:

    if istart >= _maxwidth:
        return self._randbelow(istart)
    return _int(self.random() * istart)

    _maxwidth is the max size of a double where every digit to the left of the decimal point can still be represented w/o loss of precision (2^53, where a double has 53 mantissa bits). With istart >= _maxwidth, _randbelow() behaves correctly. With istart < _maxwidth, the rounding error in random() * istart begins to cause problems as istart approaches 2^53.
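    Both code paths can be imitated in current Python 3 to reproduce the numbers from the report (a sketch; `int(random.random() * n)` stands in for the old Python 2 path, while `random.randrange(n)` already uses rejection sampling in Python 3):

```python
import random

# Imitate the biased Python 2 code path, int(random() * n), alongside the
# rejection-sampling path Python 3's randrange() uses.  random() carries
# only 53 bits of entropy, so for n = 2**53 * 2//3 each even output value
# ends up with two preimages while each odd one gets only one.
n = 2**53 * 2 // 3
samples = 200_000

biased_evens = sum(int(random.random() * n) % 2 == 0 for _ in range(samples))
fair_evens = sum(random.randrange(n) % 2 == 0 for _ in range(samples))

print(biased_evens / samples)  # close to 2/3: evens twice as likely
print(fair_evens / samples)    # close to 1/2: no detectable bias
```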

    Changing _maxwidth to be significantly smaller should (practically speaking, anyway) fix this, although I'm open to suggestions on just how conservatively it should be set.

    smontanaro commented 9 years ago

    I'm completely unqualified to offer a concrete, expert opinion here, but it seems like defaulting _maxwidth to 1L<<52 should work. I have no idea of the resulting performance implications though.

    >>> for bpf in (45, 46, 47, 48, 49, 50, 51, 52, 53):
    ...   print bpf,
    ...   print sum(randrange(2**53 * 2//3, _maxwidth=1L<<bpf) % 2 for i in xrange(1000000)) / 1000000.0,
    ...   print sum(randrange(2**52 * 2//3, _maxwidth=1L<<bpf) % 2 for i in xrange(1000000)) / 1000000.0
    ... 
    45 0.499959 0.500144
    46 0.501622 0.500371
    47 0.500257 0.499692
    48 0.499567 0.499942
    49 0.499789 0.50028
    50 0.499882 0.500488
    51 0.500479 0.50006
    52 0.500553 0.500234
    53 0.33275 0.500296
    0d49e77c-02f3-4595-8d98-ee8fd7001a64 commented 9 years ago

    I shouldn't have called this a rounding error issue, that's not really what it is.

    A smaller example might help.

    If I'm given a random int, x, in the range [0, 12), and asked to produce from it a random int, y, in the range [0, 8), I've got (at least?) two choices:

    1. y = x if x < 8, else fail
    2. y = f(x), where f maps values from [0, 12) -> [0, 8)

    The problem with method 2 is you end up with a mapping like this:

        0,1  -> 0
        2    -> 1
        3,4  -> 2
        5    -> 3
        6,7  -> 4
        8    -> 5
        9,10 -> 6
        11   -> 7

    _randbelow() uses method 1 above. _int(self.random() * istart) is more like method 2.
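    Using floor division as a concrete choice of f (my choice for illustration; it has the same shape as int(self.random() * istart)), the lopsided mapping is easy to tabulate:

```python
from collections import Counter

# Method 2 for the toy case: map x in [0, 12) down to y in [0, 8) by
# scaling, the integer analogue of int(self.random() * istart).
counts = Counter((x * 8) // 12 for x in range(12))

for y in sorted(counts):
    print(y, counts[y])
# Even values of y receive two preimages each, odd values only one,
# so evens come out twice as often.
```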

    I chose 2^53 * 2/3 just because the bias was easy to demonstrate. There will always be some bias when 2^53 % (stop - start) != 0, but it might not manifest itself as easily as checking for evenness.

    Personally, I think 2^52 is still way too high as a cutoff point for using the (presumably faster, I didn't timeit) method 2, but I don't claim to be an expert here....

    mdickinson commented 9 years ago

    See also bpo-9025, which contains a fix for this exact problem in Python 3.

    mdickinson commented 9 years ago

    In the bpo-9025 discussion, reproducibility was a key concern. Though I note that despite the comments there, we *still* have no documented guarantees of reproducibility, so maybe it's safe to go ahead and change this. Raymond?

    IMO, the fix from bpo-9025 should be backported. Note that that fix fixes the issue completely: all outputs will occur with equal probability. That's under the unrealistic assumption of a perfect source of random bits/words, of course; but the key point is that randrange shouldn't introduce any new biases that aggravate existing ones in the source generator. Reducing _maxwidth would just reduce the size of the randrange bias without eliminating it completely: if we're going to make any change at all to the source, we should fix the problem once and for all.

    Another option that avoids breaking reproducibility would be to note the bias in the docs, and provide a doc recipe for an unbiased randrange, for those who need it.
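    Such a recipe would boil down to rejection sampling over getrandbits(), the same idea as the bpo-9025 fix; a minimal sketch (the function name and structure here are mine, not taken from that patch):

```python
import random

def unbiased_randrange(n, getrandbits=random.getrandbits):
    """Return a uniformly distributed int in [0, n) by rejection sampling.

    Draws n.bit_length() random bits and redraws whenever the result
    falls outside [0, n), so every output retains exactly equal
    probability given an unbiased bit source.  Since 2**(k-1) < n,
    the expected number of draws is under 2.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    k = n.bit_length()          # smallest k with 2**k >= n
    r = getrandbits(k)
    while r >= n:               # reject out-of-range values and redraw
        r = getrandbits(k)
    return r
```

    Passing random.SystemRandom().getrandbits as the bit source would give an os.urandom-backed variant, essentially the per-class fix floated later in this thread.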

    rhettinger commented 9 years ago

    we *still* have no documented guarantees of reproducibility, so maybe it's safe to go ahead and change this. Raymond?

    It is documented here: https://docs.python.org/3/library/random.html#notes-on-reproducibility

    The idea is that algorithms (and the generated sequences) may change between minor releases, but not in micro releases. And random() itself is more restricted (guaranteed to be the same across minor releases as well). The policy was new in Python 3. It was a liberalization of the implied policy in Python 2, which was that we didn't change the sequences at all (except for flat-out brokenness).

    Accordingly, the bpo-9025 debiasing was intentionally not backported so we won't break reproducibility and adversely affect performance of existing code.

    We could add a note as Mark suggests, but please keep it terse and affirmatively worded (perhaps something like "See also: Recipe 31xx for a way to eliminate rounding biases in randrange()"). The docs are not well served by being littered with danger signs and security warnings.

    tiran commented 9 years ago

    IMO it's not a security issue at all. If you have to care about security, you shouldn't use the random module at all. random.SystemRandom() merely uses a CPRNG as an entropy source. But it also manipulates numbers in ways that may or may not be safe.

    Only os.urandom() returns unmodified and unbiased random bytes -- iff the operating system provides a proper CPRNG.

    0d49e77c-02f3-4595-8d98-ee8fd7001a64 commented 9 years ago

    If you have to care about security, you shouldn't use the random module at all. random.SystemRandom() merely uses a CPRNG as an entropy source. But it also manipulates numbers in ways that may or may not be safe.

    I must respectfully disagree with this. The current docs say:

    Use os.urandom() or SystemRandom if you require a cryptographically secure pseudo-random number generator.

    That's a pretty strong statement, and IMO it would lead most to believe that SystemRandom along with *all* of its member functions is safe to use for cryptographic purposes[1] (assuming of course that os.urandom() is also a safe CSPRNG).

    As a compromise, perhaps SystemRandom could provide its own randrange() with the bpo-9025 fix, while keeping random.randrange() unmodified to preserve the implied same-sequence rule.

    [1] I don't mean to imply that this bias bug is necessarily a cryptographic safety issue -- it seems unlikely to me that it is one; however, not being a cryptographer myself, I'd rather not draw any conclusions either way, and instead I'd prefer to err on the side of safety.

    0d49e77c-02f3-4595-8d98-ee8fd7001a64 commented 9 years ago

    There's been no activity on this issue in a few months.... The three options as I see it are:

    1. Fix it for both randrange and SystemRandom.randrange, breaking randrange's implied stability between minor versions.
    2. Fix it only for SystemRandom.randrange.
    3. Close it as wont fix (for performance reasons I'd assume?).

    Since I'm in favor of option 2, I've attached a simple patch which implements it. Here are some quick-and-dirty performance numbers showing the decrease in performance (3 tests of the original code followed by 3 of the patched code):

    $ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**8' 's.randrange(r)'
    10000 loops, best of 10: 22.5 usec per loop
    $ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**31' 's.randrange(r)'
    10000 loops, best of 10: 22.6 usec per loop
    $ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**53 * 2//3' 's.randrange(r)'
    10000 loops, best of 10: 22.4 usec per loop
    
    $ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**8' 's.randrange(r)'
    10000 loops, best of 10: 23.7 usec per loop
    $ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**31' 's.randrange(r)'
    10000 loops, best of 10: 46.2 usec per loop
    $ python -m timeit -r 10 -s 'import random; s = random.SystemRandom(); r = 2**53 * 2//3' 's.randrange(r)'
    10000 loops, best of 10: 34.8 usec per loop

    The patch also includes a unit test (with a false negative rate of about 8.5 * 10^-8: http://www.wolframalpha.com/input/?i=probability+of+417+or+fewer+successes+in+1000+trials+with+p%3D0.5).
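    For what it's worth, that rate can be recomputed exactly as a binomial tail sum rather than via WolframAlpha (math.comb requires Python 3.8+; the 417-in-1000 threshold is the one from the linked query):

```python
from math import comb

# Exact P(X <= 417) for X ~ Binomial(1000, 0.5): the probability that a
# perfectly fair generator still trips the patch's evenness unit test.
p = sum(comb(1000, k) for k in range(418)) / 2**1000
print(f"{p:.2e}")  # on the order of 10**-7 or smaller
```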

    Any opinions on which of the three options should be taken?

    rhettinger commented 9 years ago

    I choose option 3, close as won't fix. The ship for 2.7 sailed a long time ago. The code for randrange() was designed by Tim Peters and has been in place for a decade and a half without causing suffering in the world. Also, changing the type to "behavior" from "security" -- the latter classification is quite a stretch.

    0d49e77c-02f3-4595-8d98-ee8fd7001a64 commented 9 years ago

    Option 3 of course wasn't my first choice (given how small the patch is and how minimal its potential negative impact), but it's certainly better than allowing an issue to linger in limbo.

    Thank you, all.