python / cpython

The Python programming language
https://www.python.org

random.jumpahead and PRNG sequence independence #54025

Closed 10983954-107f-4ff9-8350-7ae52ca42c4b closed 14 years ago

10983954-107f-4ff9-8350-7ae52ca42c4b commented 14 years ago
BPO 9816
Nosy @rhettinger
Files
  • random_test.py: Test case to find first agreement in PRNG.
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    ```python
    assignee = 'https://github.com/rhettinger'
    closed_at =
    created_at =
    labels = ['type-bug', 'library']
    title = 'random.jumpahead and PRNG sequence independence'
    updated_at =
    user = 'https://bugs.python.org/JosephSchaeffer'
    ```

    bugs.python.org fields:

    ```python
    activity =
    actor = 'cmcqueen1975'
    assignee = 'rhettinger'
    closed = True
    closed_date =
    closer = 'rhettinger'
    components = ['Library (Lib)']
    creation =
    creator = 'Joseph.Schaeffer'
    dependencies = []
    files = ['18816']
    hgrepos = []
    issue_num = 9816
    keywords = []
    message_count = 7.0
    messages = ['115985', '115990', '115998', '116066', '202802', '202803', '202804']
    nosy_count = 3.0
    nosy_names = ['rhettinger', 'cmcqueen1975', 'Joseph.Schaeffer']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue9816'
    versions = ['Python 2.7']
    ```

    10983954-107f-4ff9-8350-7ae52ca42c4b commented 14 years ago

    Reading the Python 2.6 docs, it appeared that using random.jumpahead would allow the initialization of several generators with the same seed but very different internal states. While the resulting PRNGs do appear to have different internal states, the random numbers they produce [via .random()] are exactly the same after a small initial segment.

    Attached is some example code which shows the first point at which they all agree - in my testing (Mac OS X, Python versions 2.5, 2.6, 2.7) the generated numbers all agreed on the 12th number generated. For smaller differences in jumpahead it was noticeable a lot earlier - n=1,2 differ only in the first sample from each.

    The internal state of the PRNGs is indeed different even after the successive sampling, so it may be that this is intended. If so, however, the docs may cause confusion. In my particular case I need random numbers for a stochastic Markov process, with many such generators [one for each trajectory], and I was hoping to use random.jumpahead to get independent PRNGs without having to generate [and prove] my own independent set of seeds. Having a long sequence of non-independent random numbers near the initial start condition therefore makes random.jumpahead unusable for my situation.

    It appears that Python 3.1 removed random.jumpahead - if so, it may be useful to note in the 2.6 docs why this was / the issues with random.jumpahead: reading how it changed after 2.3 made it sound like it was exactly what I wanted.

    Possible cause: I suspect the issue may be related to how a Mersenne Twister algorithm can take a while to recover from poor seeding (excessive 0's), but do not know enough to explore that idea.

    rhettinger commented 14 years ago

    Thanks for the report. Something does appear to be broken. When the states are different, the random numbers should be different. I am looking into it.

    In the meantime, I recommend against using jumpahead() with MT. It is better to separately seed three different generators and rely on the huge period of MT to keep the sequences from overlapping.

    If you do use jumpahead(), it is intended to be supplied with large values of n (not 1, 11, or 21).

    The function/method was removed in 3.x because it was an API defect. The jumpahead concept as originally intended (move ahead n-steps) was something that could really only work with a generator like Wichmann-Hill. Newer and more advanced generators aren't usually amenable to direct computation of a state that is n-steps forward.
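
To illustrate why a generator like Wichmann-Hill allows a direct n-step jump, here is a sketch that assumes the published Wichmann-Hill (1982) constants. Each component is a multiplicative congruential generator, so n steps collapse into one multiplication by a**n mod m. The function names are hypothetical and this is not the code of any Python release.

```python
# Wichmann-Hill combines three small multiplicative congruential generators.
MULTIPLIERS = (171, 172, 170)
MODULI = (30269, 30307, 30323)


def wh_step(state):
    """Advance the (x, y, z) state by exactly one step."""
    return tuple(a * s % m for a, s, m in zip(MULTIPLIERS, state, MODULI))


def wh_output(state):
    """Combine the three components into a float in [0.0, 1.0)."""
    return sum(s / m for s, m in zip(state, MODULI)) % 1.0


def wh_jumpahead(state, n):
    """Advance the state by n steps at once, using modular exponentiation."""
    return tuple(pow(a, n, m) * s % m
                 for a, s, m in zip(MULTIPLIERS, state, MODULI))


start = (1, 2, 3)  # arbitrary nonzero starting state, for illustration only
stepped = start
for _ in range(1000):
    stepped = wh_step(stepped)

# The direct jump lands on the same state as 1000 single steps.
print(wh_jumpahead(start, 1000) == stepped)
```

The Mersenne Twister update has no comparably simple closed form, which is the difficulty described above.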

    rhettinger commented 14 years ago

    I see the problem now. Random.jumpahead(n) does a very poor job of shuffling MT's state when n is small. The first few numbers of the state are different but some of the later ones are not. When random() crawls across parts of the state that are identical, it produces identical output. Later, when it has wrapped around, the random() calls diverge again.
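
The effect can be reproduced in Python 3 without jumpahead(). The sketch below does not reimplement jumpahead(); it assumes CPython's getstate()/setstate() layout (624 state words followed by an index) and simply builds two generators whose states differ only in the first 8 words. The helper name is hypothetical.

```python
import random

N = 624  # number of 32-bit words in the Mersenne Twister state

# CPython layout: (version, 624 words + (index,), gauss_next).
version, internal, gauss_next = random.Random(12345).getstate()
base_words = list(internal[:N])

# Flip every bit of the first 8 words; the other 616 words stay identical.
altered_words = [w ^ 0xFFFFFFFF for w in base_words[:8]] + base_words[8:]


def generator_from_words(words):
    """Build a generator from explicit state words (hypothetical helper)."""
    gen = random.Random()
    # index == N means the whole state is regenerated on the next draw.
    gen.setstate((version, tuple(words) + (N,), gauss_next))
    return gen


gen_a = generator_from_words(base_words)
gen_b = generator_from_words(altered_words)

seq_a = [gen_a.random() for _ in range(120)]
seq_b = [gen_b.random() for _ in range(120)]

matches = [x == y for x, y in zip(seq_a, seq_b)]
print("first matching draw:", matches.index(True))
print("matching draws among the first 120:", sum(matches))
```

Although the two internal states differ, the generators disagree only on the first few draws and then emit a long run of identical values, because most regenerated state words are computed from inputs that both generators share.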

    Fixed by salting the jumpahead value. See r84665.

    10983954-107f-4ff9-8350-7ae52ca42c4b commented 14 years ago

    Thanks for looking into it! I'm glad that issue will be fixed, as at least one website was actually recommending using .jumpahead(i) for i in 1..100 to get independent seeds.

    I suspect in my use case I'll want to continue my previous methods; I work with stochastic Markov processes and I need to seed a large number (10k+) of generators - one per trajectory - and also have the requirement of needing a deterministic PRNG. So having a single Mersenne Twister seed plus salting values that worked with .jumpahead would be a simpler representation; my previous code in C did basically that with a LCG to create those seeding values for the Mersenne Twister. So that's roughly equivalent [I think?] to the fixed random.jumpahead.
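
A sketch of the "single seed plus salting values" idea in Python 3. It swaps in a SHA-256 hash in place of the LCG mentioned above, and it is not the mechanism used by the jumpahead() fix. The names derive_seed and make_generators, the master seed 42, and the "seed:index" payload format are all assumptions made for illustration. The result is deterministic and gives every trajectory a distinct seed, but it does not by itself prove statistical independence.

```python
import hashlib
import random


def derive_seed(master_seed, trajectory_index):
    """Hash the master seed and a trajectory index into a 256-bit integer."""
    payload = f"{master_seed}:{trajectory_index}".encode("ascii")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big")


def make_generators(master_seed, count):
    """Create one separately seeded generator per trajectory."""
    return [random.Random(derive_seed(master_seed, i)) for i in range(count)]


# Hypothetical master seed; 10,000 trajectories as in the use case above.
trajectory_rngs = make_generators(master_seed=42, count=10_000)
first_draws = [rng.random() for rng in trajectory_rngs]
print(first_draws[:3])
```

Only the master seed and the trajectory index need to be stored to reproduce any single trajectory later.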

    Thanks again!

    f1e1d87a-4340-4801-8770-9e5b7119ec5a commented 11 years ago

    I notice that the C++11 library has a discard() member function for its random generators, which is effectively a jumpahead operation. It seems that the C++11 library has implemented discard() for the Mersenne Twister generator. If jumpahead() is technically possible for MT, can it be added back into the Python library?
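
For comparison, a discard-style helper can be sketched in Python 3 by drawing values and throwing them away. The C++ standard defines discard(z) as equivalent to z consecutive calls, so this matches its semantics, but it takes time proportional to n and is not a fast jump. The function name discard is borrowed from C++ and is not part of Python's random module.

```python
import random


def discard(rng, n):
    """Advance rng by n random() calls, throwing the values away (O(n) time)."""
    for _ in range(n):
        rng.random()


rng = random.Random(2024)        # hypothetical seed, for illustration only
reference = random.Random(2024)  # same seed, stepped by hand for comparison

discard(rng, 1000)
skipped = [reference.random() for _ in range(1000)]

# Both generators are now at the same position in the sequence.
print(rng.getstate() == reference.getstate())
```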

    f1e1d87a-4340-4801-8770-9e5b7119ec5a commented 11 years ago

    C++11 Mersenne Twister discard() member function: http://www.cplusplus.com/reference/random/mersenne_twister_engine/discard/

    f1e1d87a-4340-4801-8770-9e5b7119ec5a commented 11 years ago

    StackOverflow question about Mersenne Twister jumpahead: http://stackoverflow.com/q/4184478/60075

    which refers to this: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/JUMP/index.html