pharo-project / pharo

Pharo is a dynamic reflective pure object-oriented language supporting live programming inspired by Smalltalk.
http://pharo.org
Other
1.19k stars 353 forks source link

atRandom & integer agnosticism #2484

Open Ducasse opened 5 years ago

Ducasse commented 5 years ago

Hello community.

Kindly consider:

| n i r |
n := ( 2 ** 256 ) - 1 .  "the largest 256-bit value"
i := ( 0 to: n ) .  "the interval 0 to 2**256-1"
"Produce three sets of computed test results."
3 timesRepeat: [
   Transcript show: ( r := i atRandom ) ; cr .
   Transcript show: ( r asByteArray hex ) ; cr . ]

According to comments in Pharo library code, atRandom fails for integer magnitudes greater than 2**30 . A reason for this limitation is not given, and it's not clear to a novice such as myself why the computed result of atRandom is permitted to sometimes be in conflict with the semantics of atRandom .

Can anyone provide a basic explanation of the rationale for a 'broken' atRandom ?

Alternatively, is this a known bug?

Thanks. dr

Ducasse commented 5 years ago

Here is the answer of NICE

Try it in Squeak and pick the relevant methods. Open a bug report and commit the fix. Of course, original authorship will somehow be spoiled, so be kind and cite them in commit message.

svenvc commented 5 years ago

Why go via Interval ?

Why not just

(2 ** 256) atRandom.

?

IM(H)O Interval as it currently stands is a mess, it is an endless source of bug reports for years now. Because it is not clear what it is and is not. People use them with all kinds of numbers (incl inexact floats) and expect all collection methods to work.

On 11 Feb 2019, at 12:13, StéphaneDucasse notifications@github.com wrote:

Hello community.

Kindly consider:

| n i r | n := ( 2 256 ) - 1 . "the largest 256-bit value" i := ( 0 to: n ) . "the interval 0 to 2256-1" "Produce three sets of computed test results." 3 timesRepeat: [ Transcript show: ( r := i atRandom ) ; cr . Transcript show: ( r asByteArray hex ) ; cr . ]

According to comments in Pharo library code, atRandom fails for integer magnitudes greater than 2**30 . A reason for this limitation is not given, and it's not clear to a novice such as myself why the computed result of atRandom is permitted to sometimes be in conflict with the semantics of atRandom .

Can anyone provide a basic explanation of the rationale for a 'broken' atRandom ?

Alternatively, is this a known bug?

Thanks. dr

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.

-- Sven Van Caekenberghe - mailto:sven@beta9.be Beta Nine - software engineering - http://www.beta9.be

Ducasse commented 5 years ago

The "random" index for atRandom is, basically, computed prngOutput mod: selectionRange.

For large selection ranges, it fails because the period of the PRNG used by atRandom is 2^30th, with the output an integer in the range 1 - 2^30th. (in other words, the prng is a repeating sequence of every number between 1 and 2^30th)

That means:

The reason is techical; the default RNG is a reasonable compromise between speed, memory, and usable range. If the intended use requires a larger range, an appropriate random number generator can be provided using atRandom: It certainly could give an error and/or select another RNG automatically if outside applicable range, but no one has been bothered enough to do either, yet.

Cheers, Henry

Ducasse commented 5 years ago

Thanks for this link. Note that there are quite good classical PRNG already programmed for Pharo in package Polymath including a Mersenne twister and Marsaglia KISS as used in gfortran... (I did not check recently but there should be a catalog entry for Polymath). The one programmed in Squeak is a Mersenne Twister I think, carefully written for maximizing performance for 32bits VM (with 31bit SmallIntegers).

IMO, the Park-Miller PRNG was a good default choice in the 70s when it was introduced in Smalltalk. But now, we have fast enough machines to not bother with its limitations.

Le lun. 11 févr. 2019 à 15:58, David Richards < david.i.richards.iii at gmail.com> a écrit :

Thanks Nicolas and Henry,

For now, simple alternative expressions can easily generate the values I need for my project, such as:

(String new: 32) collect: [ :each | '0123456789abcdef' atRandom ]

Maybe in a few months (or years!) from now I will have acquired enough ambient familiarity with Pharo to attempt a fix. But, for the time being, I'll just avoid stepping on the landmine.

In doing some research on this problem I spotted this, which seemed to possibly point a way toward a cutting edge RNG implementation for Pharo:

http://xoshiro.di.unimi.it/#intro

[image: image.png]

bstjean commented 5 years ago

If all you need is a 64-bit PRNG, then SplitMix64 is a good candidate both for speed (uses bit shifting, xoring and multiplication only) and it's very robust (statistically speaking).

http://xorshift.di.unimi.it/splitmix64.c

Ducasse commented 1 month ago

In Pharo 12 I got

33665478090738534551265947698821085861731431479409397989099502699963737964544
4a6dfc4000000000000000000000000000000000000000000000000000000000
56737219946914414999965920440123632085542434921225913774821309228306904645632
7d7020e400000000000000000000000000000000000000000000000000000000
9899038960990396535394384800841929201551291814850384676665448694206894702592
15e2a85200000000000000000000000000000000000000000000000000000000
Ducasse commented 1 month ago

In P9

61525305964451392265031629257731411204474240315985566218943821361314187444224
880616d1100c3000000000000000000000000000000000000000000000000000
32460455300860748330048361511611026813408964515496646286299724575483742912512
47c3f7748f87f000000000000000000000000000000000000000000000000000
66339844569971116430435079087868031799733900749859557373570528374113358053376
92ab057b25560800000000000000000000000000000000000000000000000000