libmir / mir

Mir (backports): Sparse tensors, Hoffman
http://mir.libmir.org
Boost Software License 1.0

rand!float() exponent problem #402

Closed: mlabayru closed this issue 5 years ago

mlabayru commented 5 years ago

Hi, after using mir.random for a long time, I noticed today that gen.rand!float() does return a floating-point number between -1.0 and 1.0, as documented. However, I assumed the exponent would range from e-23 to 0, whereas it appears to be fixed at 0. I think this should be either documented or fixed, because it greatly limits the range of random numbers generated. Thank you.

9il commented 5 years ago

Hi @mlabayru, I can't reproduce the bug.

Check this at run.dlang.org:

/+dub.sdl:
dependency "mir" version="~>3.2.0"
+/
import mir.random;
import mir.math.common;
import std.stdio: write;
void main()
{
    foreach(i; 0 .. 1000)
        rand!float.fabs.log2.ceil.write(" "); // prints different exponents
}

Please submit an example that has fixed exponent for you.

mlabayru commented 5 years ago

You are right. The problem I ran into comes from this distribution of exponents:

/+dub.sdl:
dependency "mir" version="~>3.2.0"
+/
import mir.random;
import mir.math.common;
import std.stdio: write;

void main()
{
    int[int] count_exp; // exponent -> number of samples with that exponent
    foreach(i; 0 .. 10000000)
        count_exp[cast(int)(rand!float().fabs.log2.ceil)]++;
    write(count_exp);
}

This prints:

[0:5000172, -1:2501446, -16:75, -14:316, -18:23, -25:1, -20:2, -7:38899, -8:19660, -22:1, -21:4, -3:624257, -12:1208, -17:35, -13:603, -15:154, -4:311889, -11:2399, -2:1250060, -9:9660, -6:78184, -5:156003, -23:1, -19:12, -10:4936]

As you can see, out of 10000000 samples more than 5000000 have exponent 0, about half as many have exponent -1, and each subsequent exponent seems to have half the probability of the previous one.

In my program I was only taking a few samples, so almost all of them had exponent 0. In my particular case this is not optimal: I expected the same probability of getting any exponent (i.e. a random mantissa and a random exponent). Sorry for the inconvenience, but I think this should be documented, because it can cause problems in certain cases if you expect the behaviour I describe.
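A minimal sketch of the "random mantissa and random exponent" behaviour described above, for illustration only. `randUniformExponent` and its `minExp` parameter are hypothetical names, not part of mir.random. The sketch assumes the thread-local `rand!float()` and `rand!uint()` overloads and uses `frexp`/`ldexp` from Phobos `std.math`. The result is no longer uniformly distributed over (-1, 1); it is deliberately concentrated near zero.

```d
/+dub.sdl:
dependency "mir" version="~>3.2.0"
+/
import mir.random;
import std.math : frexp, ldexp, fabs, log2, ceil;
import std.stdio : writeln;

/// Hypothetical helper (not a mir.random API): returns a float in (-1, 1)
/// whose binary exponent is drawn uniformly from minExp .. 0 (inclusive).
float randUniformExponent(int minExp = -23)
{
    assert(minExp <= 0);
    int ignored;
    // Keep only the sign and mantissa of a uniform sample:
    // |frac| lies in [0.5, 1), or frac is 0 if the sample was 0.
    immutable float frac = frexp(rand!float(), ignored);
    // Pick the exponent with (almost) equal probability for each value.
    // Assumption: the slight modulo bias of `%` is acceptable here.
    immutable uint span = cast(uint)(1 - minExp);
    immutable int e = minExp + cast(int)(rand!uint() % span);
    return ldexp(frac, e);
}

void main()
{
    int[int] countExp; // exponent -> number of samples
    foreach (i; 0 .. 1_000_000)
    {
        immutable x = randUniformExponent();
        if (x == 0)
            continue; // log2(0) is -infinity, so skip exact zeros
        countExp[cast(int) x.fabs.log2.ceil]++;
    }
    writeln(countExp); // the counts should be roughly equal across exponents
}
```

Whether such a distribution is appropriate depends on the application; for a sample that is uniform over the interval, the behaviour of `rand!float()` itself is the expected one.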

9il commented 5 years ago

It is documented: http://docs.random.dlang.io/latest/mir_random.html#.rand.4

Uniformly distributed real for interval (-2^^boundExp , 2^^boundExp)

The exponent of a uniformly distributed variable follows a geometric distribution (with some caveats).
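A short derivation of why the exponent is geometric, written here as a sketch rather than taken from the mir documentation. Assume $X$ is uniform on $(-1, 1)$, so that $U = |X|$ is uniform on $(0, 1)$, and let $E = \lceil \log_2 U \rceil$ be the quantity counted in the histogram above. The symbols $X$, $U$, $E$, $k$ and $N$ are introduced only for this illustration.

$$
P(E = -k) = P\left(2^{-(k+1)} < U \le 2^{-k}\right) = 2^{-k} - 2^{-(k+1)} = 2^{-(k+1)}, \qquad k = 0, 1, 2, \dots
$$

Each exponent is therefore half as likely as the one above it. For $N = 10^7$ samples the expected counts $N \cdot 2^{-(k+1)}$ are 5000000, 2500000, 1250000 and 625000 for exponents 0, -1, -2 and -3, which agrees closely with the observed 5000172, 2501446, 1250060 and 624257. One likely caveat is that a `float` has finite precision, so the tail of the distribution is cut off at very small exponents.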