BrunoLevy / geogram

a programming library with geometric algorithms
Other
1.8k stars 122 forks source link

Use standard library functions for random number generator #125

Closed weynaa closed 7 months ago

weynaa commented 9 months ago

Hi,

First of all, great library and good work! I am using geogram mainly on Windows with MSVC and found an interesting edge case when computing RVDs on high polycount meshes. In contrast to GCC, the constant RAND_MAX is only 0x7fff on MSVC, which caused an unevenly distributed sampling to be generated in compute_initial_sampling.

Initial sampling with current code (input: Nefretiti with ~2M faces, nb_points=500,000 )

image

I replaced the rand()\srand() functions in numeric.cpp with a c++11 random engine and the distribution was much more even:

image

In some remeshing cases, this uneven initial sampling meant that I had to use a higher number of iterations or the solution did not converge.

If possible, could we replace the random generator functions with the ones from the C++ standard library, or at least with one that has higher entropy and is not toolset-specific? I am not sure what the minimum compiler support is for you.

Thank you and best regards,

Michael

ramcdona commented 8 months ago

Story time.... About a year or two ago (long before I knew of Geogram), I had a bug related to rand() / srand() in my own program interacting with a library.

As it turns out, the library (libxml2 I think) was calling srand( timer() ) during the first call into the library. This happened during the startup phase of my program -- where I had already called srand( timer() ) and started generating random numbers used to generate a series of unique ID's.

Well, 'sometimes' those two calls to srand() would happen within the resolution of timer() -- which would re-seed the RNG with the same seed and start the sequence over again. This in turn would cause my series of unique ID's to no longer be unique. This caused terrible Heisenbugs that were extremely challenging to chase down. I was pretty proud when I figured it all out.

As a user of a library, I was surprised and furious that the library would make a call to srand() -- thereby changing the shared standard library random number sequence. To me, the application should own the seed / sequence and the library should not touch it.

I see that Geogram actually calls srand(1) -- which is even more terrifying. While it is a way for Geogram (a library) to get a deterministic and repeatable sequence of random numbers, you've totally destroyed the randomness for the host application. I strongly emplore you to find another way.

Calling srand(1) doesn't even guarantee that Geogram will get a repeatable sequence -- if the host application also draws from rand(), your sequence will depend on the host application's behavior.

The lesson I learned was to never use a RNG with a shared seed. In retrospect, having a shared seed is a pretty terrible design flaw of the C Standard Library (inherited by C++). Our forefathers should have known better.

Switching to C++11's RNG's in random.h should be a good solution -- your program will need to maintain the context of the RNG in order to pull numbers from a sequence, but in this way, you will know that you are the only one pulling from the sequence and that nobody can mess with your seed.

For my program, I chose to use pcg-cpp and I've been happy ever since.

BrunoLevy commented 7 months ago

Thank you very much for this super useful PR ! Will be included in v 1.8.8