nsf / pnoise

Perlin noise benchmark

Different PRNGs Used #17

Open josephlewis42 opened 9 years ago

josephlewis42 commented 9 years ago

Each language here uses a different PRNG.

If this test is for the performance of each built-in PRNG, that's fair; otherwise, a single random-number implementation should be used for every language.

For example, Java uses a version of a linear congruential generator,

as does C/C++ under glibc.

Python uses the better Mersenne Twister.

Go, on the other hand, uses 64-bit pseudo-random integers rather than 32-bit ones, which may slow performance.

nsf commented 9 years ago

It doesn't matter. The PRNG doesn't affect the performance of this benchmark. We're talking about ~512 or ~768 PRNG invocations versus 6553600 Get function invocations.

josephlewis42 commented 9 years ago

I believe it does:

I created two test scripts, for Go and C, which generate 512 random numbers and return the last number so the loop can't be optimized away, and ran them with the same perf command using 10 invocations.
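Something along these lines (a minimal sketch of the C version; the actual scripts aren't shown in this thread, so the details here are assumptions):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int last = 0;
    srand(1);
    /* generate 512 pseudo-random numbers; printing the last one
       keeps the compiler from optimizing the loop away */
    for (int i = 0; i < 512; i++)
        last = rand();
    printf("%d\n", last);
    return 0;
}
```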

C, 512 random numbers: 0.000915883 sec
Go, 512 random numbers: 0.001429618 sec
Python, 512 random numbers: 0.021473418 sec

These are order-of-magnitude differences. Small overall? Perhaps, but they're uncontrolled variables.

Another variable that may be thought of as "small" is all the casting from float64 to float32 in Go; using the float64 that Go works with natively shaves off ~30% of the time for me (reducing 0.7 s to 0.5 s).
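(For illustration, a rough C analogue of the casts being described; this isn't code from the repository, and the names are hypothetical:)

```c
/* The single-precision path has to narrow every double-precision input
   before doing its math; the double-precision path does not. */
#include <stdio.h>
#include <stdlib.h>

static float lerp_f32(float a, float b, float t) { return a + t * (b - a); }
static double lerp_f64(double a, double b, double t) { return a + t * (b - a); }

int main(void)
{
    double a = rand() / (double)RAND_MAX;
    double b = rand() / (double)RAND_MAX;
    double t = 0.5;

    /* float32-style pipeline: three narrowing conversions per call */
    float r32 = lerp_f32((float)a, (float)b, (float)t);

    /* float64-style pipeline: no conversions at all */
    double r64 = lerp_f64(a, b, t);

    printf("%f %f\n", r32, r64);
    return 0;
}
```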

The same goes for the C program with -O3 enabled: 0.179705094 seconds elapsed versus 1.088510494 seconds without it. In that case the overhead of function calls outweighs the math; if that's the goal of the test, it's perfectly fine.

nsf commented 9 years ago

The numbers only prove my point: we're talking about ~1 ms spent in the PRNG, which is just noise. Python of course has interpreter overhead; the benchmark itself runs on CPython in 1 minute 30 seconds. I don't include those numbers, but the fastest JITs (both PyPy and LuaJIT) do it in about 2 seconds. I don't think it's worth comparing compiled, typed languages against scripting ones.

As for native float64 vs. float32: yes, it's a known issue, and in my opinion it's a mistake made by most environments. For some reason people still think in terms of an FPU, when modern processors are all about SIMD, and in SIMD the size matters: you can do an operation on 2 doubles or 4 floats in one instruction. Moving the code to double-precision floating point also improves the C# and Java benchmarks, but I explicitly want to keep things in single-precision floating point.
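To illustrate the width argument (a rough sketch assuming SSE2 intrinsics; this isn't code from the benchmark itself): a 128-bit register holds either two doubles or four floats, so a single add instruction processes twice as many lanes in single precision.

```c
#include <emmintrin.h>  /* SSE2 */
#include <stdio.h>

int main(void)
{
    /* one addpd: 2 double-precision lanes per instruction */
    __m128d d = _mm_add_pd(_mm_set_pd(1.0, 2.0), _mm_set_pd(3.0, 4.0));

    /* one addps: 4 single-precision lanes per instruction */
    __m128 f = _mm_add_ps(_mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f),
                          _mm_set_ps(5.0f, 6.0f, 7.0f, 8.0f));

    double dout[2];
    float fout[4];
    _mm_storeu_pd(dout, d);
    _mm_storeu_ps(fout, f);
    printf("%f %f\n", dout[0], dout[1]);
    printf("%f %f %f %f\n", fout[0], fout[1], fout[2], fout[3]);
    return 0;
}
```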