ecraven / r7rs-benchmarks

Benchmarks for various Scheme implementations. Taken with kind permission from the Larceny project, based on the Gabriel and Gambit benchmarks.

Compute more repetitions for chudnovsky #21

Closed gambiteer closed 6 years ago

gambiteer commented 7 years ago

chudnovsky is a bignum benchmark, as is pi; both compute up to 500 digits of $\pi$.

Some chudnovsky benchmark times are so small that one can't have confidence in their accuracy (compared to startup times, for example).

One could increase computation times by increasing the number of computed digits of $\pi$ (currently limited to 500) or by increasing the number of times each length of $\pi$ is computed (currently twice).

I think a general benchmark suite should test performance on smallish bignums, as they are more useful in practice, so I would recommend increasing the number of repetitions instead of increasing the size of the bignums involved.

So here's a proposed diff:

diff --git a/inputs/chudnovsky.input b/inputs/chudnovsky.input
index c310e06..46b9135 100644
--- a/inputs/chudnovsky.input
+++ b/inputs/chudnovsky.input
@@ -1,4 +1,4 @@
-2
+20
 50
 500
 50

Brad

zenspider commented 7 years ago

I tried 20 & 200 w/ racket and it didn't make a dent on the numbers. Maybe both?

gambiteer commented 7 years ago

I'm sorry, I don't understand what you mean.

Here's the current timing data for chudnovsky from all.csv:

chez-9.4.0-m64 | chudnovsky:50:500:50:2 | 0.000614821
chibi-unknown | chudnovsky:50:500:50:2 | 2.713
cyclone-0.5.2 | chudnovsky:50:500:50:2 | 0.004967
gambitc-v4.8.7 | chudnovsky:50:500:50:2 | 0.000966787338257
gauche-0.9.5 | chudnovsky:50:500:50:2 | 0.007426112
guile-2.2.2 | chudnovsky:50:500:50:2 | 0.001485105
kawa-2.4 (git describe: kawa-2.3-30-gdad3755-dirty) | chudnovsky:50:500:50:2 | 0.218102514
larceny-0.99 | chudnovsky:50:500:50:2 | 12.898318
petite-9.4.0-m64 | chudnovsky:50:500:50:2 | 0.000872008
sagittarius-0.8.3 | chudnovsky:50:500:50:2 | 0.003301888
vicare-0.4d0 | chudnovsky:50:500:50:2 | 0.000435
ypsilon-unknown | chudnovsky:50:500:50:2 | 1E-05

The Racket implementation evidently didn't work for the most recent run.

So, while many of these numbers are quite small (< 1/1000 of a second), I expect that if the number of iterations were increased, then the CPU times would increase proportionally, unless the CPU times are mainly startup times.

For example, with current Gambit on my machine, I see the following difference after increasing the number of iterations to 20:

heine:~/programs/r7rs-benchmarks> ./bench gambitc chudnovsky

Testing chudnovsky under GambitC
Including prelude /home/lucier/programs/r7rs-benchmarks/src/GambitC-prelude.scm
Compiling...
gambitc_comp /tmp/larcenous/GambitC/chudnovsky.scm /tmp/larcenous/GambitC/chudnovsky.exe
Running...
Running chudnovsky:50:500:50:2
Elapsed time: .0016713142395019531 seconds (0.) for chudnovsky:50:500:50:2
+!CSVLINE!+gambitc-v4.8.8,chudnovsky:50:500:50:2,.0016713142395019531

real    0m0.018s
user    0m0.008s
sys     0m0.000s
heine:~/programs/r7rs-benchmarks> vi inputs/chudnovsky.input 
heine:~/programs/r7rs-benchmarks> ./bench gambitc chudnovsky

Testing chudnovsky under GambitC
Including prelude /home/lucier/programs/r7rs-benchmarks/src/GambitC-prelude.scm
Compiling...
gambitc_comp /tmp/larcenous/GambitC/chudnovsky.scm /tmp/larcenous/GambitC/chudnovsky.exe
Running...
Running chudnovsky:50:500:50:20
Elapsed time: .016855716705322266 seconds (1.) for chudnovsky:50:500:50:20
+!CSVLINE!+gambitc-v4.8.8,chudnovsky:50:500:50:20,.016855716705322266

real    0m0.026s
user    0m0.020s
sys     0m0.000s

So the CPU times are roughly increased by a factor of 10.
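As a rough check of that proportionality, the two Gambit timings above can be fitted to a straight line t(n) = overhead + n * per_rep. This is only a sketch: the linear model and the names `fit_linear`, `overhead`, and `per_rep` are assumptions made for illustration and are not part of the benchmark suite.

```python
# Two-point fit of t(n) = overhead + n * per_rep.
# The timings are the Gambit "Elapsed time" values quoted in the transcript above;
# the linear model itself is an assumption.

def fit_linear(n1, t1, n2, t2):
    """Return (overhead, per_rep) for the line through (n1, t1) and (n2, t2)."""
    per_rep = (t2 - t1) / (n2 - n1)
    overhead = t1 - n1 * per_rep
    return overhead, per_rep

t_2 = 0.0016713142395019531    # chudnovsky:50:500:50:2
t_20 = 0.016855716705322266    # chudnovsky:50:500:50:20

overhead, per_rep = fit_linear(2, t_2, 20, t_20)
ratio = t_20 / t_2

print(f"ratio t(20)/t(2): {ratio:.2f}")
print(f"time per repetition: {per_rep * 1e6:.1f} microseconds")
print(f"fitted fixed overhead: {overhead * 1e6:.1f} microseconds")
```

If the fitted overhead is small relative to the measured times, the elapsed time is dominated by the repetitions themselves rather than by a fixed cost, which is what a ratio close to 10 suggests here.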

zenspider commented 7 years ago

Racket works fine when you set it up correctly. See #22. I suspect a lot of the racket time is startup. Even going up 2 orders of magnitude didn't have an effect on wall clock at all.

Here's my current results:

# ... run the stuff from 2 to 2000 ...
10106 % git diff results.Racket | egrep "Running |Elapsed"
+Running chudnovsky:50:500:50:2
+Elapsed time: 0.001 seconds (0.001) for chudnovsky:50:500:50:2
+Running chudnovsky:50:500:50:20
+Elapsed time: 0.005 seconds (0.005) for chudnovsky:50:500:50:20
+Running chudnovsky:50:500:50:200
+Elapsed time: 0.057 seconds (0.057) for chudnovsky:50:500:50:200
+Running chudnovsky:50:500:50:2000
+Elapsed time: 0.475 seconds (0.475) for chudnovsky:50:500:50:2000

zenspider commented 7 years ago

So I would say at least go to 200... if not 2000, or bump both sides.

ETA: no... I really like your reasoning behind sticking to smaller bignums... then maybe 2000 or 20000 would be better. Something to make the numbers more comparable.

gambiteer commented 7 years ago

The run time for 200 is about 10 times the run time for 20; 200 would seem to be fine for comparing Racket to other implementations.
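The scaling between successive repetition counts can be computed directly from the Racket timings listed earlier in the thread. This is a small sketch; the dictionary name `racket` is just for illustration. The values appear to be reported to the millisecond, so the 2-repetition figure is a single tick and its ratio is not very meaningful.

```python
# Racket "Elapsed time" values from the thread, keyed by repetition count.
racket = {2: 0.001, 20: 0.005, 200: 0.057, 2000: 0.475}

reps = sorted(racket)
# Ratio of each timing to the one for ten times fewer repetitions.
ratios = {b: racket[b] / racket[a] for a, b in zip(reps, reps[1:])}

for n, r in ratios.items():
    print(f"{n:>5} reps: {r:.1f}x the time of {n // 10} reps")
```

Once the elapsed time is well above the reporting resolution (200 and 2000 repetitions), the ratios land near 10, as expected for a tenfold increase in work.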

Or put it at 200 and increase the number of digits to 1000 instead of 500.

ecraven commented 7 years ago

Note that only the actual runtime is measured, not startup time. So in the end, this shouldn't make much of a difference, unless you mean things like cache.

gambiteer commented 6 years ago

I'm concerned that jiffies are too coarse to accurately measure run times of less than a millisecond, which happens for some Schemes on the chudnovsky benchmark (down to 1e-5 seconds).

Chicken seems to measure times in milliseconds, for example.
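To illustrate the concern, here is a minimal sketch of how clock granularity bounds the accuracy of a short measurement. It assumes the timer advances in fixed steps (ticks) and that an elapsed time is the difference of two readings, so it can be off by up to one tick. The tick sizes below are hypothetical examples, not the actual jiffy resolution of any particular implementation; the elapsed times are taken from the all.csv excerpt above.

```python
def worst_case_relative_error(elapsed_s, tick_s):
    """Upper bound on the relative error of an interval of elapsed_s seconds
    measured with a clock that advances in steps of tick_s seconds."""
    return tick_s / elapsed_s

# Elapsed times (seconds) from the all.csv excerpt in this thread.
samples = {
    "chez-9.4.0-m64": 0.000614821,
    "guile-2.2.2": 0.001485105,
    "ypsilon-unknown": 1e-05,
}

# Hypothetical clock resolutions: 1 millisecond and 1 microsecond.
for tick in (1e-3, 1e-6):
    for name, elapsed in samples.items():
        err = worst_case_relative_error(elapsed, tick)
        print(f"tick={tick:g}s  {name:<16} error up to {err:.1%}")
```

With a millisecond tick, every one of these sub-millisecond timings could be off by more than its own value, whereas increasing the repetition count raises the elapsed time and shrinks the bound proportionally.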