Closed naggamura closed 7 months ago
Thank you for your PR. I have some comments.
CC=clang ./configure.sh -pDBG; make
, my regular debug build, the build fails.CC=clang ./configure.sh -pGDH; make
, my regular release build, the build fails.CC=gcc CFLAGS="-pedantic -Wextra -Werror" ./configure -O3; make
, the build fails.CC=clang ./configure -gO3; make; make test
, a release build with debug information, the build succeeds, but the test suite fails on an assert.scripts/format.sh
) that will style the code for you.In addition, I can unfortunately tell that you did no research into my bc
. If you had, you would know that I already use Newton-Raphson for sqrt()
.
However, all of that could be safely discarded if the code shows promise. So I wrote benchmarks to test your new code.
I ran these commands with my code:
$ CC=clang CFLAGS=-flto ./configure -O3
$ make
Then I ran the benchmarks as follows:
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_div_small > ../div1_small.txt
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_div_large > ../div1_large.txt
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_sqrt_small > ../sqrt1_small.txt
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_sqrt_large > ../sqrt1_large.txt
Then I repeated the steps with your code:
$ CC=clang CFLAGS=-flto ./configure -O3
$ make
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_div_small > ../div2_small.txt
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_div_large > ../div2_large.txt
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_sqrt_small > ../sqrt2_small.txt
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_sqrt_large > ../sqrt2_large.txt
The small
benchmarks test the operation on "small" numbers (below 2^64
), and the large
benchmarks test the operation on large numbers (a small number to a power up to 1,000,000).
To test how your code did, I used ministat
in the repo, so I ran this:
$ make ministat
Then to test the time difference on div_small
, I ran the following:
$ ./scripts/ministat -w80 -c95 -C1 ../div1_small.txt ../div2_small.txt
x ../div1_small.txt
+ ../div2_small.txt
+--------------------------------------------------------------------------------+
| + x |
| + + x * |
| + + * * x |
| + * * * x |
| + * * * x |
| + * * * x |
| * * * * x + |
| * * * * * x + x |
|* * * * * * x * + x |
|* * * * * * * * * x x + +|
||____|____M_AA______|___| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 3.41 3.52 3.45 3.4504 0.024071323
+ 50 3.41 3.67 3.44 3.4484 0.039967334
No difference proven at 95.0% confidence
As you can tell, there's no difference at the lowest confidence I can accept.
To test the memory difference (max RSS) for div_small
, I ran the following:
$ ./scripts/ministat -w80 -c95 -C4 ../div1_small.txt ../div2_small.txt
x ../div1_small.txt
+ ../div2_small.txt
+--------------------------------------------------------------------------------+
| x x x+ + + x +xx xx ++ |
| x x + x * x+ +++ ++ x x *+ +xx xx ++ + x + |
|xx x +++ + + x x*x x* *++ ++ x x +x ** +** ** x++ +* +x + +xx ++|
| |_____|_____________A____A_M___________|___| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 34792 35000 34912 34896.88 55.262136
+ 50 34808 35016 34916 34910 52.956663
No difference proven at 95.0% confidence
No memory difference. Cool.
For the elapsed time for div_large
:
$ ./scripts/ministat -w80 -c99.5 -C1 ../div1_large.txt ../div2_large.txt
x ../div1_large.txt
+ ../div2_large.txt
+--------------------------------------------------------------------------------+
| x + |
| x ++ |
| x ++ |
| x ++ |
| x ++ |
| x ++ |
| x ++ |
| x ++ |
| x ++ |
| x ++ |
| xx ++ |
| xx ++ |
| xx ++ |
| xx ++ |
| xx ++ |
| xx +++ |
| xx ++++ |
| xx ++++ |
| xxx ++++ |
| xxx x ++++++|
||____MA_____| |A| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 13.72 21.71 13.845 14.013 1.1141964
+ 50 26.7 27.64 27.005 27.0158 0.17722452
Difference at 99.5% confidence
13.0028 +/- 0.506578
92.791% +/- 6.90636%
(Student's t, pooled s = 0.79776)
As you can see, there is a difference, with 99.5% confidence, but in favor of the old code. In fact, if not for the one outlier, your code would be twice as slow.
And if we look at the max RSS difference for div_large
:
$ ./scripts/ministat -w80 -c99.5 -C4 ../div1_large.txt ../div2_large.txt
x ../div1_large.txt
+ ../div2_large.txt
+--------------------------------------------------------------------------------+
| + |
| + |
| + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| xx + |
| xx ++|
| xx +++|
|xxx +++|
|xxx +++|
|xxx +++|
|xxx +++|
|xxx +++|
|xxx +++|
|xxx x +++|
|xxxxx +++|
||A| |A||
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 8716 8968 8796 8802.72 57.549638
+ 50 13812 13960 13888 13890.08 41.837314
Difference at 99.5% confidence
5087.36 +/- 31.9473
57.793% +/- 0.51%
(Student's t, pooled s = 50.3106)
Your code uses 50% more memory, with 99.5% confidence.
Moving onto sqrt_small
, for elapsed time, I got this:
$ ./scripts/ministat -w80 -c99.5 -C1 ../sqrt1_small.txt ../sqrt2_small.txt
x ../sqrt1_small.txt
+ ../sqrt2_small.txt
+--------------------------------------------------------------------------------+
| x + |
| xx + |
| xx + |
| xx + |
| xx + + |
| xx ++++ |
| xx ++++ |
|xxx ++++ |
|xxx ++++ |
|xxxx +++++ |
|xxxx +++++ |
|xxxx +++++ |
|xxxxx +++++ |
|xxxxx x +++++++ +|
| MA| |_A_| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 1.47 1.6 1.5 1.5072 0.023822473
+ 50 3.07 3.26 3.14 3.1452 0.036153866
Difference at 99.5% confidence
1.638 +/- 0.0194408
108.678% +/- 1.83123%
(Student's t, pooled s = 0.0306155)
Your code is twice as slow with 99.5% confidence.
For memory on sqrt_small
, I got this:
$ ./scripts/ministat -w80 -c99.5 -C4 ../sqrt1_small.txt ../sqrt2_small.txt
x ../sqrt1_small.txt
+ ../sqrt2_small.txt
+--------------------------------------------------------------------------------+
| xx + x |
| x xx + x x ++ |
| x x x xx + + x x ++ |
| x xxxx x xxx++ + x x +x x +++ ++ + |
|xx + x x + + xxxxx + + +* *xx***+++ x+xx++***++++ ++++ ++++|
| |_____________|__M__________|A_M____________| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 18964 19252 19176 19160.4 68.353224
+ 50 19004 19344 19242 19231.68 71.990203
Difference at 99.5% confidence
71.28 +/- 44.574
0.372017% +/- 0.233047%
(Student's t, pooled s = 70.1953)
So there's a small increase of memory in your code, even at 99.5% confidence, but that's small enough that I can ignore it.
For time on sqrt_large
, I got this:
$ ./scripts/ministat -w80 -c99.5 -C1 ../sqrt1_large.txt ../sqrt2_large.txt
x ../sqrt1_large.txt
+ ../sqrt2_large.txt
+--------------------------------------------------------------------------------+
| xx + |
| xx ++ |
| xx +++ |
| xx +++ |
| xx +++ |
| xx +++ |
| xx ++++ |
| xx ++++ |
| xxx +++++ |
|xxxx +++++ |
|xxxx +++++ |
|xxxx +++++ |
|xxxxx +++++++|
| |A| |A_| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 176.9 184.63 180.565 180.403 1.6383681
+ 50 327.42 339.26 331.24 331.3832 2.8256482
Difference at 99.5% confidence
150.98 +/- 1.4666
83.6905% +/- 1.02747%
(Student's t, pooled s = 2.3096)
Your code is almost twice as slow at 99.5% confidence.
For memory on sqrt_large
, I got this:
$ ./scripts/ministat -w80 -c99.5 -C4 ../sqrt1_large.txt ../sqrt2_large.txt
x ../sqrt1_large.txt
+ ../sqrt2_large.txt
+--------------------------------------------------------------------------------+
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x + |
| x ++|
| x ++|
| x ++|
| x ++|
| x ++|
| x ++|
| x ++|
|xx ++|
|xx ++|
|xx ++|
|xx ++|
|xx ++|
|xxx ++|
|xxx ++|
||A A||
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 6952 7156 7044 7046.48 42.561955
+ 50 15412 15568 15496 15485.84 51.175313
Difference at 99.5% confidence
8439.36 +/- 29.887
119.767% +/- 0.679408%
(Student's t, pooled s = 47.0661)
Your code uses twice as much memory at 99.5% confidence.
So on average, your code is twice as slow and uses more memory.
Unfortunately, based on the poor performance of your code compared to what already exists, I can't say that it shows promise.
Combined with the problems I mentioned at the beginning, I'm afraid I must reject this PR completely. Sorry.
Newton-Raphson approximation is applied to division and sqrt. Try 5^1000000 / 2^1000000 and sort(5^999999)