Closed naggamura closed 9 months ago
Unfortunately, you still did not check everything. That's probably my fault because I didn't tell you everything.
Nevertheless, this is what I found this time:
CC=clang ./configure -pDBG; make
, my standard debug build, failed to build.CC=clang ./configure -gO0; make; make test
, a less strict debug build, failed the test suite.CC=clang ./configure -pGDH; make
, my standard release build for myself, failed to build.CC=clang ./configure -pGNU; make; make test
, the standard release build for Linux, failed the test suite.CC=gcc ./configure -pGNU; make; make test
, the same standard Linux build but with gcc
, also failed the test suite.CC=clang ./configure -pBSD; make; make test
, the standard release build for FreeBSD and friends, failed the test suite.CC=gcc ./configure -pBSD; make; make test
, the same standard FreeBSD build but with gcc
, also failed the test suite.In addition, what failed on the test suite was off by a lot; the result I got for all cases was:
Running bc sqrt...FAIL!!!
12c12
< .0000000000000035071355833500363
---
> .8915580480000035071355833500363
bc failed test sqrt
As you can see, the numbers are the same, except at the beginning, which means that answer was about as far off as it could be.
In addition, when I ran this:
$ ./configure -gO0 -v
$ make
$ make test
which is a Valgrind-enabled build, I got this:
Running bc sqrt...==47836== Memcheck, a memory error detector
==47836== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==47836== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==47836== Command: bin/bc -lqc ./tests/bc/sqrt.txt
==47836==
==47836== Invalid write of size 8
==47836== at 0x484C96E: memset (vg_replace_strmem.c:1386)
==47836== by 0x12A28F: bc_int_elementary_mul (num.c:4274)
==47836== by 0x1253AA: bc_int_k_mul (num.c:4327)
==47836== by 0x124972: bc_num_sqrt (num.c:4636)
==47836== by 0x13156F: bc_program_builtin (program.c:1990)
==47836== by 0x12DB0C: bc_program_exec (program.c:3332)
==47836== by 0x13D623: bc_vm_process (vm.c:1046)
==47836== by 0x13CCA5: bc_vm_file (vm.c:1103)
==47836== by 0x13C117: bc_vm_exec (vm.c:1498)
==47836== by 0x13B4E1: bc_vm_boot (vm.c:1718)
==47836== by 0x10C0CC: bc_main (bc.c:62)
==47836== by 0x119995: main (main.c:108)
==47836== Address 0x4aa2270 is 32 bytes inside a block of size 36 alloc'd
==47836== at 0x4840784: malloc (vg_replace_malloc.c:442)
==47836== by 0x13A4C3: bc_vm_malloc (vm.c:810)
==47836== by 0x11BA76: bc_num_init (num.c:3368)
==47836== by 0x1244B1: bc_num_sqrt (num.c:4567)
==47836== by 0x13156F: bc_program_builtin (program.c:1990)
==47836== by 0x12DB0C: bc_program_exec (program.c:3332)
==47836== by 0x13D623: bc_vm_process (vm.c:1046)
==47836== by 0x13CCA5: bc_vm_file (vm.c:1103)
==47836== by 0x13C117: bc_vm_exec (vm.c:1498)
==47836== by 0x13B4E1: bc_vm_boot (vm.c:1718)
==47836== by 0x10C0CC: bc_main (bc.c:62)
==47836== by 0x119995: main (main.c:108)
==47836==
==47836==
==47836== HEAP SUMMARY:
==47836== in use at exit: 0 bytes in 0 blocks
==47836== total heap usage: 1,633 allocs, 1,633 frees, 331,419 bytes allocated
==47836==
==47836== All heap blocks were freed -- no leaks are possible
==47836==
==47836== For lists of detected and suppressed errors, rerun with: -s
==47836== ERROR SUMMARY: 8 errors from 1 contexts (suppressed: 0 from 0)
FAIL!!!
bc failed test 'sqrt' with error code 100
That needs to be fixed, both the out-of-bounds write and the unaligned write.
If you're wondering how much testing I do, do this:
bc1/
and bc2/
.bc1/
, run scripts/release.sh 1 1 1 1 0 1 0 0 1 0 1 0 1 0 0 1 1
.bc2/
, run scripts/release.sh 1 1 1 0 1 0 1 0 1 0 1 0 0 1 1 1 1
.Yep, you heard that last bit right: my full test run is an overnight thing. And I run both because one does clang
, and one does gcc
.
Moving on to benchmarks, I ran
$ CC=clang CFLAGS=-flto ./configure -O3
$ make
in both branches. Then I ran
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_sqrt_small > ../sqrt2_small.txt
in your branch, copied benchmarks/bc/newton_raphson_sqrt_small.txt
to my branch and ran
$ ./scripts/benchmark.sh -p1 -n50 bc newton_raphson_sqrt_small > ../sqrt1_small.txt
These are my results:
$ ./scripts/ministat -w 80 -c99.5 -C1 ../sqrt1_small.txt ../sqrt2_small.txt
x ../sqrt1_small.txt
+ ../sqrt2_small.txt
+--------------------------------------------------------------------------------+
| + x |
| + x x |
| + x x |
| + + + x x |
| + + + x x x |
| + + + + x x x |
| + + + + x x x |
| + + + + x x x |
| + + + + x x x |
| + + + + + x x x x x x |
| + + + + + + x x x x x x x |
| + + + + + + x x x x x x x |
|+ + + + + + + * x x x x x x x|
| |________A_M_____| |________MA________| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 1.45 1.52 1.48 1.4806 0.016463937
+ 50 1.38 1.45 1.42 1.4164 0.015220824
Difference at 99.5% confidence
-0.0642 +/- 0.0100677
-4.33608% +/- 0.664237%
(Student's t, pooled s = 0.0158546)
$ ./scripts/ministat -w 80 -c99.5 -C4 ../sqrt1_small.txt ../sqrt2_small.txt
x ../sqrt1_small.txt
+ ../sqrt2_small.txt
+--------------------------------------------------------------------------------+
| + |
| + |
| x + |
| x x + |
| x x + + |
| x x +x + + + |
| x + xx +x + x + + |
| x *+ xx *x++ xx + + + |
| xx +*+xxxx x**++ xx + + + |
|x x xx +*+xxxx x***++xx*x+++++ + ++|
| |__________A_|_______M|A_________| |
+--------------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 50 18956 19244 19168 19168.16 57.880722
+ 50 19144 19376 19224 19232.88 54.73373
Difference at 99.5% confidence
64.72 +/- 35.769
0.337643% +/- 0.186939%
(Student's t, pooled s = 56.3292)
Yours was slightly faster and used slightly more memory.
I ran the same benchmark for large, but only for one sample each, and my numbers say that yours are accurate for that benchmark.
However, I cannot accept code that fails tests or has memory bugs. In addition, because of the test failures, you can expect that I will run a lot more tests on your code before I accept.
So I won't close this PR, but you need to fix the problems above (by running that release.sh
script both ways and fixing every problem) before I'll look at it again.
Also, I can't understand sqrt_2.png
. I can follow sqrt_1.png
, but I need you to redo sqrt_2.png
because I also won't accept the code if I don't understand it.
Last push was not the latest version. Pushed again.
Anyway, test again and passed all tests.
... ==5965== ==5965== HEAP SUMMARY: ==5965== in use at exit: 0 bytes in 0 blocks ==5965== total heap usage: 1,520 allocs, 1,520 frees, 205,685 bytes allocated ==5965== ==5965== All heap blocks were freed -- no leaks are possible ==5965== ==5965== For counts of detected and suppressed errors, rerun with: -v ==5965== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) pass Running bc directory test...pass Running bc binary file test...pass Running bc binary stdin test...pass Running bc limits tests...pass
All bc tests passed.
... ...
==6927== ==6927== HEAP SUMMARY: ==6927== in use at exit: 0 bytes in 0 blocks ==6927== total heap usage: 73 allocs, 73 frees, 37,886 bytes allocated ==6927== ==6927== All heap blocks were freed -- no leaks are possible ==6927== ==6927== For counts of detected and suppressed errors, rerun with: -v ==6927== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) pass Running dc directory test...pass Running dc binary file test...pass Running dc binary stdin test...pass
All dc tests passed.
Doing scripts/release.sh test...
Passed the tests.
Benchmarks
stat_small_c1.txt stat_small_c4.txt stat_large_c1.txt stat_large_c4.txt
sqrt_2.png updated.
I have spent two full days on this since the last time I sent a message. I cannot spend any more time on this; I have a project to finish and a hard deadline.
So I have to make a final decision, which is to reject this. Thank you for your contribution, but I guess I don't want it.
There are several reasons why:
But most of all, and this is nothing against you, I just can't read other people's code. I am unable to build a theory of mind, and this extends to reading code written by other people.
There is a reason I tend to do things by myself; I can't work on the code otherwise.
So I'm afraid I have run out of time and desire. Sorry.
I'm sorry I missed many things on last PR. I followed your instructions step by step.
Improved SQRT needs no long division in iteration. Only one long division is needed for compose an initial estimation.
No meaningful difference for small input.
For large input, about 2.4 times faster, no meaningful difference in memory usage.
Here is the result for large input.
( 'CC=clang CFLAGS=-flto ./configure -O3' is used for configuring. ) stat_c1.txt stat_c4.txt
sqrt1_large.txt : by your code. sqrt2_large.txt : by my code.
Elapsed time for large input.
x ../../sqrt1_large.txt
Memory for large input.
x ../../sqrt1_large.txt
../../sqrt2_large.txt +--------------------------------------------------------------------------------+ | + | | + x | | + x | | + + x x x | | + + + x x x | | + + ++ ++ xx x x x x x | |+ ++++ +++++ + +++ x xx x x x xx xxx xx| |+ + +++++++++++++++++ + x xxxxxxxxx xxxx xxxx xx| | |AM| |__M_A____| | +--------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 50 5376 5596 5494 5508.96 51.80535
50 5184 5412 5302 5301.12 44.115174 Difference at 99.5% confidence -207.84 +/- 30.5525 -3.77276% +/- 0.542566% (Student's t, pooled s = 48.1141)
END -