Closed jeffplaisance closed 6 years ago
I found the same issues, and came to report them. The benchmark reports errors when compiled with gcc-4.8, but appears to work for gcc-4.7, gcc-4.9, clang-3.5, and icc-14.
ICC's times for the system divide look about 3x faster, but I worry that this is because the compiler has somehow managed to omit part of the loop. Since its times match the other compilers for some values but not others, I wonder whether the other compilers might also be optimizing out the test in some cases.
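One common way to rule out that kind of elision is to make every quotient feed a value the compiler must treat as observable. The following is only a minimal sketch of the idea, not code from libdivide_benchmark.c: `sum_quotients` and `benchmark_sink` are hypothetical names, and I have not checked whether the real benchmark already does something equivalent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sink, not part of libdivide: a store to a volatile object
   is an observable side effect, so the compiler may not discard it. */
static volatile uint32_t benchmark_sink;

/* Hypothetical timing kernel: divides every element by `divisor` and
   accumulates the quotients so that each division affects the result. */
uint32_t sum_quotients(const uint32_t *data, size_t count, uint32_t divisor)
{
    assert(divisor != 0);            /* division by zero is undefined */
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += data[i] / divisor;    /* unsigned wraparound is well defined */
    }
    benchmark_sink = sum;            /* forces the final sum to be computed */
    return sum;
}
```

This only keeps the loop from being removed outright. The compiler is still free to vectorize it or to replace the division with something cheaper when it can prove the divisor's value, so it does not by itself explain a 3x difference.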
nate@haswell:~/git/libdivide$ gcc-4.7 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
libdivide_benchmark.c: In function ‘random_data’:
libdivide_benchmark.c:791:19: warning: ignoring return value of ‘posix_memalign’, declared with attribute warn_unused_result [-Wunused-result]
nate@haswell:~/git/libdivide$ benchmark
1 3.538 0.109 0.122 0.517 0.183 0.778 0
2 3.538 0.101 0.114 0.517 0.175 0.778 0
3 3.538 0.372 0.387 0.515 0.391 10.101 1
4 3.538 0.099 0.113 0.515 0.174 0.793 0
5 3.538 0.366 0.389 0.515 0.399 10.101 1
6 3.538 0.368 0.389 0.515 0.389 10.117 1
7 3.538 0.385 0.412 0.517 0.492 10.361 2
8 3.538 0.099 0.111 0.515 0.174 0.778 0
9 3.538 0.368 0.389 0.515 0.389 10.101 1
10 3.538 0.374 0.389 0.515 0.389 10.101 1
11 3.536 0.374 0.389 0.515 0.389 10.101 1
12 3.538 0.366 0.389 0.517 0.389 10.101 1
13 3.536 0.368 0.389 0.515 0.389 10.101 1
14 3.538 0.383 0.412 0.517 0.490 10.208 2
15 3.538 0.374 0.389 0.517 0.389 10.101 1
nate@haswell:~/git/libdivide$ gcc-4.8 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
libdivide_benchmark.c: In function ‘random_data’:
libdivide_benchmark.c:791:19: warning: ignoring return value of ‘posix_memalign’, declared with attribute warn_unused_result [-Wunused-result]
posix_memalign(&ptr, 16, multiple * ITERATIONS * sizeof(uint32_t));
^
nate@haswell:~/git/libdivide$ benchmark
1 3.538 0.099 0.103 0.166 0.168 0.793 0
2 3.536 0.099 0.103 0.166 0.168 0.793 0
3 3.536 0.288 0.254 0.391 0.391 10.147 1
4 3.536 0.101 0.103 0.166 0.166 0.793 0
5 3.536 0.288 0.254 0.391 0.389 10.117 1
6 3.536 0.286 0.254 0.391 0.391 10.117 1
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
7 3.538 0.324 0.298 0.486 0.486 10.300 2
8 3.538 0.099 0.103 0.166 0.168 0.778 0
9 3.536 0.288 0.254 0.391 0.391 10.117 1
10 3.538 0.288 0.254 0.391 0.391 10.117 1
11 3.538 0.286 0.254 0.391 0.389 10.117 1
12 3.536 0.288 0.254 0.391 0.391 10.101 1
13 3.538 0.286 0.254 0.391 0.389 10.132 1
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
14 3.536 0.324 0.298 0.486 0.486 10.315 2
15 3.538 0.288 0.254 0.391 0.391 10.117 1
nate@haswell:~/git/libdivide$ gcc-4.9 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
nate@haswell:~/git/libdivide$ benchmark
1 3.536 0.116 0.118 0.517 0.147 0.687 0
2 3.536 0.103 0.105 0.561 0.134 0.687 0
3 3.538 0.391 0.393 0.515 0.404 4.715 1
4 3.538 0.099 0.101 0.517 0.132 0.687 0
5 3.536 0.391 0.391 0.515 0.404 4.715 1
6 3.538 0.391 0.391 0.517 0.395 4.715 1
7 3.538 0.397 0.414 0.519 0.498 5.051 2
8 3.538 0.099 0.101 0.517 0.132 0.687 0
9 3.538 0.391 0.391 0.517 0.404 4.715 1
10 3.536 0.391 0.391 0.515 0.404 4.715 1
11 3.536 0.391 0.391 0.515 0.404 4.715 1
12 3.536 0.391 0.397 0.515 0.402 4.715 1
13 3.538 0.391 0.391 0.515 0.395 4.715 1
14 3.538 0.397 0.414 0.519 0.498 5.051 2
15 3.536 0.391 0.391 0.515 0.404 4.715 1
nate@haswell:~/git/libdivide$ clang-3.5 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
In file included from libdivide_benchmark.c:1:
./libdivide.h:290:29: warning: variable 'result' is uninitialized when used here [-Wuninitialized]
result = _mm_cmpeq_epi8(result, result); //all 1s
^~
./libdivide.h:289:5: note: variable 'result' is declared here
__m128i result; //we don't care what its contents are
^
1 warning generated.
nate@haswell:~/git/libdivide$ benchmark
1 2.911 0.097 0.097 0.114 0.114 2.075 0
2 2.935 0.097 0.097 0.116 0.114 2.075 0
3 2.937 0.706 0.713 0.397 0.401 10.666 1
4 2.918 0.097 0.099 0.114 0.114 2.060 0
5 2.918 0.731 0.713 0.397 0.401 10.651 1
6 2.918 0.732 0.715 0.399 0.399 10.666 1
7 2.916 0.816 0.818 0.502 0.496 12.344 2
8 2.911 0.097 0.097 0.114 0.114 2.060 0
9 2.911 0.704 0.715 0.399 0.401 10.651 1
10 2.913 0.734 0.715 0.399 0.401 10.651 1
11 2.911 0.731 0.715 0.399 0.401 10.666 1
12 2.911 0.734 0.713 0.399 0.399 10.666 1
13 2.911 0.731 0.713 0.399 0.401 10.666 1
14 2.911 0.816 0.818 0.504 0.494 12.360 2
15 2.911 0.732 0.715 0.399 0.401 10.666 1
nate@haswell:~/git/libdivide$ icc-14 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
nate@haswell:~/git/libdivide$ benchmark
1 1.089 0.095 0.095 0.221 0.162 5.020 0
2 1.089 0.095 0.095 0.221 0.162 5.005 0
3 1.110 0.257 0.257 0.389 0.389 6.897 1
4 1.089 0.095 0.095 0.221 0.162 5.005 0
5 1.089 0.257 0.257 0.389 0.389 6.897 1
6 1.110 0.257 0.257 0.389 0.389 6.897 1
7 1.108 0.269 0.284 0.496 0.481 6.912 2
8 1.089 0.095 0.095 0.221 0.162 5.005 0
9 1.097 0.259 0.257 0.389 0.391 6.897 1
10 1.110 0.257 0.257 0.389 0.389 6.912 1
11 1.108 0.257 0.257 0.389 0.389 6.912 1
12 1.108 0.259 0.257 0.389 0.389 6.897 1
13 1.108 0.259 0.257 0.387 0.391 6.897 1
14 1.110 0.271 0.284 0.496 0.481 6.897 2
15 1.108 0.257 0.257 0.389 0.391 6.912 1
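As an aside, the -Wunused-result warning in the gcc-4.7 and gcc-4.8 builds above comes from calling posix_memalign without looking at its return value. It is unrelated to the failures, but it is easy to silence by checking the result. A minimal sketch, assuming a hypothetical wrapper named `alloc_aligned_u32` (not a function in libdivide_benchmark.c) and the same 16-byte alignment the benchmark passes:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_memalign under strict -std= modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns a 16-byte-aligned buffer holding `count`
   uint32_t values, or NULL on failure. Release it with free(). */
uint32_t *alloc_aligned_u32(size_t count)
{
    /* Guard the size computation against overflow. */
    assert(count <= SIZE_MAX / sizeof(uint32_t));

    void *ptr = NULL;
    /* posix_memalign returns 0 on success and an error number otherwise;
       it does not set errno. */
    int err = posix_memalign(&ptr, 16, count * sizeof(uint32_t));
    if (err != 0) {
        return NULL;
    }
    return ptr;
}
```

A caller would then test the returned pointer for NULL before filling the buffer with random data.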
The benchmark failures under gcc 4.8 are caused by a gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61108
I'm having problems with libdivide using the default gcc installed with Ubuntu 10.10. Everything is fine on gcc 4.7.1. Maybe put a note in the README that a certain version of gcc is needed?
When I run the tester everything passes, but when I run the benchmark it prints this a bunch of times:
When compiling the benchmark with this version of gcc there are a lot of warnings: