ridiculousfish / libdivide

Official git repository for libdivide: optimized integer division
http://libdivide.com

problems with gcc version (Ubuntu/Linaro 4.4.4-14ubuntu5.1) 4.4.5 #1

Closed jeffplaisance closed 6 years ago

jeffplaisance commented 11 years ago

I'm having problems with libdivide using the default gcc installed with Ubuntu 10.10. Everything is fine on gcc 4.7.1. Maybe put a note in the README that a certain version of gcc is needed?

When I run the tester everything passes, but when I run the benchmark it prints this a bunch of times:

    50   4.210   0.422   0.422   0.000   0.000   4.974     1
Failure on line 653
Failure on line 654
Failure on line 653
Failure on line 654
Failure on line 653
Failure on line 654

When compiling the benchmark with this version of gcc there are a lot of warnings:

jplaisance@jplaisance:~/source/libdivide$ make benchmark
cc -fstrict-aliasing -W -Wall -g -O3 -msse2 -DLIBDIVIDE_USE_SSE2=1  -lpthread   -o benchmark libdivide_benchmark.c
libdivide_benchmark.c: In function ‘test_many_u64’:
libdivide_benchmark.c:765: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
libdivide_benchmark.c: In function ‘test_many_s64’:
libdivide_benchmark.c:776: warning: format ‘%lld’ expects type ‘long long int’, but argument 3 has type ‘int64_t’
libdivide_benchmark.c: In function ‘random_data’:
libdivide_benchmark.c:790: warning: ignoring return value of ‘posix_memalign’, declared with attribute warn_unused_result
libdivide_benchmark.c: In function ‘mine_s32_vector’:
libdivide_benchmark.c:239: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:238: note: initialized from here
libdivide_benchmark.c:239: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:239: note: initialized from here
libdivide_benchmark.c:239: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:239: note: initialized from here
libdivide_benchmark.c:239: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:239: note: initialized from here
libdivide_benchmark.c: In function ‘mine_s64_vector’:
libdivide_benchmark.c:489: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:488: note: initialized from here
libdivide_benchmark.c:489: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:489: note: initialized from here
libdivide_benchmark.c: In function ‘mine_u32_vector_unswitched’:
libdivide_benchmark.c:155: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:154: note: initialized from here
libdivide_benchmark.c:155: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:155: note: initialized from here
libdivide_benchmark.c:155: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:155: note: initialized from here
libdivide_benchmark.c:155: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:155: note: initialized from here
libdivide_benchmark.c: In function ‘mine_u32_vector’:
libdivide_benchmark.c:127: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:126: note: initialized from here
libdivide_benchmark.c:127: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:127: note: initialized from here
libdivide_benchmark.c:127: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:127: note: initialized from here
libdivide_benchmark.c:127: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:127: note: initialized from here
libdivide_benchmark.c: In function ‘mine_s32_vector_unswitched’:
libdivide_benchmark.c:281: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:280: note: initialized from here
libdivide_benchmark.c:281: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:281: note: initialized from here
libdivide_benchmark.c:281: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:281: note: initialized from here
libdivide_benchmark.c:281: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:281: note: initialized from here
libdivide_benchmark.c: In function ‘mine_u64_vector’:
libdivide_benchmark.c:435: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:434: note: initialized from here
libdivide_benchmark.c:435: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:435: note: initialized from here
libdivide_benchmark.c: In function ‘mine_u64_vector_unswitched’:
libdivide_benchmark.c:422: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:421: note: initialized from here
libdivide_benchmark.c:422: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:422: note: initialized from here
libdivide_benchmark.c: In function ‘mine_s64_vector_unswitched’:
libdivide_benchmark.c:532: warning: dereferencing pointer ‘comps’ does break strict-aliasing rules
libdivide_benchmark.c:531: note: initialized from here
libdivide_benchmark.c:532: warning: dereferencing pointer ‘({anonymous})’ does break strict-aliasing rules
libdivide_benchmark.c:532: note: initialized from here
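
For reference, the three warning classes in the log above (format mismatch, ignored `posix_memalign` result, strict-aliasing violation) each have a conventional fix that does not depend on the gcc version. The sketch below is not taken from libdivide_benchmark.c, and the helper names `format_u64`, `alloc_aligned`, and `load_u32` are hypothetical. It only illustrates the usual patterns: `PRIu64` from `<inttypes.h>` for printing a `uint64_t`, checking the return value of `posix_memalign`, and using `memcpy` instead of a pointer cast to read bytes as another type.

```c
#define _POSIX_C_SOURCE 200112L  /* exposes posix_memalign */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format a uint64_t portably. PRIu64 expands to the conversion that
   matches uint64_t on this platform ("lu" or "llu"), which avoids the
   -Wformat warning seen with a hard-coded %llu. */
static int format_u64(char *buf, size_t len, uint64_t value) {
    return snprintf(buf, len, "%" PRIu64, value);
}

/* Allocate 16-byte-aligned storage and check the result instead of
   ignoring it. Returns NULL on failure. */
static void *alloc_aligned(size_t bytes) {
    void *ptr = NULL;
    if (posix_memalign(&ptr, 16, bytes) != 0) {
        return NULL;
    }
    return ptr;
}

/* Read a uint32_t out of a byte buffer without casting the pointer.
   memcpy is the aliasing-safe way to reinterpret bytes, and optimizing
   compilers typically reduce it to a single load. */
static uint32_t load_u32(const unsigned char *bytes) {
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Whether the aliasing warnings are related to the benchmark failures is a separate question; these patterns only make the build warning-free.
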

nkurz commented 8 years ago

I found the same issues, and came to report them. The benchmark reports errors when compiled with gcc-4.8, but appears to work for gcc-4.7, gcc-4.9, clang-3.5, and icc-14.

ICC's values look to be 3x faster for the system divide, but I'd worry that this is because the compiler has somehow managed to omit part of the loop. Since its times match the other compilers for certain values but not others, I'm wondering if the other compilers might also be optimizing out the test in some cases.
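
One common guard against a compiler discarding benchmark work is to fold every result into a value that is then stored to a `volatile` object. The sketch below assumes nothing about how libdivide_benchmark.c is actually written; `benchmark_sink` and `sum_quotients` are hypothetical names used only to show the pattern.

```c
#include <stddef.h>
#include <stdint.h>

/* Written once per run. Because the object is volatile, the store must
   be performed, so the computation feeding it cannot be removed as
   dead code. */
static volatile uint32_t benchmark_sink;

/* Divide every element by `divisor` and fold the quotients into a sum. */
static uint32_t sum_quotients(const uint32_t *data, size_t count,
                              uint32_t divisor) {
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += data[i] / divisor;
    }
    benchmark_sink = sum;  /* forces the loop's result to be materialized */
    return sum;
}
```

This does not stop the compiler from vectorizing or otherwise restructuring the loop, so timings can still differ between compilers. Comparing the returned sums across builds at least confirms that each one computed the same quotients.
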

nate@haswell:~/git/libdivide$ gcc-4.7 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
libdivide_benchmark.c: In function ‘random_data’:
libdivide_benchmark.c:791:19: warning: ignoring return value of ‘posix_memalign’, declared with attribute warn_unused_result [-Wunused-result]
nate@haswell:~/git/libdivide$ benchmark

 #  system  scalar  scl_us  vector  vec_us   gener  algo

 1   3.538   0.109   0.122   0.517   0.183   0.778     0
 2   3.538   0.101   0.114   0.517   0.175   0.778     0
 3   3.538   0.372   0.387   0.515   0.391  10.101     1
 4   3.538   0.099   0.113   0.515   0.174   0.793     0
 5   3.538   0.366   0.389   0.515   0.399  10.101     1
 6   3.538   0.368   0.389   0.515   0.389  10.117     1
 7   3.538   0.385   0.412   0.517   0.492  10.361     2
 8   3.538   0.099   0.111   0.515   0.174   0.778     0
 9   3.538   0.368   0.389   0.515   0.389  10.101     1
10   3.538   0.374   0.389   0.515   0.389  10.101     1
11   3.536   0.374   0.389   0.515   0.389  10.101     1
12   3.538   0.366   0.389   0.517   0.389  10.101     1
13   3.536   0.368   0.389   0.515   0.389  10.101     1
14   3.538   0.383   0.412   0.517   0.490  10.208     2
15   3.538   0.374   0.389   0.517   0.389  10.101     1

nate@haswell:~/git/libdivide$ gcc-4.8 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
libdivide_benchmark.c: In function ‘random_data’:
libdivide_benchmark.c:791:19: warning: ignoring return value of ‘posix_memalign’, declared with attribute warn_unused_result [-Wunused-result]
     posix_memalign(&ptr, 16, multiple * ITERATIONS * sizeof(uint32_t));
                   ^
nate@haswell:~/git/libdivide$ benchmark

 #  system  scalar  scl_us  vector  vec_us   gener  algo

 1   3.538   0.099   0.103   0.166   0.168   0.793     0
 2   3.536   0.099   0.103   0.166   0.168   0.793     0
 3   3.536   0.288   0.254   0.391   0.391  10.147     1
 4   3.536   0.101   0.103   0.166   0.166   0.793     0
 5   3.536   0.288   0.254   0.391   0.389  10.117     1
 6   3.536   0.286   0.254   0.391   0.391  10.117     1

Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
 7   3.538   0.324   0.298   0.486   0.486  10.300     2
 8   3.538   0.099   0.103   0.166   0.168   0.778     0
 9   3.536   0.288   0.254   0.391   0.391  10.117     1
10   3.538   0.288   0.254   0.391   0.391  10.117     1
11   3.538   0.286   0.254   0.391   0.389  10.117     1
12   3.536   0.288   0.254   0.391   0.391  10.101     1
13   3.538   0.286   0.254   0.391   0.389  10.132     1
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
Failure on line 651
Failure on line 652
14   3.536   0.324   0.298   0.486   0.486  10.315     2
15   3.538   0.288   0.254   0.391   0.391  10.117     1

nate@haswell:~/git/libdivide$ gcc-4.9 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
nate@haswell:~/git/libdivide$ benchmark

 #  system  scalar  scl_us  vector  vec_us   gener  algo

 1   3.536   0.116   0.118   0.517   0.147   0.687     0
 2   3.536   0.103   0.105   0.561   0.134   0.687     0
 3   3.538   0.391   0.393   0.515   0.404   4.715     1
 4   3.538   0.099   0.101   0.517   0.132   0.687     0
 5   3.536   0.391   0.391   0.515   0.404   4.715     1
 6   3.538   0.391   0.391   0.517   0.395   4.715     1
 7   3.538   0.397   0.414   0.519   0.498   5.051     2
 8   3.538   0.099   0.101   0.517   0.132   0.687     0
 9   3.538   0.391   0.391   0.517   0.404   4.715     1
10   3.536   0.391   0.391   0.515   0.404   4.715     1
11   3.536   0.391   0.391   0.515   0.404   4.715     1
12   3.536   0.391   0.397   0.515   0.402   4.715     1
13   3.538   0.391   0.391   0.515   0.395   4.715     1
14   3.538   0.397   0.414   0.519   0.498   5.051     2
15   3.536   0.391   0.391   0.515   0.404   4.715     1

nate@haswell:~/git/libdivide$ clang-3.5 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
In file included from libdivide_benchmark.c:1:
./libdivide.h:290:29: warning: variable 'result' is uninitialized when used here [-Wuninitialized]
    result = _mm_cmpeq_epi8(result, result); //all 1s
                            ^~
./libdivide.h:289:5: note: variable 'result' is declared here
    __m128i result; //we don't care what its contents are
    ^
1 warning generated.
nate@haswell:~/git/libdivide$ benchmark

 #  system  scalar  scl_us  vector  vec_us   gener  algo

 1   2.911   0.097   0.097   0.114   0.114   2.075     0
 2   2.935   0.097   0.097   0.116   0.114   2.075     0
 3   2.937   0.706   0.713   0.397   0.401  10.666     1
 4   2.918   0.097   0.099   0.114   0.114   2.060     0
 5   2.918   0.731   0.713   0.397   0.401  10.651     1
 6   2.918   0.732   0.715   0.399   0.399  10.666     1
 7   2.916   0.816   0.818   0.502   0.496  12.344     2
 8   2.911   0.097   0.097   0.114   0.114   2.060     0
 9   2.911   0.704   0.715   0.399   0.401  10.651     1
10   2.913   0.734   0.715   0.399   0.401  10.651     1
11   2.911   0.731   0.715   0.399   0.401  10.666     1
12   2.911   0.734   0.713   0.399   0.399  10.666     1
13   2.911   0.731   0.713   0.399   0.401  10.666     1
14   2.911   0.816   0.818   0.504   0.494  12.360     2
15   2.911   0.732   0.715   0.399   0.401  10.666     1
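
The clang-3.5 build log above flags `_mm_cmpeq_epi8(result, result)` because `result` is read before it is initialized. Comparing a register with itself is a known trick for producing all-one bits, but in C, reading an uninitialized variable is not well defined, so the compiler is entitled to complain. A minimal sketch of two ways to get the same value from defined inputs follows. It assumes an x86 target with SSE2, and the function names are hypothetical, not libdivide's.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* All 128 bits set, built from a constant. Compilers are free to emit
   the same compare-with-self instruction for this. */
static __m128i all_ones(void) {
    return _mm_set1_epi32(-1);
}

/* Same value via the compare trick, but on an initialized operand:
   every byte of zero equals itself, so every result byte is 0xFF. */
static __m128i all_ones_cmpeq(void) {
    __m128i zero = _mm_setzero_si128();
    return _mm_cmpeq_epi8(zero, zero);
}

/* Collect the top bit of each of the 16 bytes; 0xFFFF means all set. */
static int byte_mask(__m128i v) {
    return _mm_movemask_epi8(v);
}
```
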

nate@haswell:~/git/libdivide$ icc-14 -march=native -fstrict-aliasing -W -Wall -g -O3 -DLIBDIVIDE_USE_SSE2=1 -o benchmark libdivide_benchmark.c -lpthread
nate@haswell:~/git/libdivide$ benchmark

 #  system  scalar  scl_us  vector  vec_us   gener  algo

 1   1.089   0.095   0.095   0.221   0.162   5.020     0
 2   1.089   0.095   0.095   0.221   0.162   5.005     0
 3   1.110   0.257   0.257   0.389   0.389   6.897     1
 4   1.089   0.095   0.095   0.221   0.162   5.005     0
 5   1.089   0.257   0.257   0.389   0.389   6.897     1
 6   1.110   0.257   0.257   0.389   0.389   6.897     1
 7   1.108   0.269   0.284   0.496   0.481   6.912     2
 8   1.089   0.095   0.095   0.221   0.162   5.005     0
 9   1.097   0.259   0.257   0.389   0.391   6.897     1
10   1.110   0.257   0.257   0.389   0.389   6.912     1
11   1.108   0.257   0.257   0.389   0.389   6.912     1
12   1.108   0.259   0.257   0.389   0.389   6.897     1
13   1.108   0.259   0.257   0.387   0.391   6.897     1
14   1.110   0.271   0.284   0.496   0.481   6.897     2
15   1.108   0.257   0.257   0.389   0.391   6.912     1

ridiculousfish commented 8 years ago

The benchmark errors in gcc 4.8 are a gcc bug. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61108.