Open armanbilge opened 2 years ago
Minimized code:
#include <stdio.h>
#include <x86intrin.h>
#include <float.h>
#include <limits.h>
#include <string.h>

#define SLEEF_ALWAYS_INLINE __attribute__((always_inline))
#define SLEEF_INLINE static inline
#define SLEEF_CONST const
#include <sleefinline_avx2.h>

int main(void) {
  __m256d x = _mm256_set1_pd(1.0);
  x = Sleef_expd4_u10avx2(x);
  printf("%f %f %f %f\n", x[0], x[1], x[2], x[3]);
}
If I compile with gcc -mavx2 -mfma -o bug bug.c I get: 2.718282 2.718282 1.000000 1.000000
If I compile with gcc -mavx2 -mfma -O3 -o bug bug.c I get: 2.718282 2.718282 2.718282 2.718282
I didn't try other combinations of functions or -O levels. BTW this might only be a documentation "bug" :)
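In case it helps when trying other functions or -O levels, here is a minimal sketch of a lane checker. It is not part of SLEEF: the name bad_lanes, the bitmask return value, and the tolerance are assumptions made for illustration, and it only uses the C standard library.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not a SLEEF API): returns a bitmask with bit i set
   when lanes[i] differs from `expected` by more than `tol`.
   Intended for small vectors (n no larger than the bit width of unsigned). */
static unsigned bad_lanes(const double *lanes, size_t n,
                          double expected, double tol)
{
    assert(tol >= 0.0);
    unsigned mask = 0;
    for (size_t i = 0; i < n; i++) {
        /* The negated comparison also flags NaN lanes. */
        if (!(fabs(lanes[i] - expected) <= tol))
            mask |= 1u << i;
    }
    return mask;
}
```

One way to use it would be to store the vector into a double[4] with _mm256_storeu_pd and call bad_lanes(out, 4, 2.718281828459045, 1e-6). With the values printed above, the build without -O would flag lanes 2 and 3, and the -O3 build would flag none.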
Thanks for a fantastic library!