Closed cyrilbouvier closed 3 years ago
Commit 1569db4 did fix the homebrew problem with the OS X job but revealed a failure for test-fger
The problem does not seem to be specific to OS X, as I was able to reproduce it. I had one failed run of test-fger
on more than 100 runs, with the following error message:
Checking Modular_implem<int64_t, uint64_t, uint64_t> modulo 165033971... FAIL
a :132720555
m :44, n : 1
incx :1, incy : 1, ldC : 1
Error C[0,0]=18463374 D[0,0]=24729322
Error C[1,0]=54479037 D[1,0]=144408618
Error C[2,0]=67881147 D[2,0]=99501009
Error C[3,0]=84658322 D[3,0]=141056375
Error C[4,0]=35823758 D[4,0]=44270387
Error C[5,0]=35369714 D[5,0]=15002300
Error C[6,0]=114908161 D[6,0]=42428348
Error C[7,0]=53084340 D[7,0]=15942759
Error C[8,0]=94996110 D[8,0]=104641692
Error C[9,0]=39583949 D[9,0]=143044872
Error C[10,0]=140921121 D[10,0]=136917392
Error C[11,0]=43433855 D[11,0]=74462683
Error C[12,0]=143759473 D[12,0]=89899069
Error C[13,0]=96683823 D[13,0]=75362525
Error C[14,0]=28275651 D[14,0]=78298271
Error C[15,0]=149033254 D[15,0]=138700481
Error C[16,0]=126252883 D[16,0]=66669534
Error C[17,0]=28226674 D[17,0]=51199424
Error C[18,0]=81191980 D[18,0]=144124893
Error C[19,0]=69444028 D[19,0]=61800979
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
X
FAILED
Edit: I was able to find a seed that trigger the error:
./test-fger -s 355057717
Edit 2: It seems this issue is already under investigation: see issue #307 and PR #289
A more exhaustive list of intrinsics not recognized by gcc 4.9:
_mm512_castps512_ps256
: see original post_mm512_mullo_epi64
: used in mullo for Simd512\_mm512_castpd256_pd512
: used in blendv and hadd for Simd512\_mm512_castpd512_pd256
: used in blendv and hadd for Simd512\_mm512_castps256_ps512
: used in blendv and hadd for Simd512\_mm512_castps_pd
: used in hadd for Simd512\_mm512_castsi512_pd
: used in eq, lesser, lesser_eq, greater, greater_eq for Simd512\_mm512_castsi512_ps
: used in eq, lesser, lesser_eq, greater, greater_eq for Simd512\_mm512_cmpgt_epu64_mask
: used in greater for Simd512\_mm512_cvtepi64_pd
: used in mod for Simd512\_mm512_cvtpd_epi64
: used in mod for Simd512\_mm512_insertf32x8
: used in hadd for for Simd512\OK to merge this PR as is in order to fix the hombrew problem. Regarding the support for 4.9 as minimal gcc version, it seems that the numerous avx512 undefined intrinsics is a strong enough argument to move to gcc-5 as minimal supported gcc: we should not waste our time writing workarounds. Still the one you propose deserves its own PR, if it improves the current implementation. The fger bug is already mentionned in #307 and need our attention!
LGTM . Merging
The Goal of the PR is to fix Travis CI. Currently, two out of the four jobs fail: https://travis-ci.org/github/linbox-team/fflas-ffpack/builds/741688422
First, commit 1569db4c51 should fix the homebrew problem with the OS X job (a similar changes was made in Givaro a few months ago).
The other problem is with the job using gcc 4.9 (see issue #313). To fix this issue, I see two possibilities:
_mm512_castps512_ps256
, I was able to remove it by rewriting the blendv method insimd512_float.inl
:The advantage of this new code is that it is faster (only two instructions, as the cast is here for compilation only). The main drawback is that
_mm512_movepi32_mask
requires AVX512DQ and not just AVX512L. But , according to issue #312 , other parts of the simd code already silently require AVX512DQ.More work is needed to remove the other mising intrinsics if we go with option 2.