OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License
6.28k stars 1.49k forks source link

v.0.3.0 test fail on power9 #1571

Closed omula closed 6 years ago

omula commented 6 years ago

I have tried to compile latest version on power9 and it failed on one of the tests. TEST 17/22 rot:csrot_inc_0 [FAIL] ERR: test_rot.c:109 expected -2.148e-01, got 3.125e-01 (diff -5.273e-01, tol 1.000e-04)

I have launched the compilation with: make -j 160 TARGET=POWER8 BINARY=64 USE_MASS=1 I modified Makefile.power so that -mcpu and -mtune equals to power9. MASSPATH = /opt/ibm/xlmass/8.1.6/lib EXTRALIB += -L$(MASSPATH) -lmass -lmassvp9 -lmass_simdp9

martin-frbg commented 6 years ago

Weird. I had understood from #1469 that this test passes on power8. Can you check if setting CSROTKERNEL=zrot.c in kernel/power/KERNEL.POWER8 solves it ? (I think it should, as the code path for incx=0 there is the same as in the generic implementation that is known to work. Probably the build system is picking up the old zrot.S instead).

quickwritereader commented 6 years ago

try uncomment line @martin-frbg pointed.

One could add crot.c based on srot.c if srot passes test. As crot and srot mikrokernels are same and just need c implementation changes. I will try to do it tomorrow.

(Afaik power8 single precision kernels issue is still open. I am thinking to start that next month and at the same time I am intending to improve initial double prec kernels.)

omula commented 6 years ago

It is still failing with the same error after adding CSROTKERNEL=zrot.c in kernel/power/KERNEL.POWER8.

martin-frbg commented 6 years ago

Did you do a full rebuild after make clean ?

omula commented 6 years ago

Yes, I verfied twice.

quickwritereader commented 6 years ago

Generic one is CROTKERNEL = ../arm/zrot.c

martin-frbg commented 6 years ago

Wait, did i write CSROTKERNEL above :frowning: ?? CROTKERNEL was what I meant (and if you compare the inc=0 section of zrot.c to the arm/zrot.c they are identical, hence my comment above that the c code is known to work.

omula commented 6 years ago

Now it worked :)

quickwritereader commented 6 years ago

frankly I just searched and could not find on which place it compares incx and incy against 0.

martin-frbg commented 6 years ago

It doesn't - I should have written the "not inc=1 section" perhaps (where inc=1 gets optimized with assembly kernel, but the "all other values of inc_x,inc_y" section is identical in the two files). Basically what the utest checks is that the code does not do an immediate exit when there is "nothing to do"

quickwritereader commented 6 years ago

Yes it falls back. Tomorrow if I have time I will replace it with modified srot. but for rest power8 single precision I am intending to start next month offtopic: @martin-frbg are you professor of mathematics or data scientist ?

martin-frbg commented 6 years ago

offtopic - not a professor and certainly not mathematics. Computational chemistry degree and years of experience reading and writing ugly code that mostly gets the job done.