OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

make TARGET=PPCG4 results in illegal instruction error during tests #4901

Open hpckurt opened 1 week ago

hpckurt commented 1 week ago

I've attempted to build OpenBLAS for PowerPC using make TARGET=PPCG4 CC=gcc-11 FC=gfortran-11. I've tried building several different versions now, all resulting in the same error (so I don't believe it's a regression). I'm currently running Debian sid.

It compiles fine, but errors out in the tests.

It errors out immediately after OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./test_sbgemm > SBBLAT3.SUMM

The final test it passes is the following:

OMP_NUM_THREADS=2 ./zblat1
 Complex BLAS Test Program Results

 Test of subprogram number  1            ZDOTC 
                                    ----- PASS -----

 Test of subprogram number  2            ZDOTU 
                                    ----- PASS -----

 Test of subprogram number  3            ZAXPY 
                                    ----- PASS -----

 Test of subprogram number  4            ZCOPY 
                                    ----- PASS -----

 Test of subprogram number  5            ZSWAP 
                                    ----- PASS -----

 Test of subprogram number  6            DZNRM2
                                    ----- PASS -----

 Test of subprogram number  7            DZASUM
                                    ----- PASS -----

 Test of subprogram number  8            ZSCAL 
                                    ----- PASS -----

 Test of subprogram number  9            ZDSCAL
                                    ----- PASS -----

 Test of subprogram number 10            IZAMAX
                                    ----- PASS -----

martin-frbg commented 1 week ago

This is a bit strange, as the default SBGEMM kernel is written in plain C, which suggests that gcc-11 may be picking instructions not suitable for this CPU. Can you try a more recent compiler, or check whether it builds with BUILD_BFLOAT16=0?

martin-frbg commented 1 week ago

btw, is that an actual G4, qemu, or a more recent POWER8/9/10 with TARGET=PPCG4?

hpckurt commented 1 week ago

It's not actually a G4. I'm trying to compile OpenBLAS on the Wii U, so the Espresso CPU. The G4 was the closest target (from what I can tell), since it's a 32-bit PPC CPU.

EDIT: the illegal instructions error also happens in qemu

martin-frbg commented 1 week ago

I don't know much about these older CPUs (where support is carried over from the original GotoBLAS), but I guess the difference could be AltiVec instruction support. Maybe the PPC970 target would fare better?

hpckurt commented 1 week ago

The Espresso actually doesn't have AltiVec support, which is one of the reasons why the G4 target was chosen.

Since it's broken on qemu as well, does that potentially mean the G4 target is broken in general?
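One way to test the AltiVec hypothesis is to disassemble the built library and look for vector mnemonics. The following is a minimal sketch, not a definitive check: the helper name count_altivec is hypothetical, the mnemonic list is only a small sample of AltiVec instructions, and the objdump invocation in the comment assumes a PowerPC-capable binutils and a library file named libopenblas.a.

```shell
#!/bin/sh
# Count disassembly lines (read from stdin) that contain an AltiVec mnemonic.
# The mnemonic list is a small illustrative sample, not the complete AltiVec set.
count_altivec() {
    # grep -c exits non-zero when the count is 0, so mask that exit status.
    grep -cwE 'lvx|stvx|vmaddfp|vperm|vspltw' || true
}

# Real use (assumed tool and file names; adjust to your toolchain):
#   objdump -d libopenblas.a | count_altivec

# Self-contained demonstration on three hand-written disassembly lines.
count_altivec <<'EOF'
  10:   lvx     v0,0,r3
  14:   vmaddfp v1,v1,v2,v0
  18:   addi    r3,r3,16
EOF
# prints 2
```

A non-zero count on the real library would only show that vector instructions are present somewhere in the build; it would not prove that one of them is the instruction that faults.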

martin-frbg commented 1 week ago

Possibly, yes - unless there happens to be a problem with qemu's emulation of that old CPU. I do not have (and never had) this old hardware, and there is no machine of this kind available through the GCC Compile Farm or any of the usual CI providers (as far as I know).

martin-frbg commented 1 week ago

... but as stated above, SBGEMM itself should only use plain C code, so the main question would be whether the other tests involving more traditional data types (or a build with BUILD_BFLOAT16 set to zero) also run into illegal instructions.

hpckurt commented 1 week ago

So what would you recommend at this point? Try an older gcc version? (For what it's worth, I have the same error using clang)

EDIT: I'll try a new build with bfloat16=0

hpckurt commented 1 week ago

Building without bfloat16 results in the same error. Same thing with gcc 4.8.

OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./sblat3 < ./sblat3.dat

Program received signal SIGILL: Illegal instruction.

Backtrace for this error:
#0  0xFF12697
qemu: uncaught target signal 4 (Illegal instruction) - core dumped
Illegal instruction
make[1]: *** [level3] Error 132
make[1]: *** Waiting for unfinished jobs....
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./dblat2 < ./dblat2.dat
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./cblat2 < ./cblat2.dat
OPENBLAS_NUM_THREADS=1 OMP_NUM_THREADS=1 ./zblat2 < ./zblat2.dat
rm -f ?BLAT2.SUMM
OMP_NUM_THREADS=2 ./sblat2 < ./sblat2.dat
OMP_NUM_THREADS=2 ./dblat2 < ./dblat2.dat
OMP_NUM_THREADS=2 ./cblat2 < ./cblat2.dat
OMP_NUM_THREADS=2 ./zblat2 < ./zblat2.dat
make[1]: Leaving directory `/root/OpenBLAS-0.3.28/test'
make: *** [tests] Error 2

There's something about the sblat3 test it doesn't like.

If you need a build server, I can always host one for you.
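
For reference, the "Error 132" that make reports in the log above is consistent with the SIGILL message: shells conventionally report a process killed by a signal as 128 plus the signal number, and qemu reports target signal 4. The sketch below decodes such a status; the function name signal_from_status is hypothetical and only for illustration.

```shell
#!/bin/sh
# Translate an exit status above 128 into the name of the terminating signal.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the signal name without the SIG prefix.
        kill -l "$((status - 128))"
    else
        echo "none"
    fi
}

signal_from_status 132   # 132 - 128 = 4, which is ILL (SIGILL) on Linux
```

This only confirms how the test process died. It does not identify which instruction was rejected.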