OpenMathLib / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

Test kernel_regress:skx_avx fails on RISC-V platform #4446

Status: Closed (leavelet closed 2 months ago)

leavelet commented 8 months ago

Environment:

OpenBLAS version: release 0.3.26
OS: revyos
CPU: Sophgo sg2042, RISC-V rv64imafdc with rvv 0.71
Compiler: g++ 10.4, THead version. https://github.com/revyos/gcc/tree/revyos-gcc10.4-thead-dev
Compile command: make HOSTCC=gcc-10 TARGET=C910V CC=riscv64-linux-gnu-gcc-10 FC=riscv64-linux-gnu-gfortran-10 -j 64

Error log:

TEST 38/40 kernel_regress:skx_avx [FAIL]
  ERR: test_kernel_regress.c:50  expected 0.000e+00, got 2.719e+04 (diff -2.719e+04, tol 1.000e-10)

By the way, the risc-v branch gets stuck on the line below:

OPENBLAS_NUM_THREADS=2 ./cblat3 < ./cblat3.dat

leavelet commented 8 months ago

@RevySR

martin-frbg commented 8 months ago

kernel_regress:skx_avx is a DGEMM test. Maybe we should rename it, as its origin in an AVX512 bug in the SkylakeX kernel is irrelevant today...

In CI it works with a different vendor toolchain based on GCC 10.2 (see .github/workflows/c910v.yml for the URL), but of course the tests there only run under qemu rather than on the actual hardware.
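
For readers who want to narrow down a failure like "expected 0.000e+00, got 2.719e+04", one option is to compare the library's DGEMM output against a naive reference on the same inputs. The following is only a minimal sketch under assumptions: ref_dgemm and max_abs_diff are hypothetical helper names, the layout is column-major with no transposes, and this is not what test_kernel_regress.c itself does.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Naive column-major C = alpha*A*B + beta*C, no transposes.
 * A is m x k, B is k x n, C is m x n.
 * ref_dgemm is a hypothetical helper, not an OpenBLAS function. */
void ref_dgemm(size_t m, size_t n, size_t k, double alpha,
               const double *a, size_t lda,
               const double *b, size_t ldb,
               double beta, double *c, size_t ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double acc = 0.0;
            for (size_t p = 0; p < k; p++)
                acc += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * acc + beta * c[i + j * ldc];
        }
    }
}

/* Largest absolute element-wise difference between two m x n
 * column-major matrices (hypothetical helper). */
double max_abs_diff(size_t m, size_t n,
                    const double *x, size_t ldx,
                    const double *y, size_t ldy)
{
    double worst = 0.0;
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double d = fabs(x[i + j * ldx] - y[i + j * ldy]);
            if (d > worst)
                worst = d;
        }
    }
    return worst;
}
```

Running the optimized DGEMM and ref_dgemm on identical inputs and passing both results to max_abs_diff gives a number that can be compared against a tolerance. A value in the thousands, as in the log above, would point at a wrong kernel result rather than at rounding noise.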

leavelet commented 8 months ago

Since the CI with GCC 10.2 works fine, maybe it is a vendor problem. I shall work with Revy to resolve it.

martin-frbg commented 8 months ago

Any updates on this? I've since merged the risc-v branch as I could not reproduce the problems in CI or local qemu, but I lack real C910V hardware at the moment.

leavelet commented 8 months ago

The GEMM issue is fixed in #4454. We have found another issue in kernel/riscv64/nrm2_vector.c, which hasn't been fixed yet. Keeping this issue open until we fix the nrm2 issue or closing it to open a new one both work fine; I'm not sure which one is better.

martin-frbg commented 8 months ago

Thanks. We can keep this one open for simplicity (unless you expect this to take long, in which case opening a new issue with an appropriate title might help others find it faster). It is annoying that it seems to depend so much on the compiler version, or on qemu vs. actual hardware.

martin-frbg commented 3 months ago

I cannot reproduce either of these issues on MilkV Pioneer with current develop and a thead gcc built from the current state of their source repository.