Closed svillemot closed 2 years ago
I can reproduce this issue with mips64el QEMU.
Oops, I wonder what happened here. It almost looks as if the sense of the conditional is inverted (but more likely the right code was inserted in the wrong location).
Changing the bc1t to bc1f does appear to "fix" this, but I have not yet given a single thought to the logic, nor checked what it does to the LAPACK test suite.
Compiling the latest code on the 3A4000 with Loongnix GNU/Linux 20 RC3 presents the following problems:
/usr/include/mips64el-linux-gnuabi64/gnu/stubs.h:41:11: fatal error: gnu/stubs-n64_hard_2008.h: No such file or directory
 # include <gnu/stubs-n64_hard_2008.h>
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/features.h:448,
                 from /usr/include/mips64el-linux-gnuabi64/bits/libc-header-start.h:33,
                 from /usr/include/stdio.h:27,
                 from axpy.c:39:
/usr/include/mips64el-linux-gnuabi64/gnu/stubs.h:41:11: fatal error: gnu/stubs-n64_hard_2008.h: No such file or directory
 # include <gnu/stubs-n64_hard_2008.h>
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
compilation terminated.
In file included from /usr/include/features.h:448,
                 from /usr/include/mips64el-linux-gnuabi64/bits/libc-header-start.h:33,
                 from /usr/include/stdio.h:27,
                 from copy.c:39:
/usr/include/mips64el-linux-gnuabi64/gnu/stubs.h:41:11: fatal error: gnu/stubs-n64_hard_2008.h: No such file or directory
 # include <gnu/stubs-n64_hard_2008.h>
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: [Makefile:822:saxpy.o] Error 1
make[1]: Waiting for unfinished jobs....
make[1]: [Makefile:849:sscal.o] Error 1
make[1]: [Makefile:876:scopy.o] Error 1
In file included from /usr/include/features.h:448,
                 from /usr/include/mips64el-linux-gnuabi64/bits/libc-header-start.h:33,
                 from /usr/include/stdio.h:27,
                 from swap.c:39:
/usr/include/mips64el-linux-gnuabi64/gnu/stubs.h:41:11: fatal error: gnu/stubs-n64_hard_2008.h: No such file or directory
 # include <gnu/stubs-n64_hard_2008.h>
           ^~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [Makefile:894:sswap.o] Error 1
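The header name in the error suggests that the compiler is targeting the n64 hard-float ABI with the 2008 NaN encoding, while the installed libc development files may not ship the matching stubs variant. A minimal sketch for checking which variants are installed follows; the function name `list_glibc_stubs` is made up for illustration, and the default directory is simply the one from the error message above.

```shell
#!/bin/sh
# List the glibc "stubs" header variants present in a directory.
# list_glibc_stubs is a hypothetical helper name. The default directory
# is copied from the error message in this thread; pass another
# directory as the first argument if yours differs.
list_glibc_stubs() {
    dir=${1:-/usr/include/mips64el-linux-gnuabi64/gnu}
    for f in "$dir"/stubs-*.h; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}
```

Running `list_glibc_stubs` and comparing the result with `gcc -dM -E - </dev/null | grep __mips_nan2008` (adding whatever flags the build passes to the compiler) may show whether the compiler expects a header variant that is not installed. This is only a diagnostic aid, not a confirmed explanation of the failure.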
So I reset to cce4b1d temporarily, and all the utests passed:
TEST 1/35 max:smax_zero [OK]
TEST 2/35 max:dmax_positive [OK]
TEST 3/35 max:smax_negative [OK]
TEST 4/35 min:smin_zero [OK]
TEST 5/35 min:dmin_positive [OK]
TEST 6/35 min:smin_negative [OK]
TEST 7/35 amax:damax [OK]
TEST 8/35 amax:samax [OK]
TEST 9/35 ismax:negative_step_2 [OK]
TEST 10/35 ismax:positive_step_2 [OK]
TEST 11/35 ismin:negative_step_2 [OK]
TEST 12/35 ismin:positive_step_2 [OK]
TEST 13/35 drotmg:drotmg_D1_big_D2_big_flag_zero [OK]
TEST 14/35 drotmg:rotmg_D1eqD2_X1eqX2 [OK]
TEST 15/35 drotmg:rotmg_issue1452 [OK]
TEST 16/35 drotmg:rotmg [OK]
TEST 17/35 axpy:caxpy_inc_0 [OK]
TEST 18/35 axpy:saxpy_inc_0 [OK]
TEST 19/35 axpy:zaxpy_inc_0 [OK]
TEST 20/35 axpy:daxpy_inc_0 [OK]
TEST 21/35 zdotu:zdotu_offset_1 [OK]
TEST 22/35 zdotu:zdotu_n_1 [OK]
TEST 23/35 dsdot:dsdot_n_1 [OK]
TEST 24/35 swap:cswap_inc_0 [OK]
TEST 25/35 swap:sswap_inc_0 [OK]
TEST 26/35 swap:zswap_inc_0 [OK]
TEST 27/35 swap:dswap_inc_0 [OK]
TEST 28/35 rot:csrot_inc_0 [OK]
TEST 29/35 rot:srot_inc_0 [OK]
TEST 30/35 rot:zdrot_inc_0 [OK]
TEST 31/35 rot:drot_inc_0 [OK]
TEST 32/35 dnrm2:dnrm2_tiny [OK]
TEST 33/35 dnrm2:dnrm2_inf [OK]
TEST 34/35 fork:safety [OK]
TEST 35/35 fork:safety_after_fork_in_parent [OK]
RESULTS: 35 tests (35 ok, 0 failed, 0 skipped) ran in 530 ms
Here, reverting cce4b1d9562d60b875982ce9587c68fbac666ec8 fixes dnrm2_inf but not dnrm2_tiny, as explained above.
Sorry, it's my fault. cce4b1d only fixes dnrm2_tiny on Loongson's mips64el machines; mips64el machines from other manufacturers still fail. Strangely, cce4b1d doesn't cause an additional dnrm2_inf failure when I use mips64el QEMU.
TEST 28/35 rot:csrot_inc_0 [OK]
TEST 29/35 rot:srot_inc_0 [OK]
TEST 30/35 rot:zdrot_inc_0 [OK]
TEST 31/35 rot:drot_inc_0 [OK]
TEST 32/35 dnrm2:dnrm2_tiny [FAIL]
  ERR: test_dnrm2.c:65 expected 0.000e+00, got inf (diff -inf, tol 1.000e-13)
TEST 33/35 dnrm2:dnrm2_inf [OK]
TEST 34/35 fork:safety [OK]
TEST 35/35 fork:safety_after_fork_in_parent [OK]
RESULTS: 35 tests (34 ok, 1 failed, 0 skipped) ran in 39930 ms
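When comparing runs across machines, it can help to reduce utest output like the above to just the failing test names. A rough sketch, assuming each result line has the form `TEST n/m suite:name [STATUS]`; the helper name `failed_tests` is made up for illustration.

```shell
#!/bin/sh
# Print the names of failing tests from utest output read on stdin.
# failed_tests is a hypothetical helper name. It assumes result lines
# look like "TEST 32/35 dnrm2:dnrm2_tiny [FAIL]", as in this thread.
failed_tests() {
    awk '$1 == "TEST" && $4 == "[FAIL]" { print $3 }'
}
```

For example, saving the test output to a file and running `failed_tests < utest.log` (the file name is arbitrary) would list only the failing entries, which makes it easier to diff results between commits.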
Thanks @XiWeiGu, I confirm that your latest commit fixes the issue for me.
I'm still encountering the error:
TEST 32/36 dnrm2:dnrm2_inf [OK]
TEST 33/36 dnrm2:dnrm2_tiny [FAIL]
  ERR: test_dnrm2.c:65 expected 0.000e+00, got inf (diff -inf, tol 1.000e-13)
TEST 34/36 potrf:bug_695 [OK]
TEST 35/36 potrf:smoketest_trivial [OK]
TEST 36/36 kernel_regress:skx_avx [OK]
RESULTS: 36 tests (35 ok, 1 failed, 0 skipped) ran in 8 ms
make[1]: [run_test] Error 1
make: [tests] Error 2
I notice that even though I changed MTC1 to MTC as suggested in https://github.com/xianyi/OpenBLAS/pull/3763/commits/365936ae1b1dfa2f50b3e65c68ae95babc6f2af2, whenever I run extras/install_openblas.sh, TEST 33/36 still fails and the MTC reverts to MTC1.
Where does your extras/install_openblas.sh come from? From your description it looks as if that overwrites everything with a fresh download of an older, unfixed version.
I got it from https://github.com/kaldi-asr/kaldi and ran make after cloning the repository with git clone.
Well, unless you changed the version variable at the top of the script it will download a release version from almost two years ago. No wonder that it brings this problem (and doubtlessly several others) back
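A quick way to see which release such an installer script will fetch is to read the version variable out of it. The sketch below assumes the script pins the release on a line of the form `OPENBLAS_VERSION=x.y.z` near the top; the helper name `pinned_openblas_version` is made up for illustration.

```shell
#!/bin/sh
# Print the OpenBLAS release that an installer script is pinned to.
# pinned_openblas_version is a hypothetical helper name. It assumes the
# script contains a line such as "OPENBLAS_VERSION=0.3.13".
pinned_openblas_version() {
    # Keep only the first matching assignment, stripped of the variable name.
    sed -n 's/^OPENBLAS_VERSION=//p' "$1" | head -n 1
}
```

Calling `pinned_openblas_version extras/install_openblas.sh` from the Kaldi checkout should then print the release the script would download, which can be compared against the release that contains the fix.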
Actually, I updated extras/install_openblas.sh from this:
OPENBLAS_VERSION=0.3.13
WGET=${WGET:-wget}
set -e
if ! command -v gfortran 2>/dev/null; then
  echo "$0: gfortran is not installed. Please install it, e.g. by:"
  echo " apt-get install gfortran"
  echo "(if on Debian or Ubuntu), or:"
  echo " yum install gcc-gfortran"
  echo "(if on RedHat/CentOS). On a Mac, if brew is installed, it's:"
  echo " brew install gfortran"
  exit 1
fi
tarball=OpenBLAS-$OPENBLAS_VERSION.tar.gz
rm -rf xianyi-OpenBLAS-* OpenBLAS OpenBLAS-*.tar.gz
if [ -d "$DOWNLOAD_DIR" ]; then
  cp -p "$DOWNLOAD_DIR/$tarball" .
else
  url=$($WGET -qO- "https://api.github.com/repos/xianyi/OpenBLAS/releases/tags/v${OPENBLAS_VERSION}" | python -c 'import sys,json;print(json.load(sys.stdin)["tarball_url"])')
  test -n "$url"
  $WGET -t3 -nv -O $tarball "$url"
fi
tar xzf $tarball
mv xianyi-OpenBLAS-* OpenBLAS
make PREFIX=$(pwd)/OpenBLAS/install USE_LOCKING=1 USE_THREAD=0 -C OpenBLAS all install
if [ $? -eq 0 ]; then
  echo "OpenBLAS is installed successfully."
  rm $tarball
fi
to this:
OPENBLAS_VERSION=0.3.21
WGET=${WGET:-wget}
set -e
if ! command -v gfortran 2>/dev/null; then
  echo "$0: gfortran is not installed. Please install it, e.g. by:"
  echo " apt-get install gfortran"
  echo "(if on Debian or Ubuntu), or:"
  echo " yum install gcc-gfortran"
  echo "(if on RedHat/CentOS). On a Mac, if brew is installed, it's:"
  echo " brew install gfortran"
  exit 1
fi
tarball=OpenBLAS-$OPENBLAS_VERSION.tar.gz
rm -rf xianyi-OpenBLAS-* OpenBLAS OpenBLAS-*.tar.gz
if [ -d "$DOWNLOAD_DIR" ]; then
  cp -p "$DOWNLOAD_DIR/$tarball" .
else
  url=$($WGET -qO- "https://api.github.com/repos/xianyi/OpenBLAS/releases/tags/v${OPENBLAS_VERSION}" | python3 -c 'import sys,json;print(json.load(sys.stdin)["tarball_url"])')
  test -n "$url"
  $WGET -t3 -nv -O $tarball "$url"
fi
tar xzf $tarball
mv xianyi-OpenBLAS-* OpenBLAS
make PREFIX=$(pwd)/OpenBLAS/install USE_LOCKING=1 USE_THREAD=0 -C OpenBLAS all install
if [ $? -eq 0 ]; then
  echo "OpenBLAS is installed successfully."
  rm $tarball
fi
By the way, I'm using macOS 12.5.1.
OK, so at least you are downloading and unpacking 0.3.21 each time before you build. However, the fix was only added after 0.3.21 was released. I suggest you either use git clone to fetch the current develop branch of OpenBLAS instead of the last release, or just run the make... line from the script, instead of the full script, after manually patching the MTC/MTC1 line.
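Before rebuilding, it may be worth confirming that the manual patch has not been overwritten by a fresh download. The following is a crude sketch that only checks whether the old mnemonic (MTC1, per the earlier comment) still appears as a whole word in a given file. The helper name `patch_still_applied` is made up, and the file to check is left as an argument because this thread does not name it.

```shell
#!/bin/sh
# Exit status 0 if FILE no longer mentions OLD as a whole word,
# 1 if it still does, and 2 if FILE does not exist.
# patch_still_applied is a hypothetical helper name; which file to
# check is an assumption left to the caller.
patch_still_applied() {
    file=$1
    old=$2
    [ -f "$file" ] || return 2
    if grep -qw -- "$old" "$file"; then
        return 1
    fi
    return 0
}
```

A possible use is `patch_still_applied path/to/patched_file MTC1 && make ...`, where the path is a placeholder. Note that the old mnemonic might legitimately occur elsewhere in the same file, so a non-zero result is a prompt to look, not proof that the patch is missing.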
I'm still encountering the error :(
I already ran git clone https://github.com/xianyi/OpenBLAS.git to get the "latest files". Then I ran the make PREFIX=$(pwd)/OpenBLAS/install USE_LOCKING=1 USE_THREAD=0 -C OpenBLAS all install line from extras/install_openblas.sh.
Results:
TEST 32/36 dnrm2:dnrm2_inf [OK]
TEST 33/36 dnrm2:dnrm2_tiny [FAIL]
  ERR: test_dnrm2.c:65 expected 0.000e+00, got inf (diff -inf, tol 1.000e-13)
TEST 34/36 potrf:bug_695 [OK]
TEST 35/36 potrf:smoketest_trivial [OK]
TEST 36/36 kernel_regress:skx_avx [OK]
RESULTS: 36 tests (35 ok, 1 failed, 0 skipped) ran in 2 ms
make[1]: [run_test] Error 1
make: [tests] Error 2
It is now resolved. What I did:
git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make
make PREFIX=install install
In OpenBLAS 0.3.21, on mips64, the Debian package fails to build because there are test failures in the utests for dnrm2_tiny and dnrm2_inf (see https://buildd.debian.org/status/fetch.php?pkg=openblas&arch=mips64el&ver=0.3.21%2Bds-1&stamp=1662336716&raw=0 for the full build log).
I noticed that commit cce4b1d9562d60b875982ce9587c68fbac666ec8 by @XiWeiGu was supposed to fix dnrm2_tiny. Interestingly, if I revert that very commit, then it fixes dnrm2_inf. So it seems that the logic of commit cce4b1d9562d60b875982ce9587c68fbac666ec8 is wrong, and that instead of fixing dnrm2_tiny, it broke dnrm2_inf.