shibatch / sleef

SIMD Library for Evaluating Elementary Functions, vectorized libm and DFT
https://sleef.org
Boost Software License 1.0

New test failures in 3.6 #518

Closed: musicinmybrain closed this issue 4 months ago

musicinmybrain commented 6 months ago

Testing Sleef 3.6 in Fedora Linux (Rawhide), with GCC 14.0.1, I see several regressions from 3.5.1:

x86_64

      1 - iut (Failed)

aarch64

      1 - iut (Failed)
     21 - gnuabi_compatibility_SVE (ILLEGAL)
     22 - gnuabi_compatibility_SVE_masked (ILLEGAL)
     28 - qiutsve (Failed)

ppc64le

The following tests FAILED:
      1 - iut (Failed)

s390x

      1 - iut (Failed)
      3 - iutyvxe (Failed)
      6 - iutyvxenofma (Failed)
      9 - iutyvxe2 (Failed)
     11 - iutvxe2nofma (Failed)
     12 - iutyvxe2nofma (Failed)

Note that I did not have the quad-precision library enabled for 3.5.1, but it is enabled here since it is no longer marked experimental in 3.6.

In addition, all of the failures reported in https://github.com/shibatch/sleef/issues/439 are still present, except that I no longer see iutyzvector2 or iutyzvector2nofma failing on s390x, and some of the test numbers are different.

blapie commented 5 months ago

Hi! Sorry for the late reply. Thanks a lot for testing and reporting the new failures in detail! Could you share your config (cmake options)? That might help.

I will have time to look into #439 now, we will consider the present issue once that is done.

How do you run the test btw? Do you use emulators or actual machines?

musicinmybrain commented 5 months ago

> Hi! Sorry for the late reply. Thanks a lot for testing and reporting the new failures in detail! Maybe it could help if you could share your config (cmake options)?

The %cmake spec-file macro that encodes Fedora’s “boilerplate” options currently expands to

  CFLAGS="${CFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64   -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer }" ; export CFLAGS ; 
  CXXFLAGS="${CXXFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64   -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer }" ; export CXXFLAGS ; 
  FFLAGS="${FFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64   -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules }" ; export FFLAGS ; 
  FCFLAGS="${FCFLAGS:--O2 -flto=auto -ffat-lto-objects -fexceptions -g -grecord-gcc-switches -pipe -Wall -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -m64   -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -I/usr/lib64/gfortran/modules }" ; export FCFLAGS ; 
  VALAFLAGS="${VALAFLAGS:--g}" ; export VALAFLAGS ; 
  RUSTFLAGS="${RUSTFLAGS:--Copt-level=3 -Cdebuginfo=2 -Ccodegen-units=1 -Cstrip=none -Cforce-frame-pointers=yes -Clink-arg=-Wl,-z,relro -Clink-arg=-Wl,-z,now --cap-lints=warn}" ; export RUSTFLAGS ; 
  LDFLAGS="${LDFLAGS:--Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1  -Wl,--build-id=sha1  }" ; export LDFLAGS ; 
  LT_SYS_LIBRARY_PATH="${LT_SYS_LIBRARY_PATH:-/usr/lib64:}" ; export LT_SYS_LIBRARY_PATH ; 
  CC="${CC:-gcc}" ; export CC ; 
  CXX="${CXX:-g++}" ; export CXX 
  /usr/bin/cmake \
        -S "." \
        -B "redhat-linux-build" \
        -DCMAKE_C_FLAGS_RELEASE:STRING="-DNDEBUG" \
        -DCMAKE_CXX_FLAGS_RELEASE:STRING="-DNDEBUG" \
        -DCMAKE_Fortran_FLAGS_RELEASE:STRING="-DNDEBUG" \
        -DCMAKE_VERBOSE_MAKEFILE:BOOL=ON \
        -DCMAKE_INSTALL_DO_STRIP:BOOL=OFF \
        -DCMAKE_INSTALL_PREFIX:PATH=/usr \
        -DINCLUDE_INSTALL_DIR:PATH=/usr/include \
        -DLIB_INSTALL_DIR:PATH=/usr/lib64 \
        -DSYSCONF_INSTALL_DIR:PATH=/etc \
        -DSHARE_INSTALL_PREFIX:PATH=/usr/share \
%if "lib64" == "lib64" 
        -DLIB_SUFFIX=64 \
%endif 
        -DBUILD_SHARED_LIBS:BOOL=ON

To this, I have added:

    -GNinja \
    -DENFORCE_TESTER3:BOOL=TRUE \
    -DBUILD_INLINE_HEADERS:BOOL=%{?inline_enabled:TRUE}%{?!inline_enabled:FALSE} \
    -DBUILD_GNUABI_LIBS:BOOL=%{?gnuabi_enabled:TRUE}%{?!gnuabi_enabled:FALSE} \
    -DBUILD_DFT:BOOL=%{?with_dft:TRUE}%{?!with_dft:FALSE} \
    -DBUILD_QUAD:BOOL=%{?with_quad:TRUE}%{?!with_quad:FALSE}

I have inline headers turned off unless something in Fedora really needs them, since we strongly favor dynamic linking over static linking.

The gnuabi libraries are turned on where they are supported, per https://sleef.org/additional.xhtml#gnuabi, which is x86_64 and aarch64 (and i686 if we were still building for it).
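
A minimal shell sketch of that per-architecture choice, assuming `uname -m` reports the build architecture. The function name `gnuabi_supported` is hypothetical; it is not part of Sleef or of the Fedora spec file, which uses the `gnuabi_enabled` macro shown above instead.

```shell
# Hypothetical helper: print TRUE when the GNU ABI libraries are supported
# on the given architecture (default: the current machine), else FALSE.
# The architecture list follows the sentence above: x86_64, aarch64, i686.
gnuabi_supported() {
    case "${1:-$(uname -m)}" in
        x86_64|aarch64|i686) echo TRUE ;;
        *) echo FALSE ;;
    esac
}
```

The result could then be passed to the configure step as `-DBUILD_GNUABI_LIBS:BOOL="$(gnuabi_supported)"`.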

The DFT libraries are disabled due to various issues, mentioned in https://github.com/shibatch/sleef/issues/214. My notes in the spec file say,

# We do not ship the DFT library since it has undiagnosed test failures on
# Fedora at -O2, and is not well-supported upstream. Additionally, it uses
# illegal instructions on ARM and s390x, at least on the Fedora build machines.
# See https://github.com/shibatch/sleef/issues/214.

The quad-precision libraries are disabled for 3.5.1 because they are documented as experimental, but I intend to build them for 3.6 since that’s no longer the case.

> How do you run the test btw? Do you use emulators or actual machines?

These bug reports are based on “scratch builds” on real hardware. I don’t have interactive shell access on these machines, but I can submit as many “scratch builds” as I need to, so it’s possible to do experiments with real hardware.

I can also do local testing in emulation with qemu-user-static, which is pretty effective and usually (but not always) gives the same results as real hardware. The advantage is that I can drop into a shell and poke around in the emulated chroot.

musicinmybrain commented 5 months ago

It looks like the GCC fix from https://github.com/shibatch/sleef/issues/439#issuecomment-1999261139 also resolves all of these issues. I was able to build sleef 3.6 on all Fedora architectures except i686 using a patched version of GCC 14.0.1.

On i686, the build failed due to an issue with compiler flags:

cc1: error: unrecognized command-line option ‘-ffast-math -msse2 -mfpmath=sse’

I only tested i686 for completeness. I don’t care about it and don’t plan to build an i686 version in Fedora, since i686 is only for multilib on x86_64 and nothing would depend on it.


I’m testing the -ftrapping-math workaround now. It works on x86_64, but I need to verify it on the other architectures, too.

blapie commented 5 months ago

Great news!

> On i686, the build failed due to an issue with compiler flags:

We cannot claim that we support i686 anymore for lack of testing, so I don't think this is an issue (at least until someone requires it).

musicinmybrain commented 5 months ago

It looks like -ftrapping-math is a usable workaround in general.


I do still see these test failures on aarch64:

The following tests FAILED:
     21 - gnuabi_compatibility_SVE (ILLEGAL)
     22 - gnuabi_compatibility_SVE_masked (ILLEGAL)
     28 - qiutsve (Failed)

I think the discrepancy is probably not because the GCC patch fixes these failures and -ftrapping-math doesn’t, but instead because the builder machine for the test in COPR with the GCC patch supported SVE:

CPU info:
Architecture:                       aarch64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
CPU(s):                             4
On-line CPU(s) list:                0-3
Vendor ID:                          ARM
Model name:                         Neoverse-V1
Model:                              1
Thread(s) per core:                 1
Core(s) per socket:                 4
Socket(s):                          1
Stepping:                           r1p1
BogoMIPS:                           2100.00
Flags:                              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs paca pacg dcpodp svei8mm svebf16 i8mm bf16 dgh rng

and the one used for the scratch build to test -ftrapping-math didn’t:

CPU info:
Architecture:                       aarch64
CPU op-mode(s):                     32-bit, 64-bit
Byte Order:                         Little Endian
CPU(s):                             12
On-line CPU(s) list:                0-11
Vendor ID:                          ARM
Model name:                         Neoverse-N1
Model:                              1
Thread(s) per core:                 1
Core(s) per socket:                 1
Socket(s):                          12
Stepping:                           r3p1
BogoMIPS:                           50.00
Flags:                              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs

It makes sense that I cannot run the SVE tests on a machine that doesn’t have SVE support. I think that everything will still work on non-SVE machines as long as nothing uses the API functions that explicitly have sve in their names, correct? In that case, I just need to skip these tests, either unconditionally or based on something like checking /proc/cpuinfo for the builder.
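
One way the /proc/cpuinfo check could look, as a rough shell sketch rather than a tested recipe. The helper names `cpu_has_feature` and `sve_test_exclusions` are hypothetical; the test names are the three listed above. It assumes the kernel reports CPU features on a line starting with `Features` (aarch64) or `flags` (x86_64).

```shell
# Hypothetical helper: succeed (exit 0) if FEATURE appears as a whole word
# on the feature line of a cpuinfo-style file.
# Usage: cpu_has_feature FEATURE [FILE]   (FILE defaults to /proc/cpuinfo)
cpu_has_feature() {
    feature=$1
    file=${2:-/proc/cpuinfo}
    # Pick the "Features"/"flags" line(s), split them into one word per line,
    # then require an exact match so that "svei8mm" does not count as "sve".
    # A missing or unreadable file is treated as "feature not present".
    grep -iE '^(features|flags)[[:space:]]*:' "$file" 2>/dev/null |
        tr -s ' \t' '\n\n' | grep -qx -- "$feature"
}

# Hypothetical helper: print a regex matching the SVE-only tests when the
# machine lacks SVE; print nothing when SVE is available.
sve_test_exclusions() {
    if ! cpu_has_feature sve "${1:-/proc/cpuinfo}"; then
        printf '%s\n' '^(gnuabi_compatibility_SVE|gnuabi_compatibility_SVE_masked|qiutsve)$'
    fi
}
```

In a test step this might be used as `excl=$(sve_test_exclusions)`, then running `ctest -E "$excl"` only when `$excl` is non-empty and plain `ctest` otherwise; `-E` is ctest's exclude-by-regex option.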

musicinmybrain commented 5 months ago

> It looks like -ftrapping-math is a usable workaround in general.

I was mistaken about this; see https://github.com/shibatch/sleef/issues/439#issuecomment-2004515522. Instead, Fedora’s gcc package now contains the fix.

musicinmybrain commented 4 months ago

I guess I’m going to close this, since the only remaining issue is the SVE tests being run unconditionally on aarch64 as described in https://github.com/shibatch/sleef/issues/518#issuecomment-2004420780.