NGSolve / netgen

https://ngsolve.org
GNU Lesser General Public License v2.1

netgen-6.2.2203 fails to build on arches which fall back to simd_generic.hpp (i.e. ppc64le, s390x) #124

Open manisandro opened 2 years ago

manisandro commented 2 years ago

I'm attempting to build netgen-6.2.2203 for Fedora [1]. I'm currently hitting a build failure on arches which fall back to simd_generic.hpp, specifically:

In file included from /builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd.hpp:12,
                from /builddir/build/BUILD/netgen-6.2.2203/libsrc/core/ngcore.hpp:16:
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp: In instantiation of 'ngcore::SIMD<double, N>::SIMD(const T ...) [with T = {ngcore::SIMD<double, 1>, ngcore::SIMD<double, 1>}; int N = 4]':
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd.hpp:56:42:   required from here
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:319:56: error: cannot convert 'const ngcore::SIMD<double, 1>' to 'double' in initialization
319 |     : lo(detail::array_range<N1>(std::array<double, N>{vals...}, 0)),
    |                                                        ^~~~
    |                                                        |
    |                                                        const ngcore::SIMD<double, 1>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:319:56: error: cannot convert 'const ngcore::SIMD<double, 1>' to 'double' in initialization
319 |     : lo(detail::array_range<N1>(std::array<double, N>{vals...}, 0)),
    |                                                        ^~~~
    |                                                        |
    |                                                        const ngcore::SIMD<double, 1>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:320:58: error: cannot convert 'const ngcore::SIMD<double, 1>' to 'double' in initialization
320 |       high(detail::array_range<N2>(std::array<double, N>{vals...}, N1))
    |                                                          ^~~~
    |                                                          |
    |                                                          const ngcore::SIMD<double, 1>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:320:58: error: cannot convert 'const ngcore::SIMD<double, 1>' to 'double' in initialization
320 |       high(detail::array_range<N2>(std::array<double, N>{vals...}, N1))
    |                                                          ^~~~
    |                                                          |
    |                                                          const ngcore::SIMD<double, 1>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:322:38: error: static assertion failed: wrong number of arguments
322 |         static_assert(sizeof...(vals)==N, "wrong number of arguments");
    |                       ~~~~~~~~~~~~~~~^~~
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:322:38: note: the comparison reduces to '(2 == 4)'
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp: In instantiation of 'ngcore::SIMD<double, N>::SIMD(const T ...) [with T = {ngcore::SIMD<double, 3>, ngcore::SIMD<double, 3>}; int N = 4]':
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd.hpp:58:42:   required from here
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:319:56: error: cannot convert 'const ngcore::SIMD<double, 3>' to 'double' in initialization
319 |     : lo(detail::array_range<N1>(std::array<double, N>{vals...}, 0)),
    |                                                        ^~~~
    |                                                        |
    |                                                        const ngcore::SIMD<double, 3>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:319:56: error: cannot convert 'const ngcore::SIMD<double, 3>' to 'double' in initialization
319 |     : lo(detail::array_range<N1>(std::array<double, N>{vals...}, 0)),
    |                                                        ^~~~
    |                                                        |
    |                                                        const ngcore::SIMD<double, 3>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:320:58: error: cannot convert 'const ngcore::SIMD<double, 3>' to 'double' in initialization
320 |       high(detail::array_range<N2>(std::array<double, N>{vals...}, N1))
    |                                                          ^~~~
    |                                                          |
    |                                                          const ngcore::SIMD<double, 3>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:320:58: error: cannot convert 'const ngcore::SIMD<double, 3>' to 'double' in initialization
320 |       high(detail::array_range<N2>(std::array<double, N>{vals...}, N1))
    |                                                          ^~~~
    |                                                          |
    |                                                          const ngcore::SIMD<double, 3>
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:322:38: error: static assertion failed: wrong number of arguments
322 |         static_assert(sizeof...(vals)==N, "wrong number of arguments");
    |                       ~~~~~~~~~~~~~~~^~~
/builddir/build/BUILD/netgen-6.2.2203/libsrc/core/simd_generic.hpp:322:38: note: the comparison reduces to '(2 == 4)'

See [1] for full build logs on failing arches.

[1] https://koji.fedoraproject.org/koji/taskinfo?taskID=87307345

StefanBruens commented 2 years ago

I think the generic part of the SIMD implementation is quite broken, but is "accidentally" correct for architectures with a SIMD width of at least 128 bit (e.g. SSE) and vector types up to 256 bit (4 x double).

The SIMD implementation models wide types (width N) by splitting these into a "Lo" part with native width (N1), and a "Hi" part with the remainder (width N2), and does it iteratively until the last "Hi" element also fits into the native type. I.e. N == N1 + N2.
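To make the Lo/Hi split concrete, here is a minimal sketch of the recursion. It is not netgen's code: `Vec`, `Native`, `pieces`, `get` and `set` are hypothetical names, and a plain `std::array` stands in for a native register. It only shows that N == N1 + N2 holds at every level, and that a native width of 1 turns N == 4 into a 1 + 3 split, which is the pair of widths that shows up in the error log above.

```cpp
#include <array>

// Hypothetical stand-in for SIMD<double, N>. Native is the assumed native
// width in doubles (2 would correspond to a 128-bit register).
template <int N, int Native, bool Fits = (N <= Native)>
struct Vec;

// Base case: the vector fits into one native register.
template <int N, int Native>
struct Vec<N, Native, true> {
    static constexpr int pieces = 1;  // number of native-sized parts
    std::array<double, N> data{};
    double get(int i) const { return data[i]; }
    void set(int i, double v) { data[i] = v; }
};

// Wide case: a "Lo" part of native width N1 plus a "Hi" part with the
// remaining N2 elements, which is split again if it is still too wide.
template <int N, int Native>
struct Vec<N, Native, false> {
    static constexpr int N1 = Native;
    static constexpr int N2 = N - N1;
    static constexpr int pieces = 1 + Vec<N2, Native>::pieces;
    Vec<N1, Native> lo;
    Vec<N2, Native> hi;
    double get(int i) const { return i < N1 ? lo.get(i) : hi.get(i - N1); }
    void set(int i, double v) {
        if (i < N1) lo.set(i, v); else hi.set(i - N1, v);
    }
};

// Native width 2: N == 4 splits evenly into 2 + 2.
static_assert(Vec<4, 2>::N1 == 2 && Vec<4, 2>::N2 == 2, "2 + 2 split");
// Native width 1: N == 4 splits into 1 + 3.
static_assert(Vec<4, 1>::N1 == 1 && Vec<4, 1>::N2 == 3, "1 + 3 split");
```

Any code that assumes Lo and Hi have the same width only holds for the first of these two cases.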

Unfortunately, the Unpack implementation gets this wrong for the generic case:
https://github.com/NGSolve/netgen/blob/bdc738f87e8e4191de4552f4931ae31a6f526f41/libsrc/core/simd_generic.hpp#L703-L723

In the last branch (≃ N > 2), Unpack should return a tuple<SIMD<double, 2*N1>, SIMD<double, 2*N1>, ...> with N / N1 elements. For the (N == 4, N1 == N2 == 2) case, this matches the current implementation by chance.
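The "by chance" part can be checked at the type level. The sketch below is only an illustration: `Width`, `Repeat`, `repeated`, `DescribedShape` and `PairShape` are hypothetical names, and `PairShape` encodes the assumption that the current generic branch returns two vectors of width N. It compares that with the shape described above (N / N1 elements of width 2*N1) and says nothing about which values end up in which element.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for SIMD<double, N>; only the width matters here.
template <int N>
struct Width {};

// Helper that maps every index to the same type Width<W>.
template <int W, std::size_t>
struct Repeat { using type = Width<W>; };

// Declared only, used inside decltype: a tuple with one Width<W> per index.
template <int W, std::size_t... I>
std::tuple<typename Repeat<W, I>::type...> repeated(std::index_sequence<I...>);

// Shape described above: N / N1 elements, each of width 2 * N1.
template <int N, int N1>
using DescribedShape =
    decltype(repeated<2 * N1>(std::make_index_sequence<N / N1>{}));

// Assumed shape of the current generic branch: two vectors of width N.
template <int N>
using PairShape = std::tuple<Width<N>, Width<N>>;

// N == 4, N1 == 2: both shapes are tuple<Width<4>, Width<4>>.
static_assert(std::is_same<DescribedShape<4, 2>, PairShape<4>>::value,
              "shapes coincide for N == 4, N1 == 2");
// N == 4, N1 == 1: four elements of width 2, not a pair of width 4.
static_assert(!std::is_same<DescribedShape<4, 1>, PairShape<4>>::value,
              "shapes differ for N == 4, N1 == 1");
static_assert(std::tuple_size<DescribedShape<4, 1>>::value == 4,
              "N / N1 == 4 elements");
```

Both shapes hold 2*N doubles in total, so the difference is in how the result is grouped, not in how much data comes back.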

drew-parsons commented 1 year ago

The xsimd project has put some work into making their header library compatible with unsupported (generic) architectures. Would it be useful to refactor netgen's SIMD to outsource it to xsimd? (they still have some work to do in https://github.com/xtensor-stack/xsimd/issues/954)

https://github.com/xtensor-stack/xsimd