lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Summit ppc64 build fails on develop branch with gcc 10.2 #1254

Open james-simone opened 2 years ago

james-simone commented 2 years ago

The ppc64 gcc 10.2 compiler is not happy with Eigen v3.4.0. I expect this has been an issue since commit b640c1f44b8 "bump Eigen to 3.4.0".

  [ 31%] Building CXX object lib/CMakeFiles/quda_cpp.dir/eig_iram.cpp.o
  In file included from /ccs/home/simonej/ks_rhmc/build_quda/_deps/eigen-src/Eigen/Core:19,
                   from /ccs/home/simonej/ks_rhmc/build_quda/_deps/eigen-src/Eigen/Eigenvalues:11,
                   from /ccs/home/simonej/ks_rhmc/quda_git/lib/../include/eigen_helper.h:12,
                   from /ccs/home/simonej/ks_rhmc/quda_git/lib/eig_iram.cpp:15:
  /ccs/home/simonej/ks_rhmc/build_quda/_deps/eigen-src/Eigen/src/Core/arch/AltiVec/MatrixProductMMA.h: In instantiation of 'void Eigen::internal::gemm_complex_unrolled_MMA_iteration(const DataMapper&, const Scalar, const Scalar, Index, Index, Index, Index, Index&, Index, const Packet&, const Packet&) [with int unroll_factor = 4; Scalar = double; Packet = vector(2) double; Packetc = Eigen::internal::Packet1cd; RhsPacket = Eigen::internal::PacketBlock<vector(2) double, 2>; DataMapper = Eigen::internal::blas_data_mapper<std::complex, long int, 0, 0, 1>; Index = long int; Index accRows = 4; Index accCols = 2; bool ConjugateLhs = false; bool ConjugateRhs = false; bool LhsIsReal = false; bool RhsIsReal = false]'
  ......
  /ccs/home/simonej/ks_rhmc/build_quda/_deps/eigen-src/Eigen/src/Core/arch/AltiVec/MatrixProductMMA.h:533:5: error: invalid conversion from type '* __vector_pair'
    533 | MICRO_COMPLEX_MMA_ONE

weinbe2 commented 2 years ago

Not strictly an answer, but do you have a specific need for GCC 10.*? I believe I've seen that flavor of gcc have issues for "no good reason" before, and have empirically found 7.* and 9.* to be reliably safe.

weinbe2 commented 2 years ago

To be clear, I'm not saying that the specific issue above is an error for "no good reason"; I mean that, in my (personal) experience, gcc 10 has a track record of being particularly difficult.

james-simone commented 2 years ago

OK, this report may just be informational. I happened upon this issue since gcc 10.2 is one of the later compilers on our ppc64 worker. I was able to reproduce the issue with the gcc 10.2 compiler on Summit.

weinbe2 commented 2 years ago

Digging into the Eigen repo, it looks like there were some known issues related to PowerPC (the top of that header queries the CPU arch).

Fixes were merged only ~3 weeks ago (even though Eigen 3.4.0 itself is ~6 months old): https://gitlab.com/libeigen/eigen/-/merge_requests/847

This diff from the commit looks particularly relevant:

- __vector_pair* a0 = (__vector_pair *)(&a.packet[0]);
+ __vector_pair* a0 = reinterpret_cast<__vector_pair *>(const_cast<Packet2d *>(&a.packet[0]));
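For readers unfamiliar with the cast pattern in that diff, here is a minimal sketch of what the new line spells out. It does not reproduce the compiler error: `__vector_pair` is a GCC built-in type available only on PowerPC targets, so the `Packet2dStandIn`, `VectorPairStandIn`, and `PacketBlockStandIn` structs and both function names below are hypothetical stand-ins, not Eigen code. The sketch only shows that the old C-style cast removes `const` and reinterprets the pointer in one step, while the replacement names the two steps separately.

```cpp
#include <cassert>

// Hypothetical stand-ins (assumptions for illustration only): the real types
// are Eigen's Packet2d and GCC's PowerPC built-in __vector_pair.
struct Packet2dStandIn { double v[2]; };
struct VectorPairStandIn { double v[4]; };
struct PacketBlockStandIn { Packet2dStandIn packet[2]; };

// Same shape as the patched line: const_cast removes const first,
// then reinterpret_cast changes the pointee type.
inline VectorPairStandIn* as_vector_pair(const PacketBlockStandIn& a) {
  return reinterpret_cast<VectorPairStandIn*>(
      const_cast<Packet2dStandIn*>(&a.packet[0]));
}

// Returns true when the old C-style cast and the explicit two-step cast
// produce the same address. Neither pointer is dereferenced here.
inline bool casts_agree() {
  PacketBlockStandIn block{};
  const PacketBlockStandIn& a = block;
  VectorPairStandIn* old_style = (VectorPairStandIn*)(&a.packet[0]);
  VectorPairStandIn* new_style = as_vector_pair(a);
  return old_style == new_style;
}

// Convenience check for use in a quick test program.
inline void check_casts_agree() { assert(casts_agree()); }
```

On ordinary types both spellings compile and give the same pointer, so the sketch says nothing about why gcc 10.2 rejects the original line for `__vector_pair`; for that, see the Eigen merge request linked above.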

Since the issue isn't happening with other versions of gcc on PowerPC, I recommend just using gcc 7 or 9 for now, and as soon as a new version of Eigen is released we'll update it. (If a specific need for gcc 10 does arise, though, let us know.)

weinbe2 commented 2 years ago

Re-opening it for visibility, will close once a new Eigen release has been merged in.

mathiaswagner commented 2 years ago

I would prefer to stick to released Eigen versions but we could add an option for using any git commit if there is a strong need. For now it seems we can wait on the next release.

bjoo commented 2 years ago

I ran afoul of this today. I got around it by downloading eigen-3.3.9, unzipping it manually, and then setting

  -DQUDA_DOWNLOAD_EIGEN=OFF \
  -DEIGEN_INCLUDE_DIR=${SRCROOT}/eigen-3.3.9 \

on the CMake command line. I am not sure why 3.4.0 is causing the problem. I am working with GCC 10 because of the desire/need to use Concepts in QDP-JIT.

mathiaswagner commented 2 years ago

Is this related to #1256 ?

You should also be able to use something like -DQUDA_EIGEN_VERSION=3.3.9 instead of the more tedious download / unzip.
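Whichever of the two workarounds is used, it can help to confirm which Eigen headers the compiler actually picks up. The sketch below is one way to do that, assuming a C++17 compiler (for `__has_include`) and that the Eigen include directory is on the include path. `eigen_version_string` is a hypothetical helper name, not part of QUDA or Eigen; the `EIGEN_WORLD_VERSION`, `EIGEN_MAJOR_VERSION`, and `EIGEN_MINOR_VERSION` macros are the ones Eigen 3.x defines.

```cpp
#include <string>

// Pull in Eigen only if it is on the include path, so the file still
// compiles on machines without Eigen installed.
#if __has_include(<Eigen/Core>)
#include <Eigen/Core>
#endif

// Hypothetical helper: returns e.g. "3.3.9" when the Eigen 3.x version
// macros are visible, or "unavailable" when Eigen was not found.
inline std::string eigen_version_string() {
#if defined(EIGEN_WORLD_VERSION) && defined(EIGEN_MAJOR_VERSION) && \
    defined(EIGEN_MINOR_VERSION)
  return std::to_string(EIGEN_WORLD_VERSION) + "." +
         std::to_string(EIGEN_MAJOR_VERSION) + "." +
         std::to_string(EIGEN_MINOR_VERSION);
#else
  return "unavailable";
#endif
}
```

Compiling a small program that prints this string with the same include path QUDA uses should report 3.3.9 once the workaround has taken effect, and 3.4.0 if the downloaded copy is still being used.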