Closed PhilMiller closed 9 years ago
Original date: 2014-09-19 18:11:31
Anirudh asks via email:
How do I reproduce the build error in https://charm.cs.uiuc.edu/redmine/issues/549 ?
By MPI build are we referring to MPI code compiled with AMPI?
First, a quick note on our bug tracker usage: Please keep follow-up inquiries in Redmine, so that the problem-solving process can be as transparent as possible for other members of the lab, users reporting trouble, and anyone else interested in our development efforts. If something would involve private information, the 'Private notes' box can be checked.
The error in question can be reproduced by attempting to build the Charm++ RTS itself with the MPI network layer on an MPI implementation that exposes the (now deprecated) MPI C++ bindings. This is most often the case on MPICH and derivatives, such as Intel MPI. An example build command would look like
./build charm++ mpi-linux-x86_64
I believe we keep OpenMPI installed on the lab workstations. This would perhaps be easiest to reproduce and work out on a system with Intel's MPI installed, perhaps Taub/Golub or Stampede. You'll need to configure your environment with the right modules, according to their documentation (ask for help with this if needed).
The desired outcome of this issue is that we have an added check in our configure
script that, when building for an MPI network layer, tests whether the MPI implementation exhibits this issue, and adds the flags
-DMPICH_IGNORE_CXX_SEEK -DMPICH_SKIP_MPICXX
accordingly.
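A configure-time probe along these lines could work (a sketch only: the variable names $MPICXX and $OPTS, the temp-file names, and how the flags get recorded are assumptions about the configure script's conventions, not the actual implementation):

```shell
# Sketch of a configure probe: try compiling a trivial MPI program as
# C++. If it fails bare but succeeds with the MPICH workaround macros,
# record the flags for the build.
cat > conftest_mpicxx.cpp <<'EOF'
#include <stdio.h>
#include <mpi.h>
int main() { return 0; }
EOF
if ! $MPICXX -c conftest_mpicxx.cpp -o conftest_mpicxx.o 2>/dev/null; then
  if $MPICXX -DMPICH_IGNORE_CXX_SEEK -DMPICH_SKIP_MPICXX \
       -c conftest_mpicxx.cpp -o conftest_mpicxx.o 2>/dev/null; then
    OPTS="$OPTS -DMPICH_IGNORE_CXX_SEEK -DMPICH_SKIP_MPICXX"
  fi
fi
rm -f conftest_mpicxx.cpp conftest_mpicxx.o
```

On an implementation without the problem the first compile succeeds and no flags are added, which keeps unaffected builds untouched.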
Original author: Anonymous Original date: 2014-10-03 15:57:55
Unable to reproduce the issue. Taub has two MPI versions (MVAPICH2 and OpenMPI). I tried it on Hopper; the compilation went through (PrgEnv-intel and PrgEnv-gnu). I'm not sure which MPI version Hopper has, and I can't find any information in the manuals.
I don't have an account on Stampede. I was thinking of installing MPICH locally on my workstation to see if the error appears.
Original author: Anonymous Original date: 2014-10-03 19:53:05
Still unable to reproduce the issue.
Built MPICH using the Intel compilers on lab workstation finesse:
./configure --prefix=/dcsdata/home/anirudh/MPI/mpich/mpich-3.1.2/install CC=icc CXX=icpc --disable-fortran 2>&1 | tee c.txt
make 2>&1 | tee m.txt
make install 2>&1 | tee mi.txt
Added bin to PATH and lib to LD_LIBRARY_PATH, then built Charm++ with the commands below. Both completed successfully:
MPICXX=/dcsdata/home/anirudh/MPI/mpich/mpich-3.1.2/install/bin/mpicxx CXX=icpc ./build charm++ mpi-linux-x86_64 -j8
MPICXX=/dcsdata/home/anirudh/MPI/mpich/mpich-3.1.2/install/bin/mpicxx CXX=icpc ./build charm++ mpi-linux-x86_64 --basedir=/dcsdata/home/anirudh/MPI/mpich/mpich-3.1.2/install -j8
Original author: Anonymous Original date: 2014-10-10 18:05:41
Reproduced the problem on stampede with Nikhil's help.
The issue is fixed by defining MPICH_IGNORE_CXX_SEEK (MPICH_SKIP_MPICXX was not needed) in conv-common.h.
Should we change the configure script, or can we just leave it at this?
Original date: 2014-10-20 22:24:48
I would lean toward changing the configure script, because that avoids the risk of negatively impacting builds that wouldn't have encountered this issue.
I think the test code would be something like this, compiled as C++:
#include <stdio.h>
#include <mpi.h>

// Compiles cleanly only if the MPI headers are usable from C++ as-is;
// a failure indicates the workaround flags are needed.
int main(int argc, char** argv) {
return 0;
}
Original date: 2014-11-11 22:12:02
Please push this fix for review in Gerrit, i.e.:
git push origin HEAD:refs/for/charm
Original issue: https://charm.cs.illinois.edu/redmine/issues/549