
Cannot build Mana on container CentOS 7.9 #290

Open antoinetran opened 1 year ago

antoinetran commented 1 year ago

Expected result: Building Mana on either main or latest tag should work.

Current result: The build fails with this error:

f951: Warning: command-line option '-std=gnu11' is valid for C/ObjC but not for Fortran
In file included from mpi_type_wrappers.cpp:29:
mpi_nextfunc.h:99:19: error: conflicting declaration of C function 'int MPI_Type_struct(int, int*, MPI_Aint*, MPI_Datatype*, MPI_Datatype*)'
   99 |   EXTERNC rettype MPI_##name(APPLY(PAIR, args))
      |                   ^~~~
mpi_type_wrappers.cpp:186:1: note: in expansion of macro 'USER_DEFINED_WRAPPER'
  186 | USER_DEFINED_WRAPPER(int, Type_struct, (int) count,
      | ^~~~~~~~~~~~~~~~~~~~
In file included from ../mpi_plugin.h:25,
                 from mpi_type_wrappers.cpp:22:
/usr/include/mpich-3.2-x86_64/mpi.h:986:5: note: previous declaration 'int MPI_Type_struct(int, const int*, const MPI_Aint*, const MPI_Datatype*, MPI_Datatype*)'
  986 | int MPI_Type_struct(int count, const int *array_of_blocklengths,
      |     ^~~~~~~~~~~~~~~
In file included from mpi_type_wrappers.cpp:29:

Steps to reproduce: Build this docker container:

FROM centos:7.9.2009 as builder
RUN yum update -y
RUN yum install centos-release-scl epel-release -y \
  && rpms="bzip2 cmake3 devtoolset-11-toolchain git python3-pip wget" \
  && yum install ${rpms} -y \
  && for rpm in ${rpms} ; do yum install "${rpm}" -y ; done

RUN ln -s /usr/bin/cmake3 /usr/bin/cmake \
  && sclCreateProxy() { \
  cmd="$1" \
  sclName="$2" \
  && echo '#!/bin/sh' >/usr/bin/"${cmd}" \
  && echo exec scl enable "${sclName}" -- "${cmd}" \"\$@\" >>/usr/bin/"${cmd}" \
  && chmod 775 /usr/bin/"${cmd}" \
  ; } \
  && sclCreateProxy make devtoolset-11 \
  && sclCreateProxy gcc devtoolset-11 \
  && sclCreateProxy g++ devtoolset-11

####
# MPICH
####

RUN rpms="mpich-3.2 mpich-3.2-devel libxml2-static zlib-static" \
  && yum install ${rpms} -y \
  && for rpm in ${rpms} ; do yum install "${rpm}" -y ; done && yum clean all

ENV PATH="${PATH}:/usr/lib64/mpich-3.2/bin"

RUN set -x && mkdir /tmp/mana && git clone https://github.com/mpickpt/mana.git -b nersc-release-phase-2-v2 \
  && cd ./mana && export MANA_ROOT=$PWD \
  && git submodule update --init && ./configure && make -j mana
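
The `sclCreateProxy` step in the Dockerfile above writes small wrapper scripts into /usr/bin. To inspect what such a wrapper looks like without touching /usr/bin, here is a minimal sketch that writes it into a directory of your choice. The function name `make_scl_proxy` and its directory argument are hypothetical and not part of the original Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper (not from the Dockerfile): same idea as sclCreateProxy,
# but the target directory is a parameter so nothing under /usr/bin is changed.
make_scl_proxy() {
  dir=$1 cmd=$2 scl_name=$3
  {
    echo '#!/bin/sh'
    # The generated script re-runs the command inside the software collection.
    echo "exec scl enable ${scl_name} -- ${cmd} \"\$@\""
  } >"${dir}/${cmd}"
  chmod 755 "${dir}/${cmd}"
}

# Generate a wrapper for gcc in a scratch directory and print it.
proxy_dir=$(mktemp -d)
make_scl_proxy "${proxy_dir}" gcc devtoolset-11
cat "${proxy_dir}/gcc"
```

The generated file only forwards its arguments, so it can be reviewed before being copied anywhere on the PATH.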

antoinetran commented 1 year ago

OK, after following https://github.com/mpickpt/mana/blob/main/doc/mana-centos-tutorial.txt, I understand that the issue is the MPICH version: Mana needs MPICH >= 3.3.2, not the mpich-3.2 package provided by CentOS 7. I still have an issue, but I am working on it:


cp -f lh_proxy lh_proxy_da gethostbyname-static/gethostbyname_static.o /tmp/mana/mana/bin/
cp: cannot stat 'lh_proxy_da': No such file or directory
make[3]: *** [Makefile:97: install] Error 1
make[2]: *** [Makefile:106: /tmp/mana/mana/bin/lh_proxy] Error 2
make[3]: Leaving directory '/tmp/mana/mana/mpi-proxy-split/lower-half'
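
The MPICH version requirement mentioned in this comment (>= 3.3.2) can be checked before starting a long build. This is a minimal sketch, assuming GNU `sort -V` is available and that `mpichversion` is on the PATH and prints a line starting with `MPICH Version:`. The function name `version_ge` is hypothetical.

```shell
#!/bin/sh
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="3.3.2"

# Assumption: mpichversion prints "MPICH Version: <version>" as one of its lines.
if command -v mpichversion >/dev/null 2>&1; then
  found=$(mpichversion | awk '/^MPICH Version:/ {print $3}')
  if version_ge "${found}" "${required}"; then
    echo "MPICH ${found} is recent enough"
  else
    echo "MPICH ${found} is older than ${required}" >&2
  fi
else
  echo "mpichversion not found on PATH; skipping check"
fi
```

With the CentOS 7 package, `version_ge 3.2 3.3.2` fails, which matches the conflicting `MPI_Type_struct` declarations in the first error.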

jiamingz9925 commented 1 year ago

Hi, did you manage to make it run on CentOS 7? I followed the tutorial and hit an 'lmpi command not found' error on both nersc-release-phase-3 and the most up-to-date head version.

antoinetran commented 1 year ago


Hi, I gave up a while ago on trying to make Mana work on a cluster that is not NERSC, but I still hope that one day a Mana developer will read this issue and make Mana easier to build and run on other clusters!

I think I managed to build it, but not to run it with my OpenMPI application. Here is how, below. If you manage to build and run it on CentOS 7, can you answer here? I will be glad to hear it!

Dockerfile



####
# Build environment
####

FROM centos:7.9.2009 as builder
RUN yum update -y

# The loop runs yum install a second time, once per package, so that any package missing from the remote repositories is reported as an error.
# We need the latest cmake3 instead of cmake, and the latest gcc*, hence devtoolset.
# See https://access.redhat.com/documentation/en-us/red_hat_developer_toolset/11/html/user_guide/chap-red_hat_developer_toolset#sect-Red_Hat_Developer_Toolset-Install
RUN yum install centos-release-scl epel-release -y \
  && rpms="bzip2 cmake3 devtoolset-11-toolchain git python3-pip wget" \
  && yum install ${rpms} -y \
  && for rpm in ${rpms} ; do yum install "${rpm}" -y ; done

RUN ln -s /usr/bin/cmake3 /usr/bin/cmake \
  && sclCreateProxy() { \
  cmd="$1" \
  sclName="$2" \
  && echo '#!/bin/sh' >/usr/bin/"${cmd}" \
  && echo exec scl enable "${sclName}" -- "${cmd}" \"\$@\" >>/usr/bin/"${cmd}" \
  && chmod 775 /usr/bin/"${cmd}" \
  ; } \
  && sclCreateProxy make devtoolset-11 \
  && sclCreateProxy gcc devtoolset-11 \
  && sclCreateProxy g++ devtoolset-11 \
  && sclCreateProxy gfortran devtoolset-11

ARG MPI_DIR
ENV MPI_DIR=${MPI_DIR:-/usr/local/lib/mpi}
ARG MANA_DIR
ENV MANA_DIR=${MANA_DIR:-/usr/local/lib/mana}
# See https://cmake.org/cmake/help/latest/module/FindGSL.html
ARG GSL_ROOT_DIR
ENV GSL_ROOT_DIR=${GSL_ROOT_DIR:-/usr/local/lib/gsl}

####
# MPICH
####

RUN rpms="libxml2-static libxml2-devel zlib-static xz-devel patch yum-utils" \
  && yum install ${rpms} -y \
  && for rpm in ${rpms} ; do yum install "${rpm}" -y ; done

RUN set -x && mkdir /tmp/xz-static && cd /tmp/xz-static \
  && wget http://vault.centos.org/7.9.2009/os/Source/SPackages/xz-5.2.2-1.el7.src.rpm \
  && mkdir liblzma_TEMP && cd liblzma_TEMP \
  && yumdownloader --source xz-devel \
  && rpm2cpio xz-*.src.rpm | cpio -idv \
  && find \
  && tar xf xz-*.tar.gz \
  && cd xz-5.2.2 \
  && patch -Np1 < ../xz-5.2.2-compat-libs.patch \
  && patch -Np1 < ../xz-5.2.2-man-page-day.patch \
  && ./configure --enable-static \
  && make -j install \
  && cp ./src/liblzma/.libs/liblzma.a /usr/lib64/

# To fix this error:
# configure: error: The Fortran compiler gfortran will not compile files that call the same routine with arguments of different types.
# see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91731
RUN mkdir /tmp/mpich && cd /tmp/mpich && wget https://www.mpich.org/static/downloads/3.3.2/mpich-3.3.2.tar.gz --no-check-certificate \
  && tar xf mpich-3.3.2.tar.gz \
  && cd mpich-3.3.2 \
  && export FFLAGS="-w -fallow-argument-mismatch -O2" \
  && ./configure --prefix=$MPI_DIR \
  && make -j install

ENV PATH="${PATH}:${MPI_DIR}/bin"

####
# MANA
####

RUN yum install glibc-static mlocate iproute vim -y

# Dirty workaround: Mana needs gzip >= 1.6 for "gzip --keep", but CentOS 7 ships gzip < 1.6.
RUN mv /usr/bin/gzip /usr/bin/gzip.orig \
  && writeGzip() { echo "$@" >>/usr/bin/gzip ; } \
  && writeGzip '#!/bin/sh' \
  && writeGzip 'n=$# ; i=0' \
  && writeGzip 'while test "$i" -lt "$n" ; do i=$((i+1))' \
  && writeGzip '  if test "$1" != "--keep" ; then' \
  && writeGzip '    set -- "$@" "$1"' \
  && writeGzip '    shift' \
  && writeGzip '  else' \
  && writeGzip '    set -- "$@"' \
  && writeGzip '    shift' \
  && writeGzip '  fi' \
  && writeGzip 'done' \
  && writeGzip 'exec gzip.orig "$@"' \
  && chmod 775 /usr/bin/gzip

RUN set -x && echo PATH = $PATH && type mpicc  && mpicc -v \
  && mkdir /tmp/mana && cd /tmp/mana && git clone https://github.com/mpickpt/mana.git -b main \
  && cd ./mana && export MANA_ROOT=$PWD \
  && git submodule update --init \
  && sed ./configure-mana -i -e 's,MANA_USE_LH_FIXED_ADDRESS=1$,MANA_USE_LH_FIXED_ADDRESS=1 "$@",g' \
  && sed mpi-proxy-split/lower-half/Makefile -i -e 's,2>&1,,g' -e 's,grep -q,grep,g' -e 's,if ${MPICC},set -x \&\& if ${MPICC},g' \
  && ./configure-mana --prefix="${MANA_DIR}" \
  && make mana \
  && make install
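
The gzip wrapper in the Dockerfile above only needs to drop `--keep` and pass every other argument through unchanged. Here is a minimal standalone sketch of that filtering that can be tried outside the container. The function name `strip_keep` is hypothetical, and it prints the surviving arguments instead of calling gzip. It records the argument count before the loop so that each argument is visited exactly once and the original order is preserved.

```shell
#!/bin/sh
# Hypothetical helper: print all arguments except "--keep", one per line.
strip_keep() {
  n=$#   # number of arguments to visit, fixed before the list is modified
  i=0
  while [ "$i" -lt "$n" ]; do
    i=$((i + 1))
    arg=$1
    shift
    # Re-append everything except --keep; after n steps the order is restored.
    if [ "$arg" != "--keep" ]; then
      set -- "$@" "$arg"
    fi
  done
  if [ "$#" -gt 0 ]; then
    printf '%s\n' "$@"
  fi
}

# Prints "-c" and "file.txt" on separate lines.
strip_keep -c --keep file.txt
```

In a real wrapper the final `printf` would be replaced by `exec gzip.orig "$@"`, as in the Dockerfile.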