UCL-RITS / rcps-buildscripts

Scripts to automate package builds on RC Platforms
MIT License

Install Request: Open TELEMAC #312

Closed: heatherkellyucl closed this issue 2 years ago

heatherkellyucl commented 5 years ago

IN:03923979

Central install with all dependencies. http://wiki.opentelemac.org/doku.php?id=installation_on_linux

I have tested a more minimal version and can run the included gouttedo/t2d_gouttedo.cas example.

Config files for our clusters are at https://github.com/UCL-RITS/rcps-buildscripts/tree/master/files/telemac. mpi-submitter works and is currently the requested build type; mpi fails, apparently because it tries to create the same psm-shm directory twice during the job. The mpi-submitter version does the initial partitioning on the login node and then submits a job, while the mpi version does both steps on the compute node, which doesn't work.

A job is run with a command like

telemac2d.py --ncsize=32 --walltime=0:15:0 ./gouttedo/t2d_gouttedo.cas

Currently the memory and TMPDIR size are set in the auto-generated jobscript to 2G and 10G respectively; there isn't an option in telemac2d.py to set them on the command line.

That command will submit a script that looks like this:

#!/bin/bash -l
#$ -l h_rt=0:15:0
#$ -l mem=2G
#$ -l tmpfs=10G
#$ -N telemac2d
#$ -pe mpi 32
#$ -wd /scratch/scratch/cceahke/telemac/gouttedo/t2d_gouttedo.cas_2019-10-21-13h12min45s
module unload compilers mpi
module load compilers/gnu/4.9.2
module load mpi/openmpi/3.1.4/gnu-4.9.2
module load metis/5.1.0/gnu-4.9.2
module load hdf/5-1.8.15/gnu-4.9.2
module load openblas/0.2.14/gnu-4.9.2
module load python2/recommended
gerun /scratch/scratch/cceahke/telemac/gouttedo/t2d_gouttedo.cas_2019-10-21-12h31min11s/out_user_fortran
exit
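Since there is no command-line flag for the mem/tmpfs limits, one hedged option would be to change them in the jobscript template in our cluster config. Assuming the files/telemac configs use TELEMAC's hpc_stdin mechanism (placeholder names like <wallclock> and <ncsize> follow that convention; the 4G/20G values below are purely illustrative), the relevant fragment of the systel cfg might look like:

```
# Hypothetical systel cfg fragment - check the actual files/telemac configs
# before relying on this; only the mem/tmpfs lines would need changing.
hpc_stdin: #!/bin/bash -l
    #$ -l h_rt=<wallclock>
    #$ -l mem=4G
    #$ -l tmpfs=20G
    #$ -N <jobname>
    #$ -pe mpi <ncsize>
```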
heatherkellyucl commented 5 years ago

My test version runs without MUMPS and was built with the GNU compiler, OpenMPI, OpenBLAS, METIS, HDF5 and Python 2. Judging from a list of dependencies sent by the devs, Python 3 should be fine, although the main site says:

To run TELEMAC-MASCARET the following software are mandatory:

  • Python 2.7.0 (installation_linux_python)
  • Numpy 1.8.3 (installation_linux_numpy)
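The devs' dependency list suggests Python 3 is fine despite the Python 2.7 note above. A quick sanity check of whichever interpreter ends up loaded (python3 shown here as an example; this is just an illustrative check, not part of the install):

```shell
# Report whether the interpreter and NumPy meet the stated minimums
# (Python >= 2.7, NumPy >= 1.8). Uses python3 as an example interpreter.
python3 - <<'EOF'
import sys
print("python ok:", sys.version_info >= (2, 7))
try:
    import numpy
    print("numpy:", numpy.__version__)
except ImportError:
    print("numpy: missing")
EOF
```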

These are Ubuntu directions for installing all the dependencies; use them as a dependency list.

# note this is info on how to install on Ubuntu. Confirmed working on the cis slave.

### as root
apt-get install openssh-server -y
systemctl enable ssh
systemctl start ssh
apt-get update
apt-get upgrade
apt-get install make cmake patch gfortran g++ libmpich-dev zlib1g-dev python3-dev python3-numpy python3-matplotlib python3-scipy subversion openjdk-8-jre-headless xvfb -y

# python3-pip probably not needed

##### python3-mayavi ??
##### maybe need numpy-devel or equivalent for the f2py source

### as telemac

#HDF5
mkdir -p /home/telemac/builds/hdf5
cd /home/telemac/builds/hdf5
wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.5/src/hdf5-1.10.5.tar.gz
tar xf hdf5-1.10.5.tar.gz
cd /home/telemac/builds/hdf5/hdf5-1.10.5
./configure --prefix=/home/telemac/hdf5-1.10.5 --enable-fortran --enable-cxx
make -j8
make install

#MED
mkdir -p /home/telemac/builds/MED
cd /home/telemac/builds/MED
wget http://files.salome-platform.org/Salome/other/med-4.0.0.tar.gz
tar xf med-4.0.0.tar.gz
cd /home/telemac/builds/MED/med-4.0.0
# seems to be looking in hdf5.../lib instead of lib64
ln -s /home/telemac/hdf5-1.10.5/lib64/ /home/telemac/hdf5-1.10.5/lib
PYTHON=/usr/bin/python3 ./configure --with-hdf5=/home/telemac/hdf5-1.10.5 --prefix=/home/telemac/MED/ --enable-fortran --with-f90=gfortran --with-gnu-ld
make -j8
make install

#metis
mkdir -p /home/telemac/builds/metis
cd /home/telemac/builds/metis
wget http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/metis-5.0.2.tar.gz
tar xf metis-5.0.2.tar.gz
cd /home/telemac/builds/metis/metis-5.0.2/
make config
make -j8
mkdir -p /home/telemac/metis-5.0.2/
cp /home/telemac/builds/metis/metis-5.0.2/build/Linux-x86_64/libmetis/libmetis.a /home/telemac/metis-5.0.2/

mkdir -p /home/telemac/mumps/
cd /home/telemac/mumps/
wget http://www.netlib.org/blacs/mpiblacs.tgz
wget http://www.netlib.org/blacs/blacstester.tgz
wget http://www.netlib.org/blacs/mpiblacs-patch03.tgz
tar zxf blacstester.tgz
tar zxf mpiblacs-patch03.tgz
tar zxf mpiblacs.tgz
cd /home/telemac/mumps/BLACS/
cp BMAKES/Bmake.MPI-LINUX Bmake.inc
patch Bmake.inc < ~/ubuntu.patches/mumps/Bmake.inc.patch
make -j8 mpi

cd /home/telemac/mumps
wget -O atlas3.10.2.tar.bz2 "http://downloads.sourceforge.net/project/math-atlas/Stable/3.10.2/atlas3.10.2.tar.bz2?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fmath-atlas%2Ffiles%2Flatest%2Fdownload%3Fsource%3Dfiles&ts=1429093679&use_mirror=heanet"
wget http://www.netlib.org/lapack/lapack-3.5.0.tgz
tar xf atlas3.10.2.tar.bz2
cd ATLAS/
mkdir build
cd build
../configure --prefix=/home/telemac/mumps/LAPACK/ --with-netlib-lapack-tarfile=/home/telemac/mumps/lapack-3.5.0.tgz --shared -b 64
make -j8 build

cd /home/telemac/mumps
wget http://www.netlib.org/blas/blas.tgz  # or wget http://www.netlib.org/blas/blas-3.8.0.tgz
tar xf blas.tgz
cd /home/telemac/mumps/BLAS-3.8.0
make all

cd /home/telemac/mumps
svn co https://icl.cs.utk.edu/svn/scalapack-dev/tags/scalapack-1.8.0 SCALAPACK
cd /home/telemac/mumps/SCALAPACK/
cp SLmake.inc.example SLmake.inc
patch SLmake.inc < ~/ubuntu.patches/mumps/SLmake.inc.patch
make  # parallel compile failed, so build serially without -j

cd /home/telemac/mumps
#register to download… http://mumps.enseeiht.fr/index.php?page=dwnld 
wget  http://mumps.enseeiht.fr/MUMPS_5.0.0.tar.gz
tar xf MUMPS_5.0.0.tar.gz
cd /home/telemac/mumps/MUMPS_5.0.0/
cp Make.inc/Makefile.INTEL.PAR Makefile.inc
patch Makefile.inc < ~/ubuntu.patches/mumps/MUMPS.Makefile.inc.patch
make -j6 all

mkdir -p /home/telemac/aed2
cd /home/telemac/aed2
svn co http://svn.opentelemac.org/svn/opentelemac/trunk/optionals/aed2/ .
make
heatherkellyucl commented 5 years ago

MED

MUMPS

aed2

heatherkellyucl commented 5 years ago

Need to build a serial GNU hdf/5-1.10.x for MED. I had been using hdf/5-1.8.15/gnu-4.9.2 for TELEMAC without MED.

heatherkellyucl commented 5 years ago

MED is now building.

heatherkellyucl commented 5 years ago

The MUMPS website http://mumps.enseeiht.fr is not loading.

heatherkellyucl commented 5 years ago

AED2 builds using gfortran, but its compile line only includes -Iinclude -I/usr/include; not sure if that will cause a problem.

heatherkellyucl commented 5 years ago

I think it probably makes sense to build aed2 as part of the telemac install and not as a separate module since it is from their svn without tags and there is nothing saying which version it is.

heatherkellyucl commented 5 years ago

MUMPS website working today, so requested download. (License is CeCILL-C, compatible with GPL).

heatherkellyucl commented 4 years ago

When I have time to do any more on this, it sounds like TELEMAC may need a newer GNU compiler in order to work correctly (the user install from IN:03923979 has been getting "Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation." at a reproducible point during a run, and a newer compiler was suggested by the local team).

This would mean building an openmpi for compilers/gnu/9.2.0 and redoing at least some of the above with that compiler (something appears to be linking libgfortran.so.3 specifically in the user install rather than libgfortran.so, so using the old modules with the newer compiler isn't working).

On checking, HDF5 definitely links libgfortran.so.3, so we will need a gnu-9.2.0 build of it. Need to see if any of the others do.

ldd /shared/ucl/apps/hdf/5-1.10.5/gnu-4.9.2/lib/libhdf5_fortran.so.102.0.0
        linux-vdso.so.1 =>  (0x00007ffc611a7000)
        libhdf5.so.103 => /shared/ucl/apps/hdf/5-1.10.5/gnu-4.9.2/lib/libhdf5.so.103 (0x00007f3737a28000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f3737800000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f37375fb000)
        libgfortran.so.3 => /shared/ucl/apps/gcc/4.9.2/lib/../lib64/libgfortran.so.3 (0x00007f37372dd000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f3736fdb000)
        libquadmath.so.0 => /shared/ucl/apps/gcc/4.9.2/lib/../lib64/libquadmath.so.0 (0x00007f3736d9c000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f37369d9000)
        libgcc_s.so.1 => /shared/ucl/apps/gcc/4.9.2/lib/../lib64/libgcc_s.so.1 (0x00007f37367c2000)
        /lib64/ld-linux-x86-64.so.2 (0x0000557e7e441000)
heatherkellyucl commented 4 years ago

From the above, these all link libgfortran.so.3:

openblas/0.2.14/gnu-4.9.2
hdf/5-1.10.5/gnu-4.9.2
med/4.0.0/gnu-4.9.2
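To find anything else that needs rebuilding, the same ldd check could be scripted over an install tree; find_old_gfortran_links below is a hypothetical helper, not an existing script:

```shell
# List shared libraries under a prefix whose ldd output mentions
# libgfortran.so.3, i.e. candidates for a rebuild with gnu-9.2.0.
# find_old_gfortran_links is a hypothetical helper name.
find_old_gfortran_links() {
    local prefix="$1"
    find "$prefix" -name 'lib*.so*' -type f 2>/dev/null | while read -r lib; do
        if ldd "$lib" 2>/dev/null | grep -q 'libgfortran\.so\.3'; then
            echo "$lib"
        fi
    done
}

# e.g. find_old_gfortran_links /shared/ucl/apps/med/4.0.0/gnu-4.9.2
```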
heatherkellyucl commented 4 years ago

Note: IN:04006714 also requests MUMPS.

heatherkellyucl commented 4 years ago

Building OpenMPI 3.1.5 for compilers/gnu/9.2.0.

Serial HDF5 1.10.5 for compilers/gnu/9.2.0.

Now informing IN:03923979 so they can test with their install.

MED for compilers/gnu/9.2.0.

heatherkellyucl commented 4 years ago

Current MUMPS is 5.2.1; I will build a version for the usual GNU compiler first, once I know how it needs building.

ETA: IN:04006714 needs sequential MUMPS, with METIS as the only requirement. That differs from TELEMAC, which appears to need the parallel version, though not a fully parallel one (at least based on the instructions above, which modify Makefile.INTEL.PAR).

This also needs the Intel compiler, not GNU.
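The sequential build might look like the following; this is a sketch only, assuming the 5.2.1 tarball ships a .SEQ counterpart to the Makefile.INTEL.PAR template patched above (the SEQ templates link MUMPS's bundled libseq MPI stubs instead of a real MPI):

```shell
# Hedged sketch, not a tested recipe: sequential MUMPS build.
# Makefile.INTEL.SEQ is assumed to exist alongside Makefile.INTEL.PAR.
cd MUMPS_5.2.1
cp Make.inc/Makefile.INTEL.SEQ Makefile.inc
# edit Makefile.inc: point LMETISDIR at the METIS install;
# libseq provides the MPI stubs, so no real MPI is linked
make all
```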

heatherkellyucl commented 4 years ago

MUMPS can be built using SCOTCH and/or METIS, or PT-SCOTCH and/or ParMetis for the parallel version. TELEMAC appears to use a version with METIS only. Need to check what IN:04006714 needed. (Done)

heatherkellyucl commented 4 years ago

OpenBLAS for compilers/gnu/9.2.0

MUMPS for compilers/gnu/9.2.0 (requires OpenBLAS)

heatherkellyucl commented 4 years ago

The OpenBLAS previously in use (openblas/0.2.14/gnu-4.9.2) was the native-threads version. openblas/0.3.7-native-threads/gnu-9.2.0 is the equivalent.

heatherkellyucl commented 4 years ago

All of the above are now built; what remains is TELEMAC itself, with aed2.

heatherkellyucl commented 4 years ago

List of all prereqs looks like this:

module unload compilers mpi
module load beta-modules
module load gcc-libs/9.2.0
module load compilers/gnu/9.2.0
module load mpi/openmpi/3.1.5/gnu-9.2.0
module load mumps/5.2.1/gnu-9.2.0
module load hdf/5-1.10.5/gnu-9.2.0
module load openblas/0.3.7-native-threads/gnu-9.2.0
module load python3/3.7
module load med/4.0.0/gnu-9.2.0