UCL-ARC / hpc-spack

Solutions - HPC's Spack config

Does Spack create a working OpenMPI 4.1.x? YES! #1

Open heatherkellyucl opened 1 year ago

heatherkellyucl commented 1 year ago

Check multi-node on Young/Kathleen/Michael in particular. (So far we have only had it working on Myriad and Thomas.)

heatherkellyucl commented 1 year ago

Related issues (because they depend on OpenMPI 4.1.x):

heatherkellyucl commented 1 year ago

As part of my HOOMD-blue investigation, I have ended up with an openmpi-4.1.4-gcc-4.9.2-j5watuf module in my Spack install on Young.

spack find -ldf

# the openmpi part:

j5watuf openmpi@4.1.4%gcc                                                                                                                                 
d2qmyox     hwloc@2.8.0%gcc                                                                                                                               
5lmv3ne         libpciaccess@0.16%gcc                                                                                                                     
6ztmz4g             util-macros@1.19.3%gcc                                                                                                                
rg74oss         libxml2@2.10.3%gcc                                                                                                                        
tvlwknx             libiconv@1.16%gcc                                                                                                                     
zqtvwgj             xz@5.2.7%gcc                                                                                                                          
bwqnwsu         ncurses@6.3%gcc                                                                                                                           
vybwgro     numactl@2.0.14%gcc                                                                                                                            
mhvtfkz         autoconf@2.69%gcc                                                                                                                         
7tai33o         automake@1.16.5%gcc                                                                                                                       
s46real         libtool@2.4.7%gcc                                                                                                                         
pc7f4sq         m4@1.4.19%gcc                                                                                                                             
vevpgoz             diffutils@3.8%gcc                                                                                                                     
oid6oac             libsigsegv@2.13%gcc                                                                                                                   
tnuhpj4     openssh@9.1p1%gcc                                                                                                                             
q3nmy4f         krb5@1.20.1%gcc                                                                                                                           
3jole6e             bison@3.8.2%gcc                                                                                                                       
rq32ugc             gettext@0.21.1%gcc                                                                                                                    
n5p4sxj                 tar@1.34%gcc                                                                                                                      
3zlhv77                     pigz@2.7%gcc                                                                                                                  
sifpfu7                     zstd@1.5.2%gcc                                                                                                                
7gtuxrg         libedit@3.1-20210216%gcc                                                                                                                  
pege64j         libxcrypt@4.4.31%gcc                                                                                                                      
iaefdyl         openssl@1.1.1s%gcc                                                                                                                        
bulswgh             ca-certificates-mozilla@2022-10-11%gcc                                                                                                
lksmiyk     perl@5.36.0%gcc                                                                                                                               
txaxkab         berkeley-db@18.1.40%gcc                                                                                                                   
i7forfu         bzip2@1.0.8%gcc                                                                                                                           
ikjdrtq         gdbm@1.23%gcc                                                                                                                             
g7ybkny             readline@8.1.2%gcc                                                                                                                    
bybst4r     pkgconf@1.8.0%gcc                                                                                                                             
2aqjdr4     pmix@4.1.2%gcc                                                                                                                                
6yztqjc         libevent@2.1.12%gcc                                                                                                                       
bwxsq6s     zlib@1.2.13%gcc 

I am testing a 2-node job on Young with c_mpi_pi from the pi_examples repo.

#!/bin/bash -l

# SGE resource requests: 10 minutes of wallclock, 1G memory, 80 MPI slots (a 2-node job)
#$ -l h_rt=0:10:0
#$ -l mem=1G
#$ -pe mpi 80

#$ -N pi_80_ompi-4.1.4
#$ -cwd
#$ -P Test
#$ -A Test_allocation

# swap the default compiler/MPI modules for GCC 4.9.2 and the Spack-built OpenMPI module
module unload -f compilers mpi
module load compilers/gnu/4.9.2

module use /home/cceahke/Scratch/spack/spack/share/spack/modules/linux-rhel7-broadwell
module load openmpi-4.1.4-gcc-4.9.2-j5watuf

# print the OpenMPI build configuration, then launch via gerun
ompi_info

gerun ./mpi_pi

(We probably do not need the compiler module loaded there at all.)

heatherkellyucl commented 1 year ago

I forgot to export GERUN_LAUNCHER=openmpi-sge, so gerun decided I had no MPI implementation. (We might also need to set GERUN_LAUNCHER=openmpi instead if the build doesn't have SGE integration.)
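For reference, a sketch of where this would go in the job script above; which value is right depends on how the OpenMPI build was configured:

# added before the gerun line in the job script
export GERUN_LAUNCHER=openmpi-sge   # for a build with SGE integration
# or, for a build without SGE integration:
export GERUN_LAUNCHER=openmpi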

heatherkellyucl commented 1 year ago

OK, it turns out it was built with --without-sge by default, so it only ran on one node.
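If we do want SGE support compiled in, a possible rebuild sketch; this assumes the Spack openmpi package exposes a schedulers variant that accepts sge, which is worth confirming first:

spack info openmpi                 # check which variants and scheduler values are available
spack install openmpi@4.1.4 %gcc@4.9.2 schedulers=sge 2>&1 | tee openmpi-4.1.4-sge-build.log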

heatherkellyucl commented 1 year ago

This time I set export GERUN_LAUNCHER=openmpi and it worked!

GERun: GErun command being run:
GERun:  mpirun -machinefile /tmpdir/job/820066.undefined/machines -np 80 ./mpi_pi 
Calculating PI using 80 processes...
Proc 18 says hello, is going to calculate slice 225000000-237499999
Proc 9 says hello, is going to calculate slice 112500000-124999999
Proc 34 says hello, is going to calculate slice 425000000-437499999
Proc 2 says hello, is going to calculate slice 25000000-37499999
Proc 57 says hello, is going to calculate slice 712500000-724999999
Proc 25 says hello, is going to calculate slice 312500000-324999999
Proc 58 says hello, is going to calculate slice 725000000-737499999
Proc 11 says hello, is going to calculate slice 137500000-149999999
Proc 0 says hello, is going to calculate slice 0-12499999
...
Proc 23 says hello, is going to calculate slice 287500000-299999999
Proc 39 says hello, is going to calculate slice 487500000-499999999
Proc 55 says hello, is going to calculate slice 687500000-699999999
The value of PI is 3.14159240526447
The time to calculate PI was 0.0770059 seconds

heatherkellyucl commented 1 year ago

I suppose one question is whether we should include SGE integration if we are replacing the scheduler. Gerun will look the same either way as long as we set GERUN_LAUNCHER appropriately; the difference is that when calling mpirun directly, people will need to specify $TMPDIR/machines as their machinefile, whereas with SGE integration they don't need to give one.

Or whether we rebuild things at that time with $OtherScheduler integration. (Or can we add both and all will be well?)
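To illustrate the difference for anyone calling mpirun directly rather than through gerun, a sketch based on the GERun command logged above ($NSLOTS is the slot count SGE provides):

# without SGE integration: point mpirun at the machinefile SGE writes under $TMPDIR
mpirun -machinefile $TMPDIR/machines -np $NSLOTS ./mpi_pi
# with SGE integration: OpenMPI discovers the allocated nodes itself
mpirun -np $NSLOTS ./mpi_pi

On the "add both" question: if the schedulers variant in the Spack openmpi package is multi-valued, it may be possible to enable more than one scheduler in a single build, but that would need checking against spack info openmpi.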

balston commented 1 year ago

To build OpenMPI with my GCC 12.x build, after adding the compiler to Spack, I'm running:

spack install openmpi %gcc@12.2.0 2>&1 | tee OpenMPI-build.log
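For completeness, a sketch of the compiler-registration step mentioned above, assuming the GCC 12.2.0 build is made available via a module (the module name here is hypothetical):

module load gcc/12.2.0    # hypothetical module name for the local GCC 12.x build
spack compiler find       # registers the compiler in compilers.yaml
spack compilers           # confirm gcc@12.2.0 is now listed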

balston commented 1 year ago

I've now got the following installed on Young:

-- linux-rhel7-cascadelake / gcc@12.2.0 -------------------------
autoconf@2.69                       libevent@2.1.12    openssl@1.1.1s
automake@1.16.5                     libiconv@1.16      perl@5.36.0
berkeley-db@18.1.40                 libpciaccess@0.16  pigz@2.7
bison@3.8.2                         libsigsegv@2.13    pkgconf@1.8.0
bzip2@1.0.8                         libtool@2.4.7      pmix@4.1.2
ca-certificates-mozilla@2022-10-11  libxcrypt@4.4.33   readline@8.1.2
diffutils@3.8                       libxml2@2.10.3     tar@1.34
gdbm@1.23                           m4@1.4.19          util-macros@1.19.3
gettext@0.21.1                      ncurses@6.3        xz@5.2.7
hwloc@2.8.0                         numactl@2.0.14     zlib@1.2.13
krb5@1.20.1                         openmpi@4.1.4      zstd@1.5.2
libedit@3.1-20210216                openssh@9.1p1

and:

module avail openmpi
- /lustre/scratch/ccaabaa/apps/spack-test/spack/share/spack/modules/linux-rhel7-cascadelake -
openmpi-4.1.4-gcc-12.2.0-irwlhs3
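A quick sanity-check sketch for the new module; the gridengine line should only appear if the build was configured with SGE support, which this one probably was not, given the earlier default:

module load openmpi-4.1.4-gcc-12.2.0-irwlhs3
mpicc --version                   # should report gcc 12.2.0
ompi_info | grep -i gridengine    # present only in an SGE-integrated build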

balston commented 1 year ago

The OpenMPI build on Myriad failed partway through; I need to check tomorrow.