Setting up an MPI cluster on Upcloud with CentOS 8: Part a
We describe how to set up a cluster that allows for
parallel computation on Upcloud. Similar material is available
in [1] and [2].
Log in to the Upcloud web interface and set up two virtual machines.
We assume each virtual machine has 1 CPU, 1 GB of RAM and 10 GB of
hard disk space. If you expect to use MPI4PY, 2 GB of RAM
will be helpful, as the installation process will then be faster.
Once you have obtained your passwords, log in to the machines
ssh root@ip.address.machine1
ssh root@ip.address.machine2
On both machines, change the root password, and then create a user that can log in
passwd
useradd paralleluser
passwd paralleluser
Add the user to the wheel group to have sudo rights
usermod -aG wheel paralleluser
Then disable root login
nano /etc/ssh/sshd_config
Change the line PermitRootLogin yes to PermitRootLogin no.
Enable SELinux
nano /etc/selinux/config
Change the line SELINUX=permissive to SELINUX=enforcing. SELinux
ensures that processes access only the data they are permitted to use. You can find out
more about SELinux in [3].
Then reboot the machines
reboot
Log back into the machines
ssh paralleluser@ip.address.machine1
and in a separate terminal
ssh paralleluser@ip.address.machine2
On both machines update the software then install compilers
and other base computing components
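The exact commands for this step depend on what you plan to build. A minimal sketch follows, assuming the dnf package manager shipped with CentOS 8. The package selection is an assumption rather than a list taken from the original setup: the Development Tools group, a Fortran compiler, an OpenJDK development package for the Java bindings, and the Python 3 headers for MPI4PY. The run helper is a hypothetical convenience that only prints each command unless APPLY=1 is set, so the list can be reviewed before anything is changed.

```shell
# Dry-run helper (hypothetical): print each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Update all installed packages.
run sudo dnf -y update
# C/C++ compilers, make and related build tools.
run sudo dnf -y groupinstall "Development Tools"
# Assumed extras: Fortran compiler, JDK, Python 3 headers, wget and nano.
run sudo dnf -y install gcc-gfortran java-1.8.0-openjdk-devel python3-devel wget nano
```

Run it once as shown to see the commands, then again with APPLY=1 on both machines to execute them.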
In cases where performance is critical and time allows, you are advised
to build the compilers yourself choosing appropriate options rather than
using the packaged compilers. This can allow you to explore newer compiler
versions, as well as alternative compilers such as Clang and Flang.
Then obtain OpenMPI and install it
wget https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.5.tar.gz
tar -xvf openmpi-4.0.5.tar.gz
cd openmpi-4.0.5
mkdir build
cd build
../configure --enable-mpi-java --enable-mpi-fortran
make
sudo make install
cd ..
Enable passwordless ssh between the two machines by creating ssh keys
without passphrases (press Enter at the passphrase prompts) and exchanging
these keys between the machines. On the first machine
ssh-keygen -t rsa
ssh-copy-id ip.address.machine2
On the second machine
ssh-keygen -t rsa
ssh-copy-id ip.address.machine1
To improve performance, it can be helpful to use a separate network
for MPI communication. You will need to stop the machines
to do this, so first log out of both machines
exit
Within the Upcloud web interface, create an internal network for your
two virtual machines. Your machines should then get IP addresses
on the internal network. Once these internal IP addresses have been
attached, restart the virtual machines and log in
ssh paralleluser@ip.address.machine1
and in a separate terminal
ssh paralleluser@ip.address.machine2
This will ensure that you authenticate the ECDSA key fingerprints.
The default configuration of CentOS 8 on Upcloud has a firewall running.
OpenMPI can use a large number of ports for communication, so you
need to put the IP addresses of the communicating processes in the trusted zone
of the firewall configuration. Since your virtual machines are accessible
from the public internet, it is advisable to keep the firewall running.
If you expect your cluster to not be accessible from the public internet,
except perhaps through some gateway node, you can turn off your firewall.
Here, we will assume the firewall is left on.
Add the IP address of the other machine on the internal network to the trusted
firewall zone: on the first machine add internal.ip.address.machine2, and on the
second machine add internal.ip.address.machine1.
If you are not using an internal network, replace internal.ip.address.machine1 and
internal.ip.address.machine2 with the IP addresses you used to log in,
ip.address.machine1 and ip.address.machine2.
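One way to do this is sketched below, assuming the firewall is firewalld managed through firewall-cmd (the CentOS 8 default). PEER and the run helper are hypothetical names introduced for illustration. The helper only prints each command unless APPLY=1 is set.

```shell
# Dry-run helper (hypothetical): print each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Address of the other machine; replace the placeholder before applying.
PEER="internal.ip.address.machine2"

# Permanently add the peer to the trusted zone.
run sudo firewall-cmd --permanent --zone=trusted --add-source="$PEER"
# Reload so the permanent rule becomes active.
run sudo firewall-cmd --reload
# List the sources currently in the trusted zone.
run sudo firewall-cmd --zone=trusted --list-sources
```

On the second machine, set PEER to internal.ip.address.machine1 instead. The last command should list the peer address among the trusted sources.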
Check that passwordless ssh works in both directions. On the first machine, log in to the
second machine and then exit from the second machine.
ssh internal.ip.address.machine2
exit
On the second machine, log in to the first machine and then exit from the first machine
ssh internal.ip.address.machine1
exit
The next step is to run an example program using MPI. The MPI library needs to know what
machines it can use. This information is provided in a hostfile, which you need to create.
The hostfile is only needed on one machine from which you will run the MPI programs, but
to allow launching of MPI programs from either machine, it is helpful to also have the
hostfile on the second machine.
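A minimal sketch of such a hostfile, assuming the Open MPI hostfile syntax of one host per line with an optional slots count. The slots=1 value is an assumption based on the single CPU per virtual machine, and HOSTFILE is a hypothetical variable used only for illustration.

```shell
# Location of the hostfile; defaults to the home directory.
HOSTFILE="${HOSTFILE:-$HOME/hostfile}"

# One line per machine; slots is the number of processes to place on that host.
# Replace the placeholder addresses with the real ones.
cat > "$HOSTFILE" <<'EOF'
internal.ip.address.machine1 slots=1
internal.ip.address.machine2 slots=1
EOF

# Show what was written.
cat "$HOSTFILE"
```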
Then test that you can run a parallel program. As a first example, obtain the hostname of
each of the machines. Launch the MPI job from one of the machines using
mpirun -np 2 --hostfile ./hostfile hostname
where it is assumed that the hostfile is in your home directory.
It is also good to test that a compiled program will run. On each machine, create a directory
called mpijava, get an example Hello World Java program, compile it and run it.
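The steps above can be sketched as follows. Hello.java here is a hypothetical stand-in written for illustration, not the example program used originally. It assumes the Open MPI Java bindings enabled earlier with --enable-mpi-java, which provide the mpijavac wrapper and the mpi.MPI class.

```shell
mkdir -p "$HOME/mpijava"
cd "$HOME/mpijava" || exit 1

# Hypothetical Hello World using the Open MPI Java bindings.
cat > Hello.java <<'EOF'
import mpi.MPI;

public class Hello {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                        // start MPI
        int rank = MPI.COMM_WORLD.getRank();   // id of this process
        int size = MPI.COMM_WORLD.getSize();   // total number of processes
        System.out.println("Hello from rank " + rank + " of " + size);
        MPI.Finalize();                        // shut MPI down
    }
}
EOF

# Compile only if the Open MPI Java compiler wrapper is available.
if command -v mpijavac >/dev/null 2>&1; then
    mpijavac Hello.java
else
    echo "mpijavac not found; build Open MPI with --enable-mpi-java first"
fi

# After compiling on both machines, launch from one of them with:
#   mpirun -np 2 --hostfile "$HOME/hostfile" java Hello
```

Compiling on both machines before launching matters because each process loads Hello.class from its own local mpijava directory.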
If you expect to use programs written in Python, it can be helpful to install MPI4PY. You need
to do this on both machines. MPI4PY expects an executable called python, but CentOS 8 provides
only an executable called python3, so soft link this to python and then install MPI4PY
sudo ln -s /usr/bin/python3 /usr/bin/python
If you have less than 2 GB of RAM per core, you will need to create swap space to hold
temporary data that does not fit in memory when installing MPI4PY.
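A minimal sketch of creating swap space and then installing MPI4PY. It assumes a 2 GB swap file at /swapfile and a per-user pip installation; the size, path and pip options are assumptions, not values from the original setup. The run helper is hypothetical and only prints each command unless APPLY=1 is set.

```shell
# Dry-run helper (hypothetical): print each command unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Create a 2 GB file filled with zeros to use as swap (assumed size and path).
run sudo dd if=/dev/zero of=/swapfile bs=1M count=2048
# Restrict access to root, as swapon expects.
run sudo chmod 600 /swapfile
# Format the file as swap and enable it until the next reboot.
run sudo mkswap /swapfile
run sudo swapon /swapfile

# Build and install MPI4PY for the current user; mpicc must be on the PATH.
run python3 -m pip install --user mpi4py
```

Skip the swap commands on machines that already have enough memory, and repeat the installation on both machines.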
In addition to the MPI4PY documentation [4], an introduction to parallel programming with Python can be
found in [5], [6] and [7]. As a first step, check that Hello World works. On both machines
cd $HOME
mkdir python
cp hostfile python
cd python
wget https://people.sc.fsu.edu/~jburkardt/py_src/hello_mpi/hello_mpi.py
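Once the script is in place on both machines, the job can be launched from one of them. A minimal sketch, assuming two processes (one per machine) and the hostfile copied into the python directory above. NP, CMD and the guard around mpirun are additions for illustration, so that the snippet only prints the command on a machine without Open MPI.

```shell
# Number of MPI processes to start (assumed: one per virtual machine).
NP=2
# Command to launch, run from inside the python directory.
CMD="mpirun -np $NP --hostfile ./hostfile python ./hello_mpi.py"

if command -v mpirun >/dev/null 2>&1; then
    # Unquoted on purpose so the string is split into command and arguments.
    $CMD
else
    echo "mpirun not found; would run: $CMD"
fi
```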
Your cluster is operational! You can try other programming languages that can use MPI, such
as C and Fortran. You are also encouraged to look at other parallel programming languages such as
Co-Array Fortran, XcalableMP, PCJ, X10 and UPC.
References
[1] https://glennklockwood.blogspot.com/2013/04/quick-mpi-cluster-setup-on-amazon-ec2.html
[2] https://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/
[3] https://wiki.centos.org/HowTos/SELinux
[4] https://mpi4py.readthedocs.io/en/stable/tutorial.html#
[5] http://calculquebec.github.io/cq-formation-advanced-python/ul-20160216/index.html
[6] https://deapsecure.gitlab.io/deapsecure-lesson06-par/10-distributed-memory-model/index.html