Author: Markus Höhnerbach <hoehnerbach@aices.rwth-aachen.de>
Date: 4 Aug 2016
This project provides the source code of a vectorized implementation of the Tersoff potential. We target a variety of processors with conventional vector instruction sets such as NEON, SSE, AVX, and AVX2, the first and second generation of the Xeon Phi accelerator, as well as NVIDIA GPUs. There is experimental support for platform-agnostic vectorization through the Cilk array notation.
Supported compilers: ICC 14.0, 15.0 or 16.0, GCC (ARM)
Supported MPI: Intel MPI
The code builds upon the existing Xeon Phi support and vectorization capabilities of the USER-INTEL LAMMPS package as well as the GPU support from the KOKKOS package.
benchmarks/
    vect/
        very simple benchmark to measure vectorization efficiency.
    lammps/
        input files, parameter files and scripts to conduct
        benchmarking and accuracy tests. Subfolders contain
        results from real-world systems.
machines/
lammps-10Mar16/
    complete LAMMPS source code that is certain to work
    with the provided source code.
To try this code out, download LAMMPS from lammps.sandia.gov, and extract the files to some directory $LAMMPS_DIR. In the following, $THIS denotes the directory where this README is located. You need to enable the packages MANYBODY, USER-OMP and USER-INTEL:
$ cd $LAMMPS_DIR/src
$ make yes-MANYBODY yes-USER-OMP yes-USER-INTEL
Copy the files pair_tersoff_intel.h, pair_tersoff_intel.cpp and intel_intrinsics.h from $THIS/src/ to $LAMMPS_DIR/src.
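The copy step can be scripted so that a missing file is caught before the build starts. The following is a minimal sketch and not part of this project: the function name copy_tersoff_sources is made up for illustration, and it takes the two directories described above ($THIS and $LAMMPS_DIR) as arguments.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with this project): copy the three source
# files named above from <this>/src into <lammps>/src, and fail early if a
# file or the target directory is missing.
copy_tersoff_sources() {
    src_dir="$1/src"    # corresponds to $THIS/src
    dst_dir="$2/src"    # corresponds to $LAMMPS_DIR/src
    files="pair_tersoff_intel.h pair_tersoff_intel.cpp intel_intrinsics.h"

    if [ ! -d "$dst_dir" ]; then
        echo "not a LAMMPS source directory: $dst_dir" >&2
        return 1
    fi
    # Check everything first so that a partial copy never happens.
    for f in $files; do
        if [ ! -f "$src_dir/$f" ]; then
            echo "missing source file: $src_dir/$f" >&2
            return 1
        fi
    done
    for f in $files; do
        cp "$src_dir/$f" "$dst_dir/" || return 1
    done
}
```

With both variables set, it would be called as: copy_tersoff_sources "$THIS" "$LAMMPS_DIR"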
Build LAMMPS (make sure to have ICC with offloading support and Intel MPI loaded):
$ make intel_phi
This creates a binary $LAMMPS_DIR/src/lmp_intel_phi.
To test this binary, use the provided test-script:
$ cd $THIS/test
$ python test.py $LAMMPS_DIR/src/lmp_intel_phi
All the tests should turn green.
For further usage instructions, please have a look at the documentation of the USER-INTEL package. The code neatly plugs into that framework; all you need to do is
If you just want to try out the code and make some observations on its performance, the easiest way to do so is to download the LAMMPS-provided benchmark for the Tersoff potential and pass the correct options via the command line.
$ wget http://lammps.sandia.gov/bench/bench_tersoff.tar.gz
$ tar xfz bench_tersoff.tar.gz
$ cd tersoff
$ $LAMMPS_DIR/src/lmp_intel_phi -in in.tersoff -pk omp 0 \
    -pk intel 1 balance $BALANCE mode $MODE -sf intel
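The benchmark command leaves $BALANCE and $MODE for you to choose. To the best of our reading of the USER-INTEL package documentation, mode accepts single, mixed or double, and balance is the fraction of work offloaded to the coprocessor, with -1 requesting dynamic load balancing; please check this against the documentation of your LAMMPS version. The sketch below only prints one command line per precision mode so that they can be inspected first. The function name tersoff_bench_cmd and the fallback of -1 for the balance are choices made for this example, not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: print the benchmark command line for a given binary,
# precision mode and offload balance. Nothing is executed here.
tersoff_bench_cmd() {
    bin="$1"
    mode="$2"
    balance="${3:--1}"    # assumed default: -1 (dynamic load balancing)
    printf '%s -in in.tersoff -pk omp 0 -pk intel 1 balance %s mode %s -sf intel\n' \
        "$bin" "$balance" "$mode"
}

# Falls back to the current directory if LAMMPS_DIR is not set.
LMP="${LAMMPS_DIR:-.}/src/lmp_intel_phi"

# One command line per precision mode, using $BALANCE if it is set.
for MODE in single mixed double; do
    tersoff_bench_cmd "$LMP" "$MODE" "${BALANCE:--1}"
done
```

Saved to a file and run from inside the tersoff directory, the printed lines can be piped to sh to execute the three runs one after another.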
For in-depth benchmarking, build all the binaries that you would like to investigate (machines//build.sh show how to build a variety of targets). For single-node benchmarking, benchmarks/lammps contains shell scripts to conduct a number of experiments. For multi-node benchmarking, machines/lrz-ib_phi contains a Python script that shows how to create job scripts for submission to a batch system. If you can't run the code on suitable machines, check out the result folders, namely benchmarks/lammps/results and machines/lrz-ib_phi/run*, as they contain real-world data from a selection of machines.
The code inherits all the limitations of the USER-INTEL and KOKKOS packages; please consult their documentation for details.
There is a preprint describing this work on arXiv.org: https://arxiv.org/abs/1607.02904
The code is licensed in accordance with the LAMMPS copyright under the GNU General Public License Version 2 onwards. The vector math functions in vector_math_neon.h are copyrighted by Julien Pommier under the zlib license.