StreamHPC / gromacs

OpenCL porting of the GROMACS molecular simulation toolkit
http://www.gromacs.org

This repository is out of date. You can find the latest version of GROMACS with OpenCL support here: http://www.gromacs.org/Downloads
Check the manual for how to build GROMACS with OpenCL support enabled: http://manual.gromacs.org/documentation/5.1/install-guide/index.html#opencl-gpu-acceleration

This is a modified version of GROMACS 5.0 that adds OpenCL support. All CUDA-accelerated features have been ported to OpenCL 1.1.

Official GROMACS website: http://www.gromacs.org/

TABLE OF CONTENTS

  1. SUPPORTED OPENCL DEVICES

  2. GENERAL PROJECT SETUP

  3. OPENCL SETUP

  4. KNOWN LIMITATIONS

  5. TESTED CONFIGURATIONS

  1. SUPPORTED OPENCL DEVICES

    The current version works with NVIDIA GPUs and GCN-based AMD GPUs. Make sure that you have the latest drivers installed. Also check the "Known Limitations" chapter.

  2. GENERAL PROJECT SETUP

    Check the GROMACS website for how to build the project: http://www.gromacs.org/Documentation/Installation_Instructions
    You can find additional information here: https://github.com/StreamComputing/gromacs/wiki/GROMACS-GPU-ACCELERATION-USING-OPENCL#General_Project_Setup

  3. OPENCL SETUP

    Build Gromacs with OpenCL support enabled

    To build Gromacs with OpenCL support enabled, an OpenCL SDK must be installed and the following cmake flags must be set:
      GMX_GPU
      GMX_USE_OPENCL
    After setting these flags, if an OpenCL implementation has been detected on your system, the following cmake entries will be defined:
      OPENCL_INCLUDE_DIR - the OpenCL include directory
      OPENCL_LIBRARY - the OpenCL library directory
    The two paths will be automatically used to configure the project.
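    As a sketch of the configure step described above: this assumes an out-of-source build, a source checkout at the hypothetical location $HOME/gromacs, and that both flags are boolean options that can be set to ON. None of these details are stated in this README, so adjust them to your setup.

```shell
# Hypothetical location of the source checkout; adjust to your system.
GMX_SRC="${GMX_SRC:-$HOME/gromacs}"
# The two flags named above, assumed here to be boolean options set to ON.
CMAKE_FLAGS="-DGMX_GPU=ON -DGMX_USE_OPENCL=ON"

if [ -f "$GMX_SRC/CMakeLists.txt" ] && command -v cmake >/dev/null 2>&1; then
    # Configure in a separate build directory (run in a subshell so the
    # caller's working directory is left unchanged).
    ( mkdir -p build && cd build && cmake "$GMX_SRC" $CMAKE_FLAGS )
else
    # Nothing to configure here; just show the command that would be used.
    echo "Would run: cmake $GMX_SRC $CMAKE_FLAGS"
fi
```

    After configuring, OPENCL_INCLUDE_DIR and OPENCL_LIBRARY should appear among the cmake cache entries if an OpenCL implementation was detected.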

Run Gromacs with OpenCL accelerations enabled

Gromacs loads and builds the OpenCL kernels at runtime. To do so, it needs to know the location of the OpenCL source files. If you run the installed version, the path to the OpenCL files is defined automatically. If you do not wish to install Gromacs, but want to run the version built from sources, you need to provide the path to the source tree with the OpenCL kernels, as shown here:

    export GMX_OCL_FILE_PATH=/path-to-gromacs/src/

Caching options for building the kernels

Building an OpenCL program can take a significant amount of time. NVIDIA implements a mechanism to cache the result of the build. As a consequence, only the first build of the OpenCL kernels takes longer; subsequent builds are very fast. AMD drivers, on the other hand, implement no caching, and building a program can be very slow. That is why we have started implementing our own caching. Caching for OpenCL kernel builds is enabled by default. To disable it, set the GMX_OCL_NOGENCACHE environment variable.

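If you plan to modify the OpenCL kernels, you should disable any caching so that the kernels are rebuilt from your modified sources. The Gromacs cache is disabled by setting GMX_OCL_NOGENCACHE.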
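A minimal sketch of disabling caching in the current shell before a run. GMX_OCL_NOGENCACHE comes from this README; the value 1 is an arbitrary choice, since the text only requires the variable to be set. CUDA_CACHE_DISABLE is an NVIDIA driver variable that is not mentioned in this README, and the assumption that it also covers OpenCL kernel builds should be verified against your driver.

```shell
# Disable the Gromacs cache of OpenCL kernel builds.
# Only the presence of the variable is documented; 1 is an arbitrary value.
export GMX_OCL_NOGENCACHE=1

# Assumption: on NVIDIA systems the driver's own compute cache is controlled
# by CUDA_CACHE_DISABLE. Whether it affects OpenCL builds depends on the driver.
export CUDA_CACHE_DISABLE=1

echo "GMX_OCL_NOGENCACHE=$GMX_OCL_NOGENCACHE CUDA_CACHE_DISABLE=$CUDA_CACHE_DISABLE"
```

Unset both variables again once you have finished editing the kernels, otherwise every run pays the full build cost.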

OpenCL Device Selection

The same option used to select CUDA devices or to define a mapping of GPUs to PP ranks can also be used for OpenCL devices: -gpu_id
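For illustration, a sketch of passing -gpu_id to mdrun. The device index 0 and the input file name topol.tpr are placeholder values, not taken from this README; the command is only executed when gmx and the input file are actually present.

```shell
# Device index to use; 0 is an assumed example value.
GPU_ID="0"
# topol.tpr is a placeholder name for your run input file.
TPR="topol.tpr"

if command -v gmx >/dev/null 2>&1 && [ -f "$TPR" ]; then
    # Single PP rank on the selected device.
    gmx mdrun -s "$TPR" -gpu_id "$GPU_ID"
    # With two thread-MPI ranks, the id string holds one digit per PP rank,
    # e.g.: gmx mdrun -s "$TPR" -ntmpi 2 -gpu_id 01
else
    echo "Would run: gmx mdrun -s $TPR -gpu_id $GPU_ID"
fi
```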

Environment Variables For OpenCL

Currently, several environment variables exist that help customize some aspects of the OpenCL version of Gromacs. They are mostly related to the runtime compilation of OpenCL kernels, but they are also used in device selection.

GMX_OCL_FILE_PATH: The full path to the Gromacs src folder. Useful when gmx is called from a folder other than the installation/bin folder.

GMX_OCL_NOGENCACHE: Disable caching for OpenCL kernel builds.

GMX_OCL_NOFASTGEN: Generates and compiles all algorithm flavors; otherwise, only the flavor required for the simulation is generated and compiled.

GMX_OCL_FASTMATH: Adds the option -cl-fast-relaxed-math to the compiler options. In the CUDA version this is enabled by default, and it is likely that the OpenCL version will soon do the same.

GMX_OCL_DUMP_LOG: If defined, the OpenCL build log is always written to file. The file is saved in the current directory with the name OpenCL_kernel_file_name.build_status where OpenCL_kernel_file_name is the name of the file containing the OpenCL source code (usually nbnxn_ocl_kernels.cl) and build_status can be either SUCCEEDED or FAILED. If this environment variable is not defined, the default behavior is the following:

  4. KNOWN LIMITATIONS

    • OpenCL on Intel CPUs is not supported
    • OpenCL on Intel GPUs is not supported
    • The current implementation is not compatible with OpenCL devices that do not use warps/wavefronts or for which the warp/wavefront size is not a multiple of 32
    • The following kernels are known to produce incorrect results:
        nbnxn_kernel_ElecEwQSTab_VdwLJ_VF_prune_opencl
        nbnxn_kernel_ElecEwQSTab_VdwLJ_F_prune_opencl
    • Due to the blocking behavior of clEnqueue functions in the NVIDIA driver, there is almost no performance gain when using NVIDIA GPUs. A bug report has already been filed about this issue. A possible workaround would be to have a separate thread for issuing GPU commands. However, this has not been implemented yet.
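
    The warp/wavefront limitation in the list above can be checked roughly as sketched here. This assumes the third-party clinfo utility, which is not part of Gromacs and whose field names vary between versions; is_supported_width is a hypothetical helper introduced only for this example.

```shell
# Hypothetical helper: succeeds when the given warp/wavefront width
# is a multiple of 32, as required by the limitation above.
is_supported_width() {
    [ $(( $1 % 32 )) -eq 0 ]
}

# If clinfo is available, show the lines that mention the warp/wavefront
# width (the exact field names are an assumption and differ per version).
if command -v clinfo >/dev/null 2>&1; then
    clinfo | grep -i -E 'warp|wavefront' || echo "no warp/wavefront field reported"
fi

# Check a few example widths.
for width in 32 64 16; do
    if is_supported_width "$width"; then
        echo "width $width: multiple of 32"
    else
        echo "width $width: not a multiple of 32"
    fi
done
```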

  5. TESTED CONFIGURATIONS

    Tested devices:
      NVIDIA GPUs: GeForce GTX 660M, GeForce GTX 750Ti, GeForce GTX 780
      AMD GPUs: FirePro W5100, HD 7950, FirePro W9100, Radeon R7 M260, R9 290

    Tested kernels:

    Kernel                                          |Benchmark test                              |Remarks
    nbnxn_kernel_ElecCut_VdwLJ_VF_prune_opencl      |d.poly-ch2                                  |
    nbnxn_kernel_ElecCut_VdwLJ_F_opencl             |d.poly-ch2                                  |
    nbnxn_kernel_ElecCut_VdwLJ_F_prune_opencl       |d.poly-ch2                                  |
    nbnxn_kernel_ElecCut_VdwLJ_VF_opencl            |d.poly-ch2                                  |
    nbnxn_kernel_ElecRF_VdwLJ_VF_prune_opencl       |adh_cubic with rf_verlet.mdp                |
    nbnxn_kernel_ElecRF_VdwLJ_F_opencl              |adh_cubic with rf_verlet.mdp                |
    nbnxn_kernel_ElecRF_VdwLJ_F_prune_opencl        |adh_cubic with rf_verlet.mdp                |
    nbnxn_kernel_ElecEwQSTab_VdwLJ_VF_prune_opencl  |adh_cubic_vsites with pme_verlet_vsites.mdp |Failed
    nbnxn_kernel_ElecEwQSTab_VdwLJ_F_prune_opencl   |adh_cubic_vsites with pme_verlet_vsites.mdp |Failed
    nbnxn_kernel_ElecEw_VdwLJ_VF_prune_opencl       |adh_cubic_vsites with pme_verlet_vsites.mdp |
    nbnxn_kernel_ElecEw_VdwLJ_F_opencl              |adh_cubic_vsites with pme_verlet_vsites.mdp |
    nbnxn_kernel_ElecEw_VdwLJ_F_prune_opencl        |adh_cubic_vsites with pme_verlet_vsites.mdp |
    nbnxn_kernel_ElecEwTwinCut_VdwLJ_F_prune_opencl |adh_cubic_vsites with pme_verlet_vsites.mdp |
    nbnxn_kernel_ElecEwTwinCut_VdwLJ_F_opencl       |adh_cubic_vsites with pme_verlet_vsites.mdp |

Input data used for testing - Benchmark data sets available here: ftp://ftp.gromacs.org/pub/benchmarks

                           * * * * *

The GROMACS OpenCL team

StreamComputing - www.streamcomputing.eu
  Vincent Hindriksen - vincent@streamcomputing.eu
  Teemu Virolainen - teemu@streamcomputing.eu
  Anca Hamuraru - anca@streamcomputing.eu

Contributors
  Dimitrios Karkoulis - dimitris.karkoulis@gmail.com


GROMACS is free software, distributed under the GNU Lesser General Public License, version 2.1. However, scientific software is a little special compared to most other programs. You, we, and all other GROMACS users depend on the quality of the code, and when we find bugs (every piece of software has them) it is crucial that we can correct them and say that they were fixed in version X of the file or package release. For the same reason, it is important that you can reproduce other people's results from a certain GROMACS version.

The easiest way to avoid this kind of problem is to get your modifications included in the main distribution. We'll be happy to consider any decent code. If it's a separate program it can probably be included in the contrib directory straight away (not supported by us), but for major changes in the main code we would appreciate it if you first test that it works with (and without) MPI, threads, double precision, etc.

If you still want to distribute a modified version or use part of GROMACS in your own program, remember that the entire project must be licensed according to the requirements of the LGPL v2.1 license under which you received this copy of GROMACS. We request that it be clearly labeled as derived work. It should not use the name "official GROMACS", and you should make sure support questions are directed to you instead of the GROMACS developers. Sorry for the hard wording, but it is meant to protect YOUR research results!

                           * * * * *