The Legion Parallel Programming System
https://legion.stanford.edu
Apache License 2.0

Legion

Legion is a parallel programming model for distributed, heterogeneous machines.

Branches

The Legion team uses this repository for active development, so please make sure you're using the right branch for your needs:

Overview

Legion is a programming model and runtime system designed to decouple the specification of parallel algorithms from their mapping onto distributed heterogeneous architectures. Since running on the target class of machines requires distributing not just computation but data as well, Legion presents the abstraction of logical regions for describing the structure of program data in a machine-independent way. Programmers specify the partitioning of logical regions into subregions, which provides a mechanism for communicating both the independence and locality of program data to the programming system. Since the programming system has knowledge of both the structure of tasks and data within the program, it can aid the programmer in a host of problems that are commonly the burden of the programmer:

The Legion programming model is designed to abstract computations in a way that makes them portable across many different potential architectures. The challenge then is to make it easy to map the abstracted computation of the program onto actual architectures. At a high level, mapping a Legion program entails making two kinds of decisions:

  1. For each task: select a processor on which to run the task.
  2. For each logical region a task needs: select a memory in which to create a physical instance of the logical region for the task to use.
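
As a rough illustration of these two kinds of decisions, here is a minimal C++ sketch. Every name in it (ProcKind, MemKind, ToyTask, ToyMapping, map_task) is hypothetical and is not part of the Legion API; the policy shown (prefer a GPU when the task can run there, and place each region in the memory closest to the chosen processor) is an assumption made only for the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for machine resources; these are NOT Legion types.
enum class ProcKind { CPU, GPU };
enum class MemKind { SystemMem, GpuFrameBuffer };

// A toy task: a name, whether it can run on a GPU, and the logical regions it needs.
struct ToyTask {
  std::string name;
  bool has_gpu_variant;
  std::vector<std::string> regions;
};

// The result of mapping one task.
struct ToyMapping {
  ProcKind processor;                        // decision 1: where the task runs
  std::map<std::string, MemKind> instances;  // decision 2: one memory per region
};

// Assumed policy for illustration only: use a GPU when the task has a GPU
// variant, and create every physical instance in the memory nearest to the
// selected processor.
ToyMapping map_task(const ToyTask& task) {
  ToyMapping mapping;
  mapping.processor = task.has_gpu_variant ? ProcKind::GPU : ProcKind::CPU;
  const MemKind mem = (mapping.processor == ProcKind::GPU)
                          ? MemKind::GpuFrameBuffer
                          : MemKind::SystemMem;
  for (const std::string& region : task.regions) {
    mapping.instances[region] = mem;
  }
  return mapping;
}
```

Calling map_task on a task with a GPU variant and two regions yields a GPU processor choice and two instances in the GPU memory; a CPU-only task gets system memory instead.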

To facilitate this process, Legion introduces a novel runtime 'mapping' interface. Designing a programming system that is magically capable of making intelligent mapping decisions was explicitly a NON-goal of the Legion project. Instead, the mapping interface provides a declarative mechanism for the programmer to communicate mapping decisions to the runtime system without having to write the code that carries them out (e.g. the code to perform a copy or synchronization). Furthermore, because the mapping interface is dynamic, the programmer can make mapping decisions based on information that may only be available at runtime. This includes decisions based on:

All of this information is made available to the mapper via various mapper calls, some of which query the mapping interface while others simply communicate information to the mapper.

One very important property of the mapping interface is that no mapping decisions are capable of impacting the correctness of the program. Consequently, all mapping decisions made are only performance decisions. Programmers can then easily tune a Legion application by modifying the mapping interface implementation without needing to be concerned with how their decisions impact correctness. Ultimately, this makes it possible in Legion to explore whole spaces of mapping choices (which tasks run on CPUs or GPUs, or where data gets placed in the memory hierarchy) simply by enumerating all the possible mapping decisions and trying them.
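
Because mapping choices affect only performance, a search over them can be entirely mechanical. The following C++ sketch enumerates every CPU/GPU assignment for a small list of tasks and keeps the cheapest one. It is not Legion code: the Target and TaskCost types, the estimate_time cost model, and the move_cost parameter are all assumptions made for the example, and the cost function stands in for actually running the application under each mapping.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical processor choices for a task; not a Legion type.
enum class Target { CPU, GPU };

// Assumed per-task cost estimates, in arbitrary time units.
struct TaskCost {
  double cpu_time;
  double gpu_time;
};

// Assumed cost model: tasks run one after another, and data has to move
// (costing move_cost) whenever two consecutive tasks use different targets.
double estimate_time(const std::vector<TaskCost>& tasks,
                     const std::vector<Target>& choice, double move_cost) {
  double total = 0.0;
  for (std::size_t i = 0; i < tasks.size(); ++i) {
    total += (choice[i] == Target::GPU) ? tasks[i].gpu_time : tasks[i].cpu_time;
    if (i > 0 && choice[i] != choice[i - 1]) total += move_cost;
  }
  return total;
}

// Try all 2^n CPU/GPU assignments and keep the cheapest one.
// Only practical for a small number of tasks.
std::vector<Target> best_mapping(const std::vector<TaskCost>& tasks,
                                 double move_cost) {
  const std::size_t n = tasks.size();
  std::vector<Target> best(n, Target::CPU);
  double best_time = std::numeric_limits<double>::infinity();
  for (unsigned long long bits = 0; bits < (1ULL << n); ++bits) {
    std::vector<Target> choice(n);
    for (std::size_t i = 0; i < n; ++i) {
      // Bit i of the counter selects the target for task i.
      choice[i] = ((bits >> i) & 1ULL) ? Target::GPU : Target::CPU;
    }
    const double t = estimate_time(tasks, choice, move_cost);
    if (t < best_time) {
      best_time = t;
      best = choice;
    }
  }
  return best;
}
```

Every candidate in this search is a valid mapping, so the loop never needs a correctness check; it only compares costs.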

To make it easy to get a working program, Legion provides a default mapper implementation that uses heuristics to make mapping decisions. In general these decisions are good, but they cannot be optimal for every application and architecture. All calls in the mapping interface are C++ virtual functions that can be overridden, so programmers can extend the default mapper and override only the mapping calls that are impacting performance. Alternatively, an application can implement the mapping interface entirely from scratch.
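
The override pattern described above can be sketched in plain C++ as follows. This is a simplified model, not the real interface: SketchMapperInterface, SketchDefaultMapper, TunedMapper, and the two calls select_processor and select_priority are hypothetical names invented for the example, and the actual mapper calls in Legion's headers have different names and signatures.

```cpp
#include <string>

// Hypothetical, heavily simplified types; NOT the Legion mapper API.
enum class SketchProc { CPU, GPU };

struct SketchTask {
  std::string name;
  bool has_gpu_variant;
};

// Stand-in for a mapping interface made of virtual calls.
class SketchMapperInterface {
public:
  virtual ~SketchMapperInterface() = default;
  virtual SketchProc select_processor(const SketchTask& task) = 0;
  virtual int select_priority(const SketchTask& task) = 0;
};

// Stand-in for a default mapper that answers every call with a heuristic.
class SketchDefaultMapper : public SketchMapperInterface {
public:
  SketchProc select_processor(const SketchTask& task) override {
    return task.has_gpu_variant ? SketchProc::GPU : SketchProc::CPU;
  }
  int select_priority(const SketchTask&) override { return 0; }
};

// Application mapper: overrides only the call that matters for performance
// and inherits the default behavior for everything else.
class TunedMapper : public SketchDefaultMapper {
public:
  SketchProc select_processor(const SketchTask& task) override {
    // Assumed tuning decision: this (made-up) task is too small for a GPU.
    if (task.name == "small_reduction") return SketchProc::CPU;
    return SketchDefaultMapper::select_processor(task);
  }
};
```

A TunedMapper used through the base interface sends "small_reduction" to a CPU, while all other tasks and all other calls fall through to the default heuristics.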

For more details on the Legion programming model and its current implementation, we refer you to our Supercomputing paper.

http://theory.stanford.edu/~aiken/publications/papers/sc12.pdf

Contents

This repository includes the following contents:

Dependencies

To get started with Legion, you'll need:

Installing

Legion is currently compiled with each application. To try a Legion application, just call make in the directory in question. The LG_RT_DIR variable is used to locate the Legion runtime directory. For example:

git clone https://github.com/StanfordLegion/legion.git
export LG_RT_DIR="$PWD/legion/runtime"
cd legion/examples/circuit
make
./circuit

Makefile Variables

The Legion Makefile includes several variables which influence the build. These may either be set in the environment (e.g. DEBUG=0 make) or at the top of each application's Makefile.

Build Flags

In addition to Makefile variables, compilation is influenced by a number of build flags. These flags may be added to variables in the environment (or again set inside the Makefile).

Command-Line Flags

Legion and Realm accept command-line arguments for various runtime parameters. Below are some of the more commonly used flags:

The default mapper also has several flags for controlling the default mapping. See default_mapper.cc for more details.

Developing Programs

To start a new Legion application, make a new directory and copy apps/Makefile.template into your directory under the name Makefile. Fill in the appropriate fields at the top of the Makefile with the filenames needed for your application.

Most Legion APIs are described in legion.h; a smaller number are described in the various header files in the runtime/realm directory. The default mapper is available in default_mapper.h.

Debugging

Legion has a number of tools to aid in debugging programs.

Extended Correctness Checks

Compile with DEBUG=1 PRIVILEGE_CHECKS=1 BOUNDS_CHECKS=1 make and rerun the application. This enables dynamic checks for privilege and out-of-bounds errors in the application. (These checks are not enabled by default because they are relatively expensive.) If the application runs without terminating with an error, then continue on to Legion Spy.

Legion Spy

Legion provides a task-level visualization tool called Legion Spy. This captures the logical and physical dependence graphs. These may help, for example, as a sanity check to ensure that the correct sequence of tasks is being launched (and the tasks have the correct dependencies). Legion Spy also has a self-checking mode which can validate the correctness of the runtime's logical and physical dependence algorithms.

To capture a trace, invoke the application with -lg:spy -logfile spy_%.log. (No special compile-time flags are required.) This will produce a log file per node. Call the post-processing script to render PDF files of the dependence graphs:

./app -lg:spy -logfile spy_%.log
$LG_RT_DIR/../tools/legion_spy.py -dez spy_*.log

To run Legion Spy's self-checking mode, Legion must be built with the flag USE_SPY=1. Following this, the application can be run again, and the script used to validate (or render) the trace.

DEBUG=1 USE_SPY=1 make
./app -lg:spy -logfile spy_%.log
$LG_RT_DIR/../tools/legion_spy.py -lpa spy_*.log
$LG_RT_DIR/../tools/legion_spy.py -dez spy_*.log

Profiling

Legion contains a task-level profiler. No special compile-time flags are required. However, it is recommended to build with DEBUG=0 make to avoid any undesired performance issues.

To profile an application, run with -lg:prof <N> where N is the number of nodes to be profiled. (N can be less than the total number of nodes, in which case only a subset of the nodes is profiled.) Use the -lg:prof_logfile <logfile> flag to save the output from each node to a separate file. The argument to the -lg:prof_logfile flag follows the same format as for -logfile, except that a % (to be replaced by the node number) is mandatory. Finally, pass the resulting log files to legion_prof.py.

DEBUG=0 make
./app -lg:prof <N> -lg:prof_logfile prof_%.gz
$LG_RT_DIR/../tools/legion_prof.py prof_*.gz

This will generate a subdirectory called legion_prof under the current directory, including a file named index.html. Open this file in a browser.
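
To make the per-node file naming concrete, the C++ sketch below shows how a pattern such as prof_%.gz could expand to one file name per node. The helper expand_logfile_pattern is hypothetical and is not taken from Legion or Realm; replacing only the first % and rejecting patterns without one are assumptions of this sketch, not documented runtime behavior.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of Legion or Realm: expands a per-node log
// file pattern such as "prof_%.gz" by substituting the node number for '%'.
// Assumption: only the first '%' in the pattern is replaced.
std::string expand_logfile_pattern(const std::string& pattern, unsigned node) {
  const std::string::size_type pos = pattern.find('%');
  if (pos == std::string::npos) {
    // Mirrors the rule that '%' is mandatory for profiler log files.
    throw std::invalid_argument("log file pattern must contain '%'");
  }
  std::string result = pattern;
  result.replace(pos, 1, std::to_string(node));
  return result;
}
```

Under these assumptions, a two-node run with the pattern prof_%.gz would produce prof_0.gz and prof_1.gz, which is why the post-processing command uses the glob prof_*.gz.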

Other Features