Proteus optimizes C/C++ code execution, including CUDA/HIP kernels, by applying runtime optimizations using Just-In-Time (JIT) compilation powered by LLVM.
The motivation is that JIT compilation exposes runtime information that is unavailable during ahead-of-time (AOT) compilation, for example, the exact runtime values of variables during program execution. Proteus uses runtime information to specialize generated code using JIT compilation can significantly speedup execution.
Proteus supports optimizing the execution of functions (or kernels for GPUs) using runtime specialization. Our cookie-cutter optimization through specialization at the moment is argument specialization. Proteus folds arguments of functions to runtime constants, which enables greater code optimization during JIT code generation by turbo-charging classical compiler optimizations, such as loop unrolling, control flow simplification, and constant propagation. For GPU kernels, Proteus additionally sets the kernel execution launch bounds dynamically, to optimize register allocation and GPU concurrency.
To inteface with Proteus, users annotate functions (or GPU kernels) for JIT
compilation using the generic attribute annotate
with the "jit" tag, to
specify a list of function arguments to specialize for.
For example:
__atribute__((annotate("jit", 1, 2)))
void daxpy(double A, int N, double *a, double *b)
{
for(int i=0; i<N; ++i)
a[i] = A*a[i] + b[i];
}
The attribute annotates the function daxpy
for JIT specialization folding the
function arguments 1 (A) and 2 (N) to runtime constants.
👉 Note the argument list is 1-indexed.
Proteus will create a new, unique specialization at runtime for each set of unique runtime values of the annotated function arguments. This means that the same function can have multiple, unique specializations corresponding to different runtime values. Proteus implements in-memory caching so it will compile once and re-use specializations to amortize the JIT compilation overhead. Also, Proteus implements a persistent cache which stores specializations on disk to make them re-usable across program runs.
Proteus supports host, HIP, and CUDA compilation using Clang/LLVM or compatible vendor variants. Proteus extracts the bitcode of annotated functions and instruments execution to read runtime values of arguments using a plugin LLVM pass. Runtime JIT compilation is implemented by the Proteus runtime library based on LLVM.
The project uses cmake
for building and depends on an LLVM installation
(upstream versions 16, 17 or AMD ROCm version 5.7.1 have been tested).
Check the top-level CMakeLists.txt
for the available build options.
The typical building process is:
mkdir -p build && cd build
cmake -DCMAKE_INSTALL_PREFIX=<install_path> ..
make install
The directory scripts
contains scripts to setup building on the tioga
and
lassen
machines at LLNL.
Run them at the top-level directory as source setup-<machine>.sh
to load
environment modules and create a build-<machine>
directory with a working
configuration for the machine.
⚠️ For lassen
, Proteus needs Clang CUDA compilation so we provide a script to
build LLVM in scripts/build-llvm-lassen.sh
to manually run before
scripts/setup-lassen.sh
.
Besides annotating the source code, users must modify the compilation of
their application to include the Proteus JIT plugin pass in the compilation
options and link with the JIT runtime library.
This is done by adding -fpass-plugin=<install_path>/lib/libProteusJitPass.so
to Clang compilation and extend linker flags to include the runtime library (preferrably
rpath-ed) as in -L <install_path>/lib -Wl,-rpath,<install_path>/lib -lproteusjit
.
An one-liner example is:
clang++ -fpass-plugin=<install_path>/lib/libiProteusJitPass.so -L <install_path>/lib -Wl,-rpath,<install_path>/lib -lproteusjit MyAwesomeCode.cpp -o MyAwesomeExe
🚧 A cmake export file for easy integration is in the works
We welcome contributions to Proteus in the form of pull requests targeting the
main
branch of the repo, as well as questions, feature requrests, or bug reports
via issues.
Please note that Proteus has a Code of Conduct. By participating in the Proteus community, you agree to abide by its rules.
Proteus was created by Giorgis Georgakoudis, georgakoudis1@llnl.gov.
Key contributors in code or design are David Beckingsale, beckingsale1@llnl.gov and Konstantinos Parasyris, parasyris1@llnl.gov.
Proteus is distributed under the terms of the Apache License (Version 2.0) with LLVM Exceptions.
All new contributions must be made under the Apache-2.0 with LLVM Exceptions license.
See LICENSE, COPYRIGHT, and NOTICE for details.
SPDX-License-Identifier: (Apache-2.0 WITH LLVM-exception)
LLNL-CODE-2000857