This project implements a gym environment where an AI receives memory accesses from a trace of real application execution and chooses ranges of addresses to migrate from/to a DDR (slow) memory to/from a HBM (fast) memory with a limited capacity. Every time the AI moves data around, it is penalized, as in a real application execution, and its goal is to carefully chose the data to place into the fast HBM memory so as to minimize the application execution time.
The environment loop is summarized in below image and works as follow:
memai.preprocessing
and memai.observation
for
implementation details of observations.[1]: Rapid Execution Time Estimation for Heterogeneous Memory Systems through Differential Tracing, ISC 22'
./memai/__init__.py
:
Base module of imports to instanciate a gym environment: Preprocessing
to import a preprocessed application trace, and GymEnv
to instanciate an
environment with the loaded trace.
./memai/action.py
:
The definition of the environment action space and implementation of actions
execution. See module documentation for implementation details.
./memai/observation.py
:
The definition of the environment observation space and implementation of
the translation from timestamped memory accesses into an observation.
See module documentation for implementation details.
./memai/estimator.py
:
This module contains the implementation of the execution time estimation
for a set of memory
access arbitrarily mapped in HBM from execution times in DDR and HBM for
these accesses. This is used to provide the AI with feedback on its
mapping choices.
The module can be executed with to evaluate execution time against real executions to check the accuracy of estimation.
python memai/estimator.py --help
./memai/memory.py
:This module contain the implementation of the memory piece of the gym environment. The memory module keeps track of allocations/free in the HBM and statistics on the memory access. It also provides the exact amount of pages that actually needed to be allocated/freed when the AI actions leads to such requests.
The memory capacity can be set to emulate situations where the application dataset cannot fit entirely in the fast HBM memory. Performance attributes of the memory cannot be changed since the simulation relies on real executions performance with real memory performance.
./memai/preprocessing.py
:Script to convert raw traces of memory access into gym input traces of observations and associated data for execution time estimation.
python memai/preprocessing.py --help
./memai/traces.py
:Module to read and iterate through raw traces of applications memory accesses. Traces loaded with this module are used as input in the estimation script or preprocessing script.
./memai/interval.py
:Module to incrementally build the set of contiguous address regions
accessed by an application. This is used by the preprocessing script to
split windows of memory accesses into windows x regions
of memory
accesses and build an observation for each element of this space.
./memai/env.py
:This is the module containing the gym environment implementation. This script can be executed to unroll a trace based environment with simple actions.
./memai/env.py --help
./memai/ai.py
:This is an example script containing an AI implementation that can be trained and evaluated on input traces.
./memai/ai.py --help
./memai/options.py
:Common options for executable scripts.
Raw data collected from applications run.
See memai.traces.Trace
module for details on the content of the traces.
./utils/convert.sh
./utils/estimator_and_plotter.py
./utils/pebsview.py
./utils/plot_accesses.py
./utils/plot_csv.py
./utils/plot_csv_all.py
./utils/preprocessing.sh
: preprocess (with default arguments) all the
traces in ./data
into traces of observation that can be used as input to
the gym environment.
./utils/preprocessing.sh -h
./utils/ai.sh
: Train and evaluate an AI with a preset training set,
test set and default environment arguments (e.g HBM memory size).
./utils/ai.sh -h
Some scripts can be directly executed as unit tests:
python -m unittest memai.action memai.memory memai.observation memai.interval
or
python memai/action.py
python memai/observation.py
python memai/memory.py
python memai/interval.py
Integration tests here are more of a visual assertion of expected range of values. They do not have a ground truth value to check against.
python memai/estimator.py --ddr-input <file1> --hbm-input <file2> --hbm-ranges 0x000000000000-0x000000000001
Should estimate execution time equal to the real DDR execution time.
python memai/estimator.py --ddr-input <file1> --hbm-input <file2> --hbm-ranges 0x000000000000-0xffffffffffff
Should estimate execution time roughly to the real HBM execution time.
python memai/estimator.py --ddr-input <file1> --hbm-input <file2> --measured-input <file3>
Should estimate execution time with a reasonable error compared to the measured input.
python memai/env.py --input <file> --action all_ddr
Should estimate execution time equal to the real DDR execution time with
no allocations in the HBM, no allocation overflow and a lot of double free.
python memai/env.py --input <file> --action all_hbm
Should estimate execution time roughly to the real HBM execution time with
allocations in the HBM matching the application dataset size and no frees.
python memai/env.py --input <file> --action all_hbm --hbm-size 256
Should match the allocated data size and hbm size (If the application
data set size is bigger than 256MiB).
python memai/ai.py train --input <file> --model-dir <save-directory>
Should run the whole loading bar and save the resulting model.
python memai/ai.py eval --input <file> --model-dir <save-directory>
Should evaluate the simulated execution time with the trained model on the
same trace (hopefully the AI is efficient).