radical-experiments / AIMES-Experience

Experiments for the AIMES practice paper
MIT License


Measuring scatter of task executions across diverse distributions of resources.

Related paper at: https://bitbucket.org/shantenujha/aimes

Experimental Workflow

  1. Prerequisites: Python 2.7; pip; git; radical-pilot

  2. Clone this repository:

    git clone https://github.com/radical-experiments/AIMES-Experience.git
  3. Install RADICAL Cybertools:

    virtualenv ~/ve/aimes-experience
    . ~/ve/aimes-experience/bin/activate
    git clone git@github.com:radical-cybertools/radical.pilot.git
    cd radical.pilot; git checkout experiment/aimes; git pull; pip install --upgrade . ; cd ..
    git clone git@github.com:radical-cybertools/radical.utils.git
    cd radical.utils; git checkout experiment/aimes; git pull; pip install --upgrade . ; cd ..
    git clone git@github.com:radical-cybertools/saga-python.git
    cd saga-python; git checkout experiment/aimes; git pull; pip install --upgrade . ; cd ..
  4. Move into the AIMES-Experience directory.

  5. Edit experiment.py, setting the following global variables to values appropriate for your experiment:

    N_UNITS = 2048
    U_CORES = 1
    U_TIME = 15
    RESOURCE = 'xsede.comet'
    N_PILOTS = 4
    P_CORES = 512
    P_WALLTIME = 75

    Note: SSH key-based, passwordless access to the chosen resource(s) is required.

  6. Set up your execution environment:

    . setup.sh
  7. Run the experiment:

    python experiment.py
  8. Download the session for the experiment:

    radicalpilot-close-session -m export -d mongodb:// -s rp.session.xxxx.xxxx.xxxx.xxxx.xxxx
  9. Upon success of the previous command, create a directory runn, where n is a unique, incrementing integer identifying the experiment.

  10. Create a file inside runn called metadata.json with the following information:

    {
      "n_tasks": <int>,
      "n_cores": <int>,
      "pilots": [
        [<int n_cores>, <walltime>],
        ...
      ],
      "resources": [
      ],
      "cores": [
        [<int tasks>, <int n_cores>],
        ...
      ],
      "durations": [
        [<int tasks>, <int duration>],
        ...
      ]
    }

    For example:

    {
      "n_tasks": 2048,
      "n_cores": 2048,
      "pilots": [
        [512, 75],
        [512, 75],
        [512, 75],
        [512, 75]
      ],
      "resources": [
      ],
      "cores": [
        [2048, 1]
      ],
      "durations": [
        [2048, 15]
      ]
    }


    • Durations are in minutes.
    • "cores" and "durations" describe partitions of the set of tasks. At the moment, every task uses 1 core and runs for 15 minutes, but later experiments will need more complex distributions of cores and durations.
  11. Copy the .prof, .json, and log file into the runn directory:

    cp rp.session.xxxx.xxxx.xxxx.xxxx.xxxx.prof rp.session.xxxx.xxxx.xxxx.xxxx.xxxx.json logs/radical_debug.log runn/
  12. Commit the new runn directory, then pull and push the repository.
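
Steps 9 and 10 can be scripted. The following is a minimal sketch, not part of this repository: the helpers `next_run_dir`, `build_metadata`, and `write_metadata` are hypothetical names. It assumes that run directories are numbered from 1, that "n_cores" is the total number of cores across all pilots, and that all pilots and all tasks are identical, as in the example above. The "resources" list is passed through unchanged from the caller.

```python
import json
import os
import re


def next_run_dir(base="."):
    """Return the path of the next run directory under `base`.

    Assumption: directories are named run1, run2, ... and the next one
    is numbered one above the highest existing run.
    """
    matches = (re.match(r"^run(\d+)$", name) for name in os.listdir(base))
    numbers = [int(m.group(1)) for m in matches if m]
    return os.path.join(base, "run%d" % (max(numbers) + 1 if numbers else 1))


def build_metadata(n_units, u_cores, u_time, n_pilots, p_cores, p_walltime,
                   resources):
    """Build the metadata dictionary from the experiment.py globals.

    Durations and walltimes are in minutes.
    """
    return {
        "n_tasks": n_units,
        # Assumption: total cores summed over all pilots.
        "n_cores": n_pilots * p_cores,
        # One [cores, walltime] entry per pilot.
        "pilots": [[p_cores, p_walltime] for _ in range(n_pilots)],
        "resources": list(resources),
        # A single partition: every task has the same cores and duration.
        "cores": [[n_units, u_cores]],
        "durations": [[n_units, u_time]],
    }


def write_metadata(run_dir, metadata):
    """Create `run_dir` if needed and write metadata.json into it."""
    if not os.path.isdir(run_dir):
        os.makedirs(run_dir)
    path = os.path.join(run_dir, "metadata.json")
    with open(path, "w") as handle:
        json.dump(metadata, handle, indent=2)
    return path
```

Calling `build_metadata(2048, 1, 15, 4, 512, 75, resources)` and passing the result to `write_metadata(next_run_dir(), ...)` would produce a file with the same "pilots", "cores", and "durations" entries as the example in step 10.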

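Before submitting a run, it can help to check that the values chosen in step 5 are mutually consistent. The sketch below is illustrative only: `check_parameters` is a hypothetical helper, not part of experiment.py. It assumes identical pilots and identical tasks, and it ignores queue waiting time and pilot startup overhead, so the walltime it reports is a lower bound rather than a recommendation.

```python
import math


def check_parameters(n_units, u_cores, u_time, n_pilots, p_cores, p_walltime):
    """Compare what the tasks request with what the pilots provide.

    Times are in minutes. Returns a dictionary of derived quantities.
    """
    # Number of tasks that can run at the same time across all pilots.
    concurrent = n_pilots * (p_cores // u_cores)
    if concurrent == 0:
        raise ValueError("a single task needs more cores than one pilot has")

    # Number of sequential rounds needed to run every task.
    generations = int(math.ceil(n_units / float(concurrent)))

    # Lower bound on walltime: overheads are deliberately ignored.
    min_walltime = generations * u_time

    return {
        "pilot_cores": n_pilots * p_cores,
        "requested_cores": n_units * u_cores,
        "generations": generations,
        "min_walltime": min_walltime,
        "fits": min_walltime <= p_walltime,
    }
```

With the values listed in step 5 (2048 single-core tasks of 15 minutes on 4 pilots of 512 cores), all tasks can run concurrently in one generation, so the 75-minute pilot walltime leaves 60 minutes of headroom for queueing and startup.
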
Data Analysis Workflow