Lightweight Distributed ML Experiments Management 🛠️

Coming up with the right hypothesis is hard - testing it should be easy.

ML researchers need to coordinate different types of experiments on separate remote resources. The Machine Learning Experiment (MLE)-Toolbox facilitates this workflow by providing a simple interface, standardized logging, and many common ML experiment types (multi-seed/multi-configuration runs, grid searches, and hyperparameter optimization pipelines). You can run experiments on your local machine, on high-performance compute clusters (Slurm and Sun Grid Engine), and on cloud VMs (GCP). Results are archived (locally or in a GCS bucket) and can be retrieved or automatically summarized in a report.

What Does The `mle-toolbox` Provide? 🧑‍🔧

- API for launching jobs on cluster/cloud computing platforms (Slurm, GridEngine, GCP).
- Common machine learning research experiment setups:
  - Launching and collecting multiple random seeds in parallel, in batches, or asynchronously.
  - Hyperparameter searches: Random, Grid, SMBO, PBT, Nevergrad, etc.
  - Pre- and post-processing pipelines for data preparation/result visualization.
- Automated report generation for hyperparameter search experiments.
- Storage/retrieval of results and the experiment database in a Google Cloud Storage bucket.
- Resource monitoring with dashboard visualization.
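
To make the experiment setups above concrete, here is a minimal, toolbox-independent sketch of how grid-search and random-search configurations can be enumerated and then crossed with random seeds. The function names (`grid_configs`, `random_configs`, `with_seeds`) and the search space are hypothetical and are not part of the `mle-toolbox` API; the toolbox handles this bookkeeping for you.

```python
import itertools
import random


def grid_configs(space):
    """Return every combination of the candidate values (grid search)."""
    keys = sorted(space)
    value_lists = [space[k] for k in keys]
    return [dict(zip(keys, values)) for values in itertools.product(*value_lists)]


def random_configs(space, num_samples, seed=0):
    """Sample each hyperparameter uniformly from its candidates (random search)."""
    rng = random.Random(seed)
    keys = sorted(space)
    return [{k: rng.choice(space[k]) for k in keys} for _ in range(num_samples)]


def with_seeds(configs, seeds):
    """Cross every configuration with every random seed (multi-seed runs)."""
    return [dict(cfg, seed=s) for cfg in configs for s in seeds]


# Hypothetical search space used only for illustration.
space = {"lrate": [1e-3, 1e-2, 1e-1], "batch_size": [32, 64]}
grid = grid_configs(space)
jobs = with_seeds(grid, seeds=[0, 1])
print(len(grid), len(jobs))  # 6 12
```

Each entry in `jobs` corresponds to one training run that could be scheduled independently, which is why these setups parallelize well across cluster or cloud resources.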

The 4 Step `mle-toolbox` Cooking Recipe 🍲

- Follow the instructions below to install the `mle-toolbox` and set up your credentials/configuration.
- Learn more about the individual infrastructure subpackages with the dedicated tutorial.
- Read the docs explaining the pillars of the toolbox and the experiment meta-configuration job `.yaml` files.
- Check out the example workflows 📄 to get started.
- Run your own experiment using the template files/project and `mle run`.

Installation ⏳

Install the toolbox wherever you plan to launch experiments: on your local machine, or on the cluster resource (Slurm/SGE) you will use. A PyPI installation is available via:

pip install mle-toolbox

If you want to get the most recent commit, please install directly from the repository:

pip install git+https://github.com/mle-infrastructure/mle-toolbox.git@main
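
After either install command, you may want to confirm which version ended up in your environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical, and the only thing taken from the toolbox is the distribution name `mle-toolbox` used in the `pip` commands above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("mle-toolbox")
print("mle-toolbox version:", version if version else "not installed")
```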

The Core Toolbox Subcommands 🌱

You are now ready to dive deeper into the specifics of experiment configuration and can start running your first experiments from the cluster (or locally on your machine) with the following commands:

| | Command | Description |
| --- | --- | --- |
| 🚀 | `mle run` | Start up an experiment (multi-config/seeds, search). |
| 🖥️ | `mle monitor` | Monitor resource utilisation (`mle-monitor` wrapper). |
| 📥 | `mle retrieve` | Retrieve experiment result from GCS/cluster. |
| 💌 | `mle report` | Create an experiment report with figures. |
| ⏳ | `mle init` | Setup of credentials & toolbox settings. |
| 🔄 | `mle sync` | Extract all GCS-stored results to your local drive. |
| 🗂 | `mle project` | Initialize a new project by cloning `mle-project`. |
| 📝 | `mle protocol` | List a summary of the most recent experiments. |

Each subcommand is documented in more detail in the toolbox docs.
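
If you want to trigger these subcommands from a Python script (for example, from a scheduler or a notebook), one option is a thin `subprocess` wrapper. The sketch below is an illustration, not part of the toolbox: the helper names are hypothetical, `experiment.yaml` is a placeholder file name, and passing the configuration path as a positional argument to `mle run` is an assumption you should check against the subcommand documentation.

```python
import shutil
import subprocess

# Subcommand names copied from the table above.
SUBCOMMANDS = ("run", "monitor", "retrieve", "report", "init", "sync", "project", "protocol")


def build_mle_command(subcommand, *args):
    """Assemble the argument list for an `mle` call without executing it."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"Unknown mle subcommand: {subcommand!r}")
    return ["mle", subcommand, *args]


def call_mle(subcommand, *args):
    """Run the command if `mle` is on PATH; return its exit code, else None."""
    if shutil.which("mle") is None:
        return None
    return subprocess.run(build_mle_command(subcommand, *args), check=False).returncode


# Only builds the command; nothing is executed here.
print(build_mle_command("run", "experiment.yaml"))
```

Separating command construction from execution keeps the wrapper easy to test on machines where the toolbox is not installed.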

Examples 📄 & Notebook Walkthroughs 📓

| Example | Job Types | Description |
| --- | --- | --- |
| 📄 Single-Objective | `multi-configs`, `hyperparameter-search` | Core experiment types. |
| 📄 Multi-Objective | `hyperparameter-search` | Multi-objective tuning. |
| 📄 Multi Bash | `multi-configs` | Bash-based jobs. |
| 📄 Quadratic PBT | `hyperparameter-search` | PBT on toy quadratic surrogate. |
| 📄 Hyperband | `hyperparameter-search` | Hyperband on toy polynomial problem. |
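
For readers unfamiliar with the idea behind the Hyperband example, the sketch below shows successive halving, the subroutine that Hyperband repeats with different starting budgets. It is not full Hyperband and it is not the toolbox's example code: `toy_loss`, the candidate values, and the budget schedule are all hypothetical stand-ins chosen for illustration.

```python
def toy_loss(x, budget):
    """Hypothetical objective: a polynomial loss plus a penalty that shrinks with budget."""
    return (x - 0.3) ** 2 + 1.0 / budget


def successive_halving(candidates, min_budget=1, eta=2):
    """Keep the best 1/eta of the candidates each round while growing the budget by eta."""
    survivors = list(candidates)
    budget = min_budget
    history = []  # (budget used, number of survivors after the round)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda x: toy_loss(x, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        history.append((budget, len(survivors)))
        budget *= eta
    return survivors[0], history


# Eight evenly spaced candidate values in [0, 1).
best, history = successive_halving([i / 8 for i in range(8)])
print(best, history)
```

Most of the compute goes to the few candidates that survive the early, cheap rounds, which is the main appeal of this family of methods.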

| Notebook | Description |
| --- | --- |
| 📓 Getting Started | Get started with the toolbox. |
| 📓 Subpackages | Get started with the toolbox subpackages. |
| 📓 `MLExperiment` | Introduction to `MLExperiment` wrapper. |
| 📓 Evaluation | Evaluation of gridsearch results. |
| 📓 GIF Animations | Walk through a set of animation helpers. |
| 📓 Testing | Perform hypothesis tests on logs. |

Acknowledgements & Citing the MLE-Infrastructure ✏️

If you use parts of the `mle-toolbox` in your research, please cite it as follows:

@software{mle_infrastructure2021github,
  author = {Robert Tjarko Lange},
  title = {{MLE-Infrastructure}: A Set of Lightweight Tools for Distributed Machine Learning Experimentation},
  url = {http://github.com/mle-infrastructure},
  year = {2021},
}

Development 👷

You can run the test suite via `python -m pytest -vv tests/`. If you find a bug or are missing your favourite feature, feel free to create an issue and/or start contributing 🤗.
