JADBio / SHSR

Repository for the "A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML" paper
Creative Commons Zero v1.0 Universal

A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML

This repository contains all data and code required to reproduce the results of the submission titled "A Meta-Level Learning Algorithm for Sequential Hyper-Parameter Space Reduction in AutoML".

Data Description

Note on datasets and analyses: The algorithm in the paper takes as input performance and execution times of past runs (i.e., ML_results_{classification,regression}.csv). Providing all datasets and code to analyze them is out of scope.

All required data to produce the results for the paper are in data/data.zip. A list of files along with a description follows.

For the sake of convenience, all intermediate results produced by the scripts in this project are also provided in results/results.zip.

Note on Regression Problems

To increase the number of regression problems, classification problems were obtained from BioDataome and turned into regression problems as follows:

  1. JADBio was executed on each classification problem with default parameters and feature selection enforced, to find the most predictive features.
  2. The first returned feature was used as the outcome (all datasets contain only continuous variables), while all remaining ones were used as predictors.

These datasets can be obtained from BioDataome by selecting all regression datasets listed in dataset_sources.csv.
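The outcome/predictor split in step 2 can be sketched as follows. This is only an illustration, not the code used for the paper: the function name `to_regression_problem`, the column-oriented `dict` input, and the optional `class_column` argument are all hypothetical. The sketch also assumes that "all remaining ones" means every other column of the dataset, and that the original class label is excluded from the predictors. Neither assumption is confirmed by the description above.

```python
def to_regression_problem(columns, ranked_features, class_column=None):
    """Split a column-oriented table into (outcome_name, y, X).

    columns: dict mapping column name -> list of numeric values
    ranked_features: feature names in the order returned by feature selection
    class_column: optional name of the original class label to exclude
                  (an assumption of this sketch, not stated in the README)
    """
    if not ranked_features:
        raise ValueError("feature selection returned no features")
    outcome = ranked_features[0]  # first returned feature becomes the outcome
    if outcome not in columns:
        raise KeyError(f"unknown feature: {outcome}")
    y = list(columns[outcome])
    # Every other column (except the old class label) becomes a predictor.
    X = {
        name: list(values)
        for name, values in columns.items()
        if name != outcome and name != class_column
    }
    return outcome, y, X


# Toy data with made-up column names, for illustration only.
table = {"g1": [0.1, 0.4], "g2": [1.2, 0.9], "g3": [3.3, 2.8], "label": [0, 1]}
outcome, y, X = to_regression_problem(table, ["g2", "g3"], class_column="label")
```

Here `g2` becomes the regression outcome, and `g1` and `g3` remain as predictors.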

Instructions

Note on requirements.txt: The code has been tested on the package versions in requirements.txt and might not run with other versions. We recommend using virtual environments to install dependencies.

Produce results required for plots

First, unzip data/data.zip and add the extracted files to the data folder. Next, run the following scripts to produce all results required for the plots:

All results are stored in the results folder. Alternatively, this step can be skipped by unzipping results/results.zip and adding the extracted files to the results folder.
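The unzip steps can also be scripted with the Python standard library. The following is a minimal sketch: the helper name `extract_archive` is hypothetical, and it assumes the archives store their files at the top level, so that extracting into the data or results folder puts them where the scripts expect them.

```python
import zipfile
from pathlib import Path


def extract_archive(archive, destination):
    """Extract every member of a zip archive into destination.

    Returns the list of member names stored in the archive.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        return zf.namelist()


if __name__ == "__main__":
    # Archive and folder names are the ones given in this README.
    for archive, folder in [("data/data.zip", "data"),
                            ("results/results.zip", "results")]:
        if Path(archive).exists():
            names = extract_archive(archive, folder)
            print(f"{archive}: extracted {len(names)} entries into {folder}/")
        else:
            print(f"{archive} not found, skipping")
```

If an archive turns out to wrap its files in a top-level directory, the extracted files will land one level deeper and need to be moved into the target folder by hand.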

Produce plots

Run the plots.py script to produce all plots of the paper. The plots are stored in the plots folder.