Sequential Halving and Classification
AlgorithmPySHAC is a python library to use the Sequential Halving and Classification algorithm from the paper Parallel Architecture and Hyperparameter Search via Successive Halving and Classification with ease.
Note : This library is not affiliated with Google.
Stable build documentation can be found at PySHAC Documentation.
It contains a User Guide, as well as explanation of the different engines that can be used with PySHAC.
Topic | Link |
---|---|
Installation | http://titu1994.github.io/pyshac/install/ |
User Guide | http://titu1994.github.io/pyshac/guide/ |
Managed Engines | http://titu1994.github.io/pyshac/managed/ |
Custom Hyper Parameters | http://titu1994.github.io/pyshac/custom-hyper-parameters/ |
Serial Evaluation | http://titu1994.github.io/pyshac/serial-execution/ |
External Dataset Training | http://titu1994.github.io/pyshac/external-dataset-training/ |
Callbacks | http://titu1994.github.io/pyshac/callbacks/ |
This library is available for Python 2.7 and 3.4+ via pip for Windows, MacOSX and Linux.
pip install pyshac
To install the master branch of this library :
git clone https://github.com/titu1994/pyshac.git
cd pyshac
pip install .
or pip install .[tests] # to also include dependencies necessary for testing
To install the requirements before installing the library :
pip install -r "requirements.txt"
To build the docs, additional packages must be installed :
pip install -r "doc_requirements.txt"
There are also 3 additional hyper parameters, which are useful when a parameter needs to be sampled multiple times for each evaluation :
These multi parameters have an additional argument sample_count
which can be used to sample multiple times
per step.
Note: The values will be concatenated linearly, so each multi parameter will have a list of values
returned in the resultant OrderedDict. If you wish to flatten the entire search space, you can
use pyshac.flatten_parameters
on this OrderedDict.
import pyshac
# Discrete parameters
dice_rolls = pyshac.DiscreteHyperParameter('dice', values=[1, 2, 3, 4, 5, 6])
coin_flip = pyshac.DiscreteHyperParameter('coin', values=[0, 1])
# Continuous Parameters
classifier_threshold = pyshac.UniformContinuousHyperParameter('threshold', min_value=0.0, max_value=1.0)
noise = pyshac.NormalContinuousHyperParameter('noise', mean=0.0, std=1.0)
When setting up the SHAC engine, we need to define a few important parameters which will be used by the engine :
max
or min
. Defines whether the objective should be maximised or minimised.
import numpy as np
import pyshac
# define the parameters
param_x = pyshac.UniformContinuousHyperParameter('x', -5.0, 5.0)
param_y = pyshac.UniformContinuousHyperParameter('y', -2.0, 2.0)
parameters = [param_x, param_y]
# define the total budget as 100 evaluations
total_budget = 100 # 100 evaluations at maximum
# define the number of batches
num_batches = 10 # 10 samples per batch
# define the objective
objective = 'min' # minimize the squared loss
shac = pyshac.SHAC(parameters, total_budget, num_batches, objective)
To train a classifier, the user must define an Evaluation function. This is a user defined function, that accepts 2 or more inputs as defined by the engine, and returns a python floating point value.
The Evaluation Function receives at least 2 inputs :
list(parameters.values())
can be used to get the list of values in the same order as when the Parameters were declared to the engine.An example of a defined evaluation function :
# define the evaluation function
def squared_error_loss(id, parameters):
x = parameters['x']
y = parameters['y']
y_sample = 2 * x - y
# assume best values of x and y and 2 and 0 respectively
y_true = 4.
return np.square(y_sample - y_true)
A single call to shac.fit()
will begin training the classifiers.
There are a few cases to consider:
In these cases, we can utilize a few additional parameters to allow the training behaviour to better adapt to these circumstances. These parameters are :
# `early stopping` default is False, and it is preferred not to use it when using `relax checks`
shac.fit(squared_error_loss, skip_cv_checks=True, early_stop=False, relax_checks=True)
Once the models have been trained by the engine, it is as simple as calling predict()
to sample multiple samples or batches of parameters.
Samples can be obtained in a per instance or per batch (or even a combination) using the two parameters - num_samples
and num_batches
.
# sample a single instance of hyper parameters
parameter_samples = shac.predict() # Gets 1 sample.
# sample multiple instances of hyper parameters
parameter_samples = shac.predict(10) # Gets 10 samples.
# sample a batch of hyper parameters
parameter_samples = shac.predict(num_batches=5) # samples 5 batches, each containing 10 samples.
# sample multiple batches and a few additional instances of hyper parameters
parameter_samples = shac.predict(5, 5) # samples 5 batches (each containing 10 samples) and an additional 5 samples.
Examples based on the Branin
and Hartmann6
problems can be found in the Examples folder.
An example of how to use the TensorflowSHAC
engine is provided in the example foldes as well.
Comparison scripts of basic optimization, Branin
and Hartmann6
using Tensorflow Eager 1.8 are provided in the respective folders.
Brannin to close to the true optima as described in the paper.
Hartmann 6 was a much harder dataset, and results are worse than Random Search 2x and the one from the paper. Perhaps it was due to a bad run, and may be fixed with larger budget for training.
The task is to sample two parameters x
and y
, such that z = 2 * x - y
and we want z
to approach the value of 4. We utilize MSE
as the metric between z and the optimal value.
The task is to sample hyper parameters which provide high accuracy values using TensorflowSHAC engine.