keras-team / keras-tuner

A Hyperparameter Tuning Library for Keras
https://keras.io/keras_tuner/
Apache License 2.0

Epochs argument in Hyperband search method #212

Open vb690 opened 4 years ago

vb690 commented 4 years ago

Is the epochs argument in the search() method redundant for Hyperband?

From what I understood, the algorithm should "automatically" allocate the number of epochs during the tuning process according to max_epochs.

From the documentation:

from kerastuner.applications import HyperResNet
from kerastuner.tuners import Hyperband

hypermodel = HyperResNet(input_shape=(128, 128, 3), num_classes=10)

tuner = Hyperband(
    hypermodel,
    objective='val_accuracy',
    max_trials=40,  # I think max_trials is not a valid argument for Hyperband
    directory='my_dir',
    project_name='helloworld')

tuner.search(x, y,
             epochs=20,  # what is the expected behavior here when max_epochs != epochs?
             validation_data=(val_x, val_y))
omalleyt12 commented 4 years ago

@vb690 Thanks for the issue!

Yeah, it does seem redundant. We should probably infer this if it's not provided.
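
If that inference lands, the call in the original example could simply omit epochs. A minimal sketch of the behavior being discussed here (an assumption based on this thread, not documented API at the time):

# Assumed behavior: when `epochs` is omitted, Hyperband allocates per-trial
# epochs itself, derived from the `max_epochs` set on the tuner.
tuner.search(x, y,
             validation_data=(val_x, val_y))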

tkmamidi commented 4 years ago

I'm still having the same issue. Is this resolved, or is there any plan to resolve it in the near future? I am unable to stop the number of trials in Hyperband.

vb690 commented 4 years ago

@tkmamidi could you please expand a bit more on "I am unable to stop the number of trials in Hyperband"? What exactly do you mean?

I ask because the issue I raised should simply be a case of parameter redundancy, and it should not create any real problem (at least this was my experience).

tkmamidi commented 4 years ago

Thanks for the comment. So, I'm using Hyperband in keras-tuner for tuning my model. I'm using the command below:

tuner = Hyperband(
    tune_layers_model,
    objective='val_accuracy',
    max_epochs=2,
    hyperband_iterations=1,
    factor=2,
    # max_trials=3,
    executions_per_trial=1,
    distribution_strategy=tf.distribute.MirroredStrategy(),
    directory='test_dir',
    project_name='b_tune_nn'
)

When I use max_trials, it throws an error that it can't recognize the argument/parameter. How do I control how many trials I'm running? Please help!

vb690 commented 4 years ago

Hi @tkmamidi,

taking your points in order:

  1. I briefly looked at the current version of the Keras Tuner documentation, and max_trials is no longer a valid argument for Hyperband (rightfully so, in my opinion).

  2. If you think about it and look at the details of the algorithm, this makes perfect sense: Hyperband computes the maximum number of trials from the resources you allocate for training, which in this case are given by max_epochs and hyperband_iterations.

  3. This means you don't have direct control over the number of trials the algorithm will run; or rather, you can compute that value from max_epochs and hyperband_iterations (again, see the resource linked above).

  4. As they suggest in the tutorial, you want to set max_epochs to the number of epochs you expect your model will need to converge, while also passing a callback for early termination of training (EarlyStopping, for instance); see the sketch after this list. Since Hyperband is based on sampling random configurations of hyper-parameters and iteratively "breeding" the most promising ones, you don't want to waste the computational budget allocated at any point of the optimization process (i.e. the trials) on configurations which are not optimal or which reached convergence early on.

  5. I have the feeling that executions_per_trial is also not a valid argument (I can't find it even in the Tuner class).

  6. I believe you want to increase your value of max_epochs; at the moment you are sampling from an extremely small pool of random configurations.
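
A minimal sketch of points 2-4, assuming a recent keras-tuner release (imported as keras_tuner) and using the bracket math from the Hyperband paper as a rough estimate, not necessarily keras-tuner's exact internals:

import math

import tensorflow as tf
import keras_tuner as kt  # recent package name; older releases used `kerastuner`

max_epochs = 2   # R in the Hyperband paper: maximum resource per configuration
factor = 2       # eta: downsampling rate between successive-halving rounds

# Rough estimate of how many configurations one hyperband_iteration samples
# (per the Hyperband paper; keras-tuner's bookkeeping may differ slightly).
s_max = math.floor(math.log(max_epochs, factor))
n_configs = sum(
    math.ceil((s_max + 1) / (s + 1) * factor**s) for s in range(s_max + 1)
)
print(f"~{n_configs} configurations per hyperband_iteration")  # ~4 with the values above

tuner = kt.Hyperband(
    tune_layers_model,            # hypermodel-building function from the comment above
    objective='val_accuracy',
    max_epochs=30,                # set this to the epochs you expect convergence to need
    factor=3,
    hyperband_iterations=1,
    directory='test_dir',
    project_name='b_tune_nn',
)

# Early termination so unpromising configurations don't eat the epoch budget.
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3)
# tuner.search(x, y, validation_data=(val_x, val_y), callbacks=[stop_early])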

Pinging @omalleyt12 for double-checking what I've written.

fPkX6F1nGTX commented 2 years ago

I have the feeling that executions_per_trial is also not a valid argument (I can't find it even in the Tuner class).

I am currently using Hyperband with two executions per trial, so it is a valid argument, @vb690. Its purpose is to allow for averaging out some of the uncontrollable pseudo-randomness from, say, NVIDIA's GPU drivers, which have non-deterministic behavior.
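
For reference, a minimal sketch of passing executions_per_trial (accepted by Tuner-based tuners such as Hyperband in current keras-tuner releases; exact version support is an assumption, and the directory/project names are illustrative):

import keras_tuner as kt

tuner = kt.Hyperband(
    tune_layers_model,        # same hypermodel as above
    objective='val_accuracy',
    max_epochs=30,
    executions_per_trial=2,   # train each sampled configuration twice and average
                              # the objective to smooth out run-to-run noise
    directory='test_dir',
    project_name='b_tune_nn_avg',
)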