@ogreyesp Thanks for the issue!
This comment was updated by @haifeng-jin because it was out of date. The following is the latest recommended way of doing it:
This is bare-bones code for tuning the batch size. The `*args` and `**kwargs` are the ones you passed from `tuner.search()`.
```python
class MyHyperModel(kt.HyperModel):
    def build(self, hp):
        model = keras.Sequential()
        model.add(layers.Flatten())
        model.add(
            layers.Dense(
                units=hp.Int("units", min_value=32, max_value=512, step=32),
                activation="relu",
            )
        )
        model.add(layers.Dense(10, activation="softmax"))
        model.compile(
            optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"],
        )
        return model

    def fit(self, hp, model, *args, **kwargs):
        return model.fit(
            *args,
            batch_size=hp.Choice("batch_size", [16, 32]),
            **kwargs,
        )


tuner = kt.RandomSearch(
    MyHyperModel(),
    objective="val_accuracy",
    max_trials=3,
    overwrite=True,
    directory="my_dir",
    project_name="tune_hypermodel",
)
```
For epochs specifically, I'd alternatively recommend using early stopping during training by passing in the `tf.keras.callbacks.EarlyStopping` callback, if it's applicable to your use case. It can be configured to stop training as soon as the validation loss stops improving. You can pass Keras callbacks like this to `search()`:

```python
# Will stop training if the "val_loss" hasn't improved in 3 epochs.
tuner.search(x, y, epochs=30, callbacks=[tf.keras.callbacks.EarlyStopping('val_loss', patience=3)])
```
For n-fold cross validation, you can also just do it in `HyperModel.fit()` and return the result as a dictionary like `{"val_accuracy": 0.3}`, where the key is the name of the `objective`. Please follow this guide for more details.
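As an illustration, here is a minimal sketch of that idea: running k-fold cross validation inside `HyperModel.fit()` and returning the averaged objective as a dictionary. It assumes the `MyHyperModel` defined above, scikit-learn's `KFold`, and NumPy arrays passed as `x` and `y` to `tuner.search(x, y)`; the subclass name `CVHyperModel` is just for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold


class CVHyperModel(MyHyperModel):
    def fit(self, hp, model, x, y, **kwargs):
        # Evaluate this trial's hyperparameters with 5-fold cross validation.
        val_accuracies = []
        for train_idx, val_idx in KFold(n_splits=5, shuffle=True).split(x):
            # Build a fresh model for every fold so folds don't share weights.
            fold_model = self.build(hp)
            fold_model.fit(
                x[train_idx],
                y[train_idx],
                batch_size=hp.Choice("batch_size", [16, 32]),
                **kwargs,
            )
            _, accuracy = fold_model.evaluate(x[val_idx], y[val_idx], verbose=0)
            val_accuracies.append(accuracy)
        # The key must match the tuner's `objective` ("val_accuracy" here).
        return {"val_accuracy": float(np.mean(val_accuracies))}
```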
Thanks @omalleyt12.
Your response is very helpful.
This project is very important and useful for me. However, the lack of documentation and tutorials is hampering its use.
For example, how can I determine the best subset of hyperparameters by conducting cross-validation?
This comment was updated by @haifeng-jin because it was out of date. Please use the code snippets above instead.
Please see pending PR here with a tutorial: https://github.com/keras-team/keras-tuner/pull/136
Is it possible to do tuning without creating a class?
Thanks for the explanation on batch size. However, when I retrieve the parameters of the best model via `tuner.get_best_hyperparameters()[0]` and look at the values through `.get_config()["values"]`, the `batch_size` is not listed there. How can I retrieve the `batch_size` hyperparameter when doing the search in the way described here?
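For what it's worth, with the updated `HyperModel.fit()` approach at the top of this thread, the `batch_size` choice is registered on each trial's hyperparameters, so a sketch like the following (assuming the search has already finished) should surface it:

```python
# Sketch: inspect the tuned batch size once tuner.search(...) has finished.
best_hps = tuner.get_best_hyperparameters()[0]

print(best_hps.values)             # all recorded values, including "batch_size"
print(best_hps.get("batch_size"))  # just the batch size
```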
@omalleyt12 @VincBar Was this issue resolved? I'm using KerasTuner for epochs and batch_size right now, too, and I'm not very keen on having invisible results after 10 hrs of running.
@tolandwehr Hey, I don't know if the direct way is solved, but I worked around it by including the batch-size hyperparameter in the hypermodel, saving it to `self.batch_size` (or, in my case, actually a dictionary with some other stuff), and defining a `fit` function in my hypermodel that then uses it (and whatever else the fit might need).
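A rough sketch of what that workaround could look like (hypothetical names, not the original code, and assuming a recent `keras_tuner` where `HyperModel.fit()` is available):

```python
import keras_tuner as kt
from tensorflow import keras
from tensorflow.keras import layers


class BatchSizeHyperModel(kt.HyperModel):  # hypothetical name
    def build(self, hp):
        # Register the batch size here and keep it on the instance for fit().
        self.batch_size = hp.Int("batch_size", 32, 256, step=32)
        model = keras.Sequential([
            layers.Dense(hp.Int("units", 32, 128, step=32), activation="relu"),
            layers.Dense(1),
        ])
        model.compile(optimizer="adam", loss="mse")
        return model

    def fit(self, hp, model, *args, **kwargs):
        # Use the batch size that was chosen when the model was built.
        return model.fit(*args, batch_size=self.batch_size, **kwargs)
```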
@VincBar Sounds interesting. Could you share the code, if it's still available? ^^'
@omalleyt12 Another issue: I got a NaN/Inf error after some hours of iterations, which is strange, because I double-checked the dataset with `.isnull().sum().sum()` and there were no NaNs:
```
ValueError Traceback (most recent call last)
<ipython-input-666-7713a18234fe> in <module>
----> 1 tuner.search(X_train, y_train, epochs=40, validation_split=0.1, callbacks=[tf.keras.callbacks.EarlyStopping('val_loss', patience=3)])
~\Anaconda3\envs\Tensorflow\lib\site-packages\kerastuner\engine\base_tuner.py in search(self, *fit_args, **fit_kwargs)
118 self.on_search_begin()
119 while True:
--> 120 trial = self.oracle.create_trial(self.tuner_id)
121 if trial.status == trial_module.TrialStatus.STOPPED:
122 # Oracle triggered exit.
~\Anaconda3\envs\Tensorflow\lib\site-packages\kerastuner\engine\oracle.py in create_trial(self, tuner_id)
147 values = None
148 else:
--> 149 response = self._populate_space(trial_id)
150 status = response['status']
151 values = response['values'] if 'values' in response else None
~\Anaconda3\envs\Tensorflow\lib\site-packages\kerastuner\tuners\bayesian.py in _populate_space(self, trial_id)
101 x, y = self._vectorize_trials()
102 try:
--> 103 self.gpr.fit(x, y)
104 except exceptions.ConvergenceWarning:
105 # If convergence of the GPR fails, create a random trial.
~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\gaussian_process\_gpr.py in fit(self, X, y)
232 optima = [(self._constrained_optimization(obj_func,
233 self.kernel_.theta,
--> 234 self.kernel_.bounds))]
235
236 # Additional runs are performed from log-uniform chosen initial
~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\gaussian_process\_gpr.py in _constrained_optimization(self, obj_func, initial_theta, bounds)
501 opt_res = scipy.optimize.minimize(
502 obj_func, initial_theta, method="L-BFGS-B", jac=True,
--> 503 bounds=bounds)
504 _check_optimize_result("lbfgs", opt_res)
505 theta_opt, func_min = opt_res.x, opt_res.fun
~\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\_minimize.py in minimize(fun, x0, args, method, jac, hess, hessp, bounds, constraints, tol, callback, options)
608 elif meth == 'l-bfgs-b':
609 return _minimize_lbfgsb(fun, x0, args, jac, bounds,
--> 610 callback=callback, **options)
611 elif meth == 'tnc':
612 return _minimize_tnc(fun, x0, args, jac, bounds, callback=callback,
~\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\lbfgsb.py in _minimize_lbfgsb(fun, x0, args, jac, bounds, disp, maxcor, ftol, gtol, eps, maxfun, maxiter, iprint, callback, maxls, **unknown_options)
343 # until the completion of the current minimization iteration.
344 # Overwrite f and g:
--> 345 f, g = func_and_grad(x)
346 elif task_str.startswith(b'NEW_X'):
347 # new iteration
~\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\lbfgsb.py in func_and_grad(x)
293 else:
294 def func_and_grad(x):
--> 295 f = fun(x, *args)
296 g = jac(x, *args)
297 return f, g
~\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\optimize.py in function_wrapper(*wrapper_args)
325 def function_wrapper(*wrapper_args):
326 ncalls[0] += 1
--> 327 return function(*(wrapper_args + args))
328
329 return ncalls, function_wrapper
~\AppData\Roaming\Python\Python36\site-packages\scipy\optimize\optimize.py in __call__(self, x, *args)
63 def __call__(self, x, *args):
64 self.x = numpy.asarray(x).copy()
---> 65 fg = self.fun(x, *args)
66 self.jac = fg[1]
67 return fg[0]
~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\gaussian_process\_gpr.py in obj_func(theta, eval_gradient)
223 if eval_gradient:
224 lml, grad = self.log_marginal_likelihood(
--> 225 theta, eval_gradient=True, clone_kernel=False)
226 return -lml, -grad
227 else:
~\Anaconda3\envs\Tensorflow\lib\site-packages\sklearn\gaussian_process\_gpr.py in log_marginal_likelihood(self, theta, eval_gradient, clone_kernel)
474 y_train = y_train[:, np.newaxis]
475
--> 476 alpha = cho_solve((L, True), y_train) # Line 3
477
478 # Compute log-likelihood (compare line 7)
~\AppData\Roaming\Python\Python36\site-packages\scipy\linalg\decomp_cholesky.py in cho_solve(c_and_lower, b, overwrite_b, check_finite)
194 (c, lower) = c_and_lower
195 if check_finite:
--> 196 b1 = asarray_chkfinite(b)
197 c = asarray_chkfinite(c)
198 else:
~\Anaconda3\envs\Tensorflow\lib\site-packages\numpy\lib\function_base.py in asarray_chkfinite(a, dtype, order)
497 if a.dtype.char in typecodes['AllFloat'] and not np.isfinite(a).all():
498 raise ValueError(
--> 499 "array must not contain infs or NaNs")
500 return a
501
ValueError: array must not contain infs or NaNs
```
I would like to use Bayesian optimization tuner to tune epochs and batch size for a BLSTM model. My data is passed in using a custom data generator, which takes batch size as input. How do I use the Keras tuner in this case?
@ogreyesp Thanks for the issue!
This can be done by subclassing the `Tuner` class you are using and overriding `run_trial`. (Note that `Hyperband` sets the epochs to train for via its own logic, so if you're using `Hyperband` you shouldn't tune the epochs.) Here's an example with `kt.tuners.BayesianOptimization`:

```python
class MyTuner(kerastuner.tuners.BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # You can add additional HyperParameters for preprocessing and custom training loops
        # via overriding `run_trial`
        kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 32, 256, step=32)
        kwargs['epochs'] = trial.hyperparameters.Int('epochs', 10, 30)
        super(MyTuner, self).run_trial(trial, *args, **kwargs)


# Uses same arguments as the BayesianOptimization Tuner.
tuner = MyTuner(...)
# Don't pass epochs or batch_size here, let the Tuner tune them.
tuner.search(...)
```
For epochs specifically, I'd alternatively recommend using early stopping during training by passing in the `tf.keras.callbacks.EarlyStopping` callback, if it's applicable to your use case. It can be configured to stop training as soon as the validation loss stops improving. You can pass Keras callbacks like this to `search()`:

```python
# Will stop training if the "val_loss" hasn't improved in 3 epochs.
tuner.search(x, y, epochs=30, callbacks=[tf.keras.callbacks.EarlyStopping('val_loss', patience=3)])
```
Hello @ogreyesp, I have implemented this in the Hyperband Keras tuner. I have a doubt: why is `batch_size` not included in the first trial, but only from the second trial onwards? Is there any way to include `batch_size` in the first trial itself? Please let me know.
I used the following code to optimise the number of epochs and batch size:
```python
class MyTuner(kerastuner.tuners.BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # You can add additional HyperParameters for preprocessing and custom training loops
        # via overriding `run_trial`
        kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 32, 256, step=32)
        kwargs['epochs'] = trial.hyperparameters.Int('epochs', 10, 30)
        super(MyTuner, self).run_trial(trial, *args, **kwargs)
```
Now I want to save the number of epochs and batch size for the best trial that the tuner found.
I tried using the following code suggested by @fredshu, but I could not get it working:
`values['batch_size'] = best_trial.batch_size`
How is `best_trial` defined? I use `best_model = tuner.get_best_models()[0]` to get the best model for making predictions afterwards; if I replace `best_trial` with `best_model`, it does not work.
I used `with redirect_stdout(f): tuner.results_summary()` to save the full summary to a text file, but now I only want the number of epochs and batch size of the best trial.
So how do I save the number of epochs and batch size of the best trial to separate variables? If possible, I would also like to save the other optimised hyperparameters.
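One way to do this, sketched under the assumption that `batch_size` and `epochs` were registered on the trial's hyperparameters as in the `run_trial` override above, is to read them from the best trial rather than from the best model:

```python
# Sketch: pull the tuned values out of the best trial (not the best model).
best_trial = tuner.oracle.get_best_trials(num_trials=1)[0]

best_batch_size = best_trial.hyperparameters.get('batch_size')
best_epochs = best_trial.hyperparameters.get('epochs')

# All optimised hyperparameters of the best trial as a plain dict.
all_values = best_trial.hyperparameters.values
print(best_batch_size, best_epochs, all_values)
```

If `batch_size` or `epochs` is missing from `all_values`, it most likely was never registered on that trial's hyperparameters in that run.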
I am new to Keras and TensorFlow. I want to simultaneously tune the number of epochs and use cross-validation (CV) for my project. Can you please help me write the custom Tuner?
@saranyaprakash2012 Did you manage to use a Keras training generator with a Keras Tuner that tunes the batch_size?
Can anyone give a code snippet that does that?
The above example @omalleyt12 gave didn't change the actual batch size that the training generator (`ImageDataGenerator`) took.
I mean that the Keras Tuner log printed the batch size as if it was taken into consideration, but the training log also showed that the training generator ignored the Keras Tuner `batch_size` and just took a predefined value...
Example: the actual batch size was 128 on a debug dataset of ~150 samples, so we had 2 batches:

```
2/2 [==============================]
```

but in the hyperparameters of the tuner it showed:

```
Hyperparameter |Value |Best Value So Far
learning_rate |0.5 |0.5
decay |0.01 |0.01
momentum |0 |0
batch_size |2 |1
```

(I only had 1 and 2 as the batch size options inside the tuner.)
Try something like this:

```python
def create_hypermodel(hp):
    learning_rate = 0.0001
    K.clear_session()
    inputs_pose_gaze = Input(POSE_GAZE_INPUT_SHAPE)
    blstm1_pose_gaze = Bidirectional(LSTM(200, return_sequences=True, recurrent_dropout=0, activation='tanh'))(inputs_pose_gaze)
    max_pooled_poze = GlobalMaxPooling1D()(blstm1_pose_gaze)
    output = Dense(1, activation='sigmoid')(max_pooled_poze)
    model = Model(inputs=[inputs_pose_gaze], outputs=output)
    model.compile(optimizer=optimizers.Adam(learning_rate),
                  loss=losses.BinaryCrossentropy(),
                  )
    print(model.summary())
    return model


class MyTuner2(BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # via overriding `run_trial`
        hp = trial.hyperparameters
        kwargs['batch_size'] = hp.Int('batch_size', 4, 16, step=4)
        # kwargs['val_batch_size'] = hp.Int('val_batch_size', 1, 4, step=1)
        kwargs['epochs'] = hp.Int('epochs', 10, 25, step=5)
        train_data_gen = video_batch_generator(train_upsample_files, hp.Int('batch_size', 4, 16, step=4))
        print(f"batch_size:{hp.Int('batch_size', 4, 16, step=4)}")
        val_data_gen = video_batch_generator(val_files, hp.Int('val_batch_size', 1, 4, step=1))
        steps_per_epoch = math.floor(len(train_upsample_files) / hp.Int('batch_size', 4, 16, step=4))
        val_steps_per_epoch = math.floor(len(val_files) / VAL_BATCH_SIZE)
        early_stopping = tf.keras.callbacks.EarlyStopping(
            monitor='auc',
            verbose=1,
            patience=2,
            mode='max',
            restore_best_weights=True)
        model = self.hypermodel.build(hp)
        model.fit(train_data_gen, steps_per_epoch=steps_per_epoch, epochs=hp.Int('epochs', 5, 20, step=5), callbacks=[early_stopping])
        val_metrics = model.evaluate(val_data_gen, steps=val_steps_per_epoch, return_dict=True)
        print(f"Evaluation val_metrics :{val_metrics}")
        self.oracle.update_trial(
            trial.trial_id, {'val_auc': val_metrics['auc']})
        self.save_model(trial.trial_id, model)


# Uses same arguments as the BayesianOptimization Tuner.
tuner = MyTuner2(create_hypermodel,
                 objective=Objective("val_auc", direction="max"),
                 max_trials=6,
                 executions_per_trial=1,
                 directory=os.path.normpath('keras_tuning_blstm_video'),
                 project_name='kerastuner_bayesian_lstm_video', overwrite=True)

# Don't pass epochs or batch_size here, let the Tuner tune them.
tuner.search_space_summary()
tuner.search()
model_best_model_epoch_batch_size = tuner.get_best_models(num_models=1)
# model_tuned = model_best_model_epoch_batch_size[0]
print(tuner.get_best_hyperparameters()[0].get_config()["values"])
# filepath_best_model = "video_best_batch_model"
# model_best_model_epoch_batch_size.save(filepath_best_model)
```
> Try something like this:

@saranyaprakash2012 Could you make it clearer which code goes into which block? The indentation is a bit confusing.
Thanks!
This guide is out of date. Please follow this guide instead.
I had some problems with the version below. Namely, I couldn't make it run with a custom objective.

```python
class MyTuner(kerastuner.tuners.BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # You can add additional HyperParameters for preprocessing and custom training loops
        # via overriding `run_trial`
        kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 32, 256, step=32)
        kwargs['epochs'] = trial.hyperparameters.Int('epochs', 10, 30)
        super(MyTuner, self).run_trial(trial, *args, **kwargs)
```

I added the return statement and it fixed that:

```python
class MyTuner(kerastuner.tuners.BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # You can add additional HyperParameters for preprocessing and custom training loops
        # via overriding `run_trial`
        kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 32, 256, step=32)
        kwargs['epochs'] = trial.hyperparameters.Int('epochs', 10, 30)
        return super(MyTuner, self).run_trial(trial, *args, **kwargs)
```
@ogreyesp Thanks for the issue!
This comment was updated by @haifeng-jin because it was out of date. The following is the latest recommended way of doing it:
This is bare-bones code for tuning the batch size. The `*args` and `**kwargs` are the ones you passed from `tuner.search()`.

```python
class MyHyperModel(kt.HyperModel):
    def build(self, hp):
        model = keras.Sequential()
        model.add(layers.Flatten())
        model.add(
            layers.Dense(
                units=hp.Int("units", min_value=32, max_value=512, step=32),
                activation="relu",
            )
        )
        model.add(layers.Dense(10, activation="softmax"))
        model.compile(
            optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"],
        )
        return model

    def fit(self, hp, model, *args, **kwargs):
        return model.fit(
            *args,
            batch_size=hp.Choice("batch_size", [16, 32]),
            **kwargs,
        )


tuner = kt.RandomSearch(
    MyHyperModel(),
    objective="val_accuracy",
    max_trials=3,
    overwrite=True,
    directory="my_dir",
    project_name="tune_hypermodel",
)
```

For epochs specifically, I'd alternatively recommend using early stopping during training by passing in the `tf.keras.callbacks.EarlyStopping` callback, if it's applicable to your use case. It can be configured to stop training as soon as the validation loss stops improving. You can pass Keras callbacks like this to `search()`:

```python
# Will stop training if the "val_loss" hasn't improved in 3 epochs.
tuner.search(x, y, epochs=30, callbacks=[tf.keras.callbacks.EarlyStopping('val_loss', patience=3)])
```

For n-fold cross validation, you can also just do it in `HyperModel.fit()` and return the result as a dictionary like `{"val_accuracy": 0.3}`, where the key is the name of the `objective`. Please follow this guide for more details.
Curious: is this considered the proper approach for tuning `batch_size`? It looks like this comment was edited in Feb 2022, so my assumption is yes, but I have not seen this approach in the docs (I could be missing it).
Yes, this is the officially recommended approach. Thanks!
I am also trying to tune the `batch_size` and could use some help here, please:
```python
class MyHyperModel(keras_tuner.HyperModel):
    def build(self, hp):
        model = Sequential(name='Conv1D_Model')
        model.add(InputLayer((timesteps, input_dim), name='input_layer'))
        for j in range(hp.Int("num_conv_layers", 1, 2)):
            model.add(Conv1D(filters=hp.Int(f'filters_{j}', min_value=32, max_value=256, step=32),
                             kernel_size=hp.Int('kernel_size', min_value=2, max_value=6, step=2),
                             activation='tanh',
                             name=f'{j}_conv_layer'))
        model.add(MaxPooling1D(pool_size=1))
        model.add(Flatten())
        if hp.Boolean("dropout"):
            model.add(Dropout(rate=0.25))
        for k in range(hp.Int("num_layers", 1, 3)):
            model.add(Dense(units=hp.Int(f'units_{k}', min_value=24, max_value=72, step=24),
                            activation='tanh',
                            name=f'{k}_dense'))
        model.add(Dense(units=1,
                        activation='tanh',
                        name='output_layer'))
        model.compile(optimizer='adam',
                      loss='mean_squared_error')
        return model

    def fit(self, hp, model, *args, batch_size=32, **kwargs):
        return model.fit(
            *args,
            batch_size=hp.Choice("batch_size", [16, 32, 64]),
            **kwargs,
        )
```
But in the search space I got this:

```
Search space summary
Default search space size: 6
num_conv_layers (Int)
{'default': None, 'conditions': [], 'min_value': 1, 'max_value': 2, 'step': 1, 'sampling': None}
filters_0 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 256, 'step': 32, 'sampling': None}
kernel_size (Int)
{'default': None, 'conditions': [], 'min_value': 2, 'max_value': 6, 'step': 2, 'sampling': None}
dropout (Boolean)
{'default': False, 'conditions': []}
num_layers (Int)
{'default': None, 'conditions': [], 'min_value': 1, 'max_value': 3, 'step': 1, 'sampling': None}
units_0 (Int)
{'default': None, 'conditions': [], 'min_value': 24, 'max_value': 72, 'step': 24, 'sampling': None}
None
```
Here is the first trial:

```
Search: Running Trial #1
Value |Best Value So Far |Hyperparameter
1     |?                 |num_conv_layers
160   |?                 |filters_0
4     |?                 |kernel_size
False |?                 |dropout
1     |?                 |num_layers
72    |?                 |units_0
```
Shouldn't the batch size appear both in the search space and in the trial report? How do I know which batch size is being used?
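One caveat worth noting (based on how recent `keras_tuner` versions populate the search space, so treat it as an assumption): hyperparameters that are only defined inside `HyperModel.fit()` are generally registered the first time a trial actually runs `fit()`, so they may not appear in the initial `search_space_summary()` or in the very first trial's report. A simple way to see which batch size is being used is to log the choice inside `fit()` itself; a sketch, with `LoggingHyperModel` as a hypothetical subclass of the `MyHyperModel` above:

```python
class LoggingHyperModel(MyHyperModel):
    def fit(self, hp, model, *args, **kwargs):
        chosen_batch_size = hp.Choice("batch_size", [16, 32, 64])
        # Print the value actually used for this trial; once the trial has run,
        # it is also recorded in that trial's hyperparameters.
        print(f"Trial batch_size: {chosen_batch_size}")
        return model.fit(*args, batch_size=chosen_batch_size, **kwargs)
```

After at least one trial has completed, `tuner.get_best_hyperparameters()[0].values` should include `"batch_size"` as well. Note also that the `batch_size=32` parameter in the original `fit` signature above is never used, since `hp.Choice` supplies the value.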
Hi,
How can I tune the number of epochs and batch size?
The provided examples always assume fixed values for these two hyperparameters.