keras-team / keras-tuner

A Hyperparameter Tuning Library for Keras
https://keras.io/keras_tuner/
Apache License 2.0

Tuner search epoch and batch size when using custom data generator for LSTM model #503

Open saranyaprakash2012 opened 3 years ago

saranyaprakash2012 commented 3 years ago

I would like to use a data generator and tune the epochs and batch size of a BLSTM model. How do I pass the generator to the trial function?

Batch generator:

import numpy as np

def batch_generator(ids, batch_size=BATCH_SIZE):
    batch = []
    while True:
        np.random.shuffle(ids)
        for i in ids:
            batch.append(i)
            if len(batch) == batch_size:
                print(f"Number of ids in batch: {len(batch)}")
                yield load_data(batch)
                batch = []
Model:

# Assumes the usual tf.keras imports for the layers, metrics, optimizer and loss used below.
from tensorflow.keras import backend as K
from tensorflow.keras import losses, metrics, optimizers
from tensorflow.keras.layers import (LSTM, Bidirectional, Dense, Dropout, Input,
                                     concatenate)
from tensorflow.keras.models import Model

def create_model_with_param(units=100, dropout=0.01, dense_units=600, dense2_units=90,
                            dense3_units=10, learning_rate=0.0001):
    K.clear_session()
    inputs_eGeMAPS = Input(EGEMAPS_INPUT_SHAPE)
    inputs_MFCC = Input(MFCC_INPUT_SHAPE)
    inputs_Densenet = Input(DENSENET201_INPUT_SHAPE)

    # Two stacked BLSTM layers per input stream, each stream followed by dropout.
    blstm1_eGeMAPS = Bidirectional(LSTM(units=units, return_sequences=True,
                                        recurrent_dropout=0, activation='tanh'))(inputs_eGeMAPS)
    blstm2_eGeMAPS = Bidirectional(LSTM(units=units, recurrent_dropout=0,
                                        activation='tanh'))(blstm1_eGeMAPS)
    eGeMAPS_dropout = Dropout(rate=dropout)(blstm2_eGeMAPS)

    blstm1_MFCC = Bidirectional(LSTM(units=units, return_sequences=True,
                                     recurrent_dropout=0, activation='tanh'))(inputs_MFCC)
    blstm2_MFCC = Bidirectional(LSTM(units=units, recurrent_dropout=0,
                                     activation='tanh'))(blstm1_MFCC)
    MFCC_dropout = Dropout(rate=dropout)(blstm2_MFCC)

    blstm1_Densenet = Bidirectional(LSTM(200, return_sequences=True,
                                         recurrent_dropout=0, activation='tanh'))(inputs_Densenet)
    blstm2_Densenet = Bidirectional(LSTM(units=units, recurrent_dropout=0,
                                         activation='tanh'))(blstm1_Densenet)
    Densenet_Dropout = Dropout(rate=dropout)(blstm2_Densenet)

    # Concatenate the three streams and classify with a small dense head.
    audio_lstm_output = concatenate([eGeMAPS_dropout, MFCC_dropout, Densenet_Dropout])
    dense1 = Dense(units=dense_units, activation='relu')(audio_lstm_output)
    dense2 = Dense(units=dense2_units, activation='relu')(dense1)
    dense3 = Dense(units=dense3_units, activation='relu')(dense2)
    output = Dense(1, activation='sigmoid')(dense3)

    model = Model(inputs=[inputs_eGeMAPS, inputs_MFCC, inputs_Densenet], outputs=output)

    METRICS = [
        metrics.TruePositives(name='tp'),
        metrics.FalsePositives(name='fp'),
        metrics.TrueNegatives(name='tn'),
        metrics.FalseNegatives(name='fn'),
        metrics.BinaryAccuracy(name='accuracy'),
        metrics.Precision(name='precision'),
        metrics.Recall(name='recall'),
        metrics.AUC(name='auc'),
    ]

    model.compile(optimizer=optimizers.Adam(learning_rate),
                  loss=losses.BinaryCrossentropy(),
                  metrics=METRICS)
    model.summary()
    return model

Tuner:

class MyTuner(BayesianOptimization):
    def run_trial(self, trial, *args, **kwargs):
        # You can add additional HyperParameters for preprocessing and custom
        # training loops by overriding `run_trial`.
        kwargs['batch_size'] = trial.hyperparameters.Int('batch_size', 4, 16, step=4)
        kwargs['epochs'] = trial.hyperparameters.Int('epochs', 10, 30, step=5)
        super(MyTuner, self).run_trial(trial, *args, **kwargs)

# Uses the same arguments as the BayesianOptimization tuner.
tuner = MyTuner(create_model_with_param,
                objective='recall',
                max_trials=3,
                executions_per_trial=1,
                directory=os.path.normpath('keras_tuning_blstm_audio'),
                project_name='kerastuner_bayesian_lstm_audio',
                overwrite=True)
# Don't pass epochs or batch_size here; let the tuner tune them.

Usual tuner call, where the full dataset is passed to the search function:

tuner.search(X_train, Y_train, validation_split=0.2, verbose=1)

Tuner call when tuning other parameters, where the epochs and batch size are fixed and the data is passed as a generator:

train_data_gen = batch_generator(train_upsample_files, BATCH_SIZE)
val_data_gen = batch_generator(val_files, VAL_BATCH_SIZE)

steps_per_epoch = math.floor(len(train_upsample_files) / BATCH_SIZE)
val_steps_per_epoch = math.floor(len(val_files) / VAL_BATCH_SIZE)
bayesian_opt_tuner.search(train_data_gen,
                          epochs=n_epochs,
                          steps_per_epoch=steps_per_epoch,
                          validation_data=val_data_gen,
                          validation_steps=val_steps_per_epoch)

If I add the data generator and the model.fit call inside the model-building function, will the optimizer use the model.fit output appropriately?

haifeng-jin commented 3 years ago

You can pass whatever objects you like to the tuner.search(...) function as x and y, for example your files. Then you override search, wrap the passed x and y into generators using the batch_size hyperparameter, and pass the generators to the fit function. You can read the source code of the search function here to see how to do it.
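To make this concrete, here is a minimal sketch of that idea (an illustration, not code from the thread): it overrides run_trial rather than search, assumes the keras-tuner 1.x run_trial(trial, *fit_args, **fit_kwargs) signature, and reuses batch_generator and BayesianOptimization from the snippets above; MyGeneratorTuner is a name chosen here for illustration.

import math

class MyGeneratorTuner(BayesianOptimization):
    def run_trial(self, trial, train_files, val_files, **kwargs):
        hp = trial.hyperparameters
        batch_size = hp.Int('batch_size', 4, 16, step=4)

        # Wrap the raw file lists into generators built with the tuned batch size.
        train_gen = batch_generator(train_files, batch_size)
        val_gen = batch_generator(val_files, batch_size)

        kwargs['epochs'] = hp.Int('epochs', 10, 30, step=5)
        kwargs['steps_per_epoch'] = math.floor(len(train_files) / batch_size)
        kwargs['validation_data'] = val_gen
        kwargs['validation_steps'] = math.floor(len(val_files) / batch_size)

        # The base run_trial builds the model from the trial's hyperparameters
        # and calls model.fit with these arguments.
        return super().run_trial(trial, train_gen, **kwargs)

# The raw file lists are passed to search in place of x and y:
# tuner.search(train_upsample_files, val_files)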

saranyaprakash2012 commented 3 years ago

> You can pass whatever objects you like to the tuner.search(...) function as x and y, for example your files. Then you override search, wrap the passed x and y into generators using the batch_size hyperparameter, and pass the generators to the fit function. You can read the source code of the search function here to see how to do it.

Thank you for the suggestion. This is definitely the way to go for me to tune the model. I have overridden the tuner's run_trial function. Is there a difference in behavior between overriding search and run_trial, especially when I want to use multiple GPUs?

haifeng-jin commented 3 years ago

I don't think there is any difference. They both just end up calling the Keras model's fit function.
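For the multi-GPU part of the question, one hedged note: independent of which method is overridden, keras-tuner's tuner constructors also accept a distribution_strategy argument, so each trial's fit can run under a tf.distribute strategy. A sketch reusing the MyTuner class and build function from above:

import tensorflow as tf

# Each trial's model is built and trained under the strategy scope,
# so a single trial can use all local GPUs.
tuner = MyTuner(create_model_with_param,
                objective='recall',
                max_trials=3,
                executions_per_trial=1,
                distribution_strategy=tf.distribute.MirroredStrategy(),
                directory=os.path.normpath('keras_tuning_blstm_audio'),
                project_name='kerastuner_bayesian_lstm_audio',
                overwrite=True)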