autonomio / talos

Hyperparameter Experiments with TensorFlow and Keras
https://autonom.io
MIT License

Getting error while using Talos with LSTM model #380

Closed denizkenankilic closed 4 years ago

denizkenankilic commented 4 years ago

I have a data frame with 2 columns (say c1 and c2) and 9830 observations. I am trying to predict c1 by using c1 and c2. I divided the dataset into training and test sets of shape (6860, 2) and (2970, 2) respectively. Afterwards, I prepared trainX (6850, 10, 2), trainY (6850,), testX (2960, 10, 2) and testY (2960,) with a time step of 10, in order to create datasets in the 3d form an LSTM expects.

Moreover, for the Talos input (x and y) I selected both columns for x (9830, 2) and the first column for y (9830,).

I would be grateful if you could help me. Am I using the parameters in the Sequential model correctly?

My functions for the LSTM model and Talos are given below:
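For reference, the windowing step described above can be sketched roughly as follows. This is only an illustration, not the poster's actual code: the helper name `make_windows`, the random placeholder data, and the choice to predict the value that immediately follows each window are all assumptions.

```python
import numpy as np

TIME_STEPS = 10  # window length, as in the post


def make_windows(data, time_steps, target_col=0):
    """Turn a (n, features) array into inputs of shape
    (n - time_steps, time_steps, features) and targets of shape (n - time_steps,)."""
    n_windows = data.shape[0] - time_steps
    # Each sample holds `time_steps` consecutive rows with all feature columns.
    x = np.stack([data[i:i + time_steps] for i in range(n_windows)])
    # Assumption: the target is the value of `target_col` right after each window.
    y = data[time_steps:, target_col]
    return x, y


# Placeholder data with the same shape as the training split in the post.
train = np.random.rand(6860, 2)
trainX, trainY = make_windows(train, TIME_STEPS)
print(trainX.shape, trainY.shape)  # (6850, 10, 2) (6850,)
```

With this layout, 6860 rows and a window of 10 give 6850 samples, which matches the shapes quoted in the post.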

def talos_opt():

    # set the parameter space
    p = {'first_neuron': [2, 3, 4],
         'second_neuron': [2, 3, 4],
         'hidden_layers': [0, 1, 2],
         'batch_size': [10, 20, 30],
         'epochs': [30, 50, 100],
         'dropout': [0, 0.2, 0.4],
         'recurrent_dropout': [0, 0.2, 0.4],
         'kernel_initializer': ['uniform', 'normal'],
         'optimizer': ['Nadam', 'Adam', 'sgd'],
         'losses': ['binary_crossentropy', 'mean_squared_error', 'mean_absolute_error'],
         'activation': ['relu', 'elu'],
         'middle_activation': ['tanh', 'sigmoid', 'relu'],
         'last_activation': ['softmax', 'sigmoid', 'relu']}

    # first we have to make sure to input data and params into the function
    def create_model(trainX, trainY, testX, testY, params):

        lstm_model = Sequential()
        lstm_model.add(LSTM(params['first_neuron'], activation=params['activation'],
                            batch_input_shape=(params['batch_size'], TIME_STEPS, trainX.shape[2]),
                            dropout=params['dropout'], recurrent_dropout=params['recurrent_dropout'],
                            kernel_initializer=params['kernel_initializer'],
                            return_sequences=False))

        lstm_model.add(Dropout(params['dropout']))

        lstm_model.add(Dense(4, activation='relu'))

        lstm_model.add(Dense(params['second_neuron'], activation=params['last_activation']))

        lstm_model.compile(loss=params['losses'],
                           optimizer=params['optimizer'](),
                           metrics=['acc', 'fmeasure_acc', 'mae'])

        model_out = lstm_model.fit(trainX, trainY,
                                   validation_data=[testX, testY],
                                   batch_size=params['batch_size'],
                                   callbacks=[history],
                                   epochs=params['epochs'],
                                   verbose=0)

        return model_out, lstm_model

    scan_object = ta.Scan(x, y, model=create_model, params=p, grid_downsample=0.1)

    return scan_object

I am getting the following error:

File "", line 23, in create_model
    batch_input_shape=(params['batch_size'], TIME_STEPS, trainX.shape[2]),

IndexError: tuple index out of range

I then replaced the "batch_input_shape=(params['batch_size'], TIME_STEPS, trainX.shape[2])," part with "input_shape=(10,2)," and with "batch_input_shape=(10,10,2),", one at a time. In both cases I got the error "ValueError: Error when checking input: expected lstm_12_input to have 3 dimensions, but got array with shape (6881, 2)".
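Both messages are consistent with a 2-D array reaching create_model instead of the windowed 3-D one. The short sketch below reproduces the IndexError with placeholder arrays; the zero-filled data is an assumption used only to show the shapes. The 6881 rows in the second message would match a 70/30 split of the 9830 rows of x, which is an assumption about the split settings in use, not something confirmed in this thread.

```python
import numpy as np

x = np.zeros((9830, 2))             # 2-D array, as passed to Scan in the post
windowed = np.zeros((6850, 10, 2))  # 3-D array, as prepared for the LSTM

# A 2-D array has only shape[0] and shape[1], so shape[2] does not exist.
try:
    x.shape[2]
except IndexError as err:
    message = str(err)
    print("IndexError:", message)

# The windowed array does have a third axis (the number of features).
print(windowed.shape[2])

# 70% of 9830 rows, computed with integer arithmetic.
print(9830 * 7 // 10)
```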

denizkenankilic commented 4 years ago

In the Scan part, is it correct to use x and y, or do I need to use trainX and trainY? I have 2 time series (c1 and c2) and I am trying to predict one of them (c1) by using both. To restate the sizes of the data:

trainX --> (6850, 10, 2)
trainY --> (6850,)
testX --> (2960, 10, 2)
testY --> (2960,)
x --> (9830, 2)
y --> (9830,)

mikkokotila commented 4 years ago

In your case you want to explicitly declare both in Scan(). So for example...

Scan(x=x_train, y=y_train, x_val=x_val, y_val=y_val)

Let me know how it goes.

denizkenankilic commented 4 years ago

Hi again,

Then an error occurred: "TypeError: __init__() got multiple values for argument 'model'".

Then I tried "scan_object = ta.Scan(trainX, trainY, model=create_model, params=p, grid_downsample=0.1)" and it worked. However, I don't know if I am using Scan correctly, because in "https://github.com/autonomio/talos/blob/master/talos/scan/Scan.py" it is written that:

x : ndarray 1d or 2d array, or a list of arrays with features for the prediction task.
y : ndarray 1d or 2d array, or a list of arrays with labels for the prediction task.

I used trainX and trainY, where trainX is a 3d array. The docstring says 1d or 2d, but I am now using 3d input for the LSTM; is that correct usage? Also, is it OK to pass only the training part to Scan, or do I need to use the whole dataset (i.e. prepare all 9830 observations with 2 columns as a 3d array by using time steps)?

r.best_params()
Out[90]:
array([[0.4, 4, 'relu', 0, 'uniform', 'relu', 'relu', 30, 'Adam', 0, 4, 'mean_absolute_error', 10, 0],
       [0.4, 2, 'relu', 0, 'normal', 'relu', 'relu', 30, 'Adam', 0, 4, 'mean_absolute_error', 10, 1],
       [0.4, 4, 'relu', 2, 'uniform', 'relu', 'relu', 30, 'Adam', 0, 4, 'mean_squared_error', 10, 2],
       [0.4, 2, 'relu', 2, 'uniform', 'relu', 'relu', 30, 'Adam', 0, 4, 'mean_squared_error', 10, 3]], dtype=object)

Actually there were 4 rounds in this experiment. But which one is the best? Is there any other way to see only the best result?

gives "AxisError: axis 1 is out of bounds for array of dimension 1" error.

gives "raise ValueError('Must pass 2-d input') ValueError: Must pass 2-d input" error.

Sorry if I am asking too many questions. Thank you so much.

mikkokotila commented 4 years ago

No, you definitely have to declare the argument names, as in Scan(x=x_train, y=y_train, x_val=x_val, y_val=y_val), because passing them positionally does not match the expected order of the parameters. But ok, good that you got things working otherwise.
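The "multiple values for argument" TypeError is what Python raises when a positional argument lands in a slot that is also passed by keyword. The sketch below shows one way this can arise. It uses a hypothetical stand-in function, `scan_like`, whose parameter order (x, y, params, model, then x_val, y_val) is an assumption chosen for illustration; it is not the real Talos signature, and the string arguments are placeholders for the actual arrays and objects.

```python
def scan_like(x, y, params, model, x_val=None, y_val=None):
    """Hypothetical stand-in for illustration; NOT the real talos.Scan."""
    return {"x": x, "y": y, "params": params, "model": model,
            "x_val": x_val, "y_val": y_val}


# Validation data passed positionally lands in the `params` and `model`
# slots, which then collide with the keyword arguments of the same name.
try:
    scan_like("trainX", "trainY", "testX", "testY",
              model="create_model", params="p")
except TypeError as err:
    error_text = str(err)
    print(error_text)

# Naming every argument removes the ambiguity.
result = scan_like(x="trainX", y="trainY", params="p", model="create_model",
                   x_val="testX", y_val="testY")
print(result["x_val"])
```

Passing everything by keyword keeps the call correct even if the positional order differs between versions.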

LSTM models definitely work, as well as almost any other kind of Keras model.

Regarding the best model, you can set the number of models to be returned to 1.
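Independently of any particular reporting helper, the idea of returning only the top result can be sketched in plain Python: rank the rounds by a validation metric and keep the first one. The rows, the `val_loss` key, and its values below are made up for illustration; they are not the scores from this experiment.

```python
# Made-up results for illustration only; NOT the scores from this experiment.
rounds = [
    {"round": 0, "losses": "mean_absolute_error", "val_loss": 0.31},
    {"round": 1, "losses": "mean_absolute_error", "val_loss": 0.27},
    {"round": 2, "losses": "mean_squared_error", "val_loss": 0.42},
    {"round": 3, "losses": "mean_squared_error", "val_loss": 0.35},
]

# For a loss, lower is better, so sort in ascending order.
ranked = sorted(rounds, key=lambda r: r["val_loss"])

n = 1                 # number of results to return
top_n = ranked[:n]    # list containing only the best round
best = top_n[0]
print(best["round"], best["val_loss"])
```

Note that the sort direction depends on the metric: for a loss or error a smaller value is better, while for an accuracy-style metric the order would be reversed. Losses from different loss functions are also not directly comparable, so a shared metric is a safer ranking key.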

mikkokotila commented 4 years ago

I'm closing this as the actual issue is resolved. Feel free to open a new issue if anything else comes up.