keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Help: 'Wrong number of dimensions: expected 3, got 2 with shape (32L, 60L).' in LSTM model #1641

Closed DanHenry4 closed 6 years ago

DanHenry4 commented 8 years ago

Hey everyone,

I'm trying to use custom data with the LSTM model, but it keeps giving shape errors. After reading some other issues along the same lines, I even tried reshaping the input data to (nb_inputs, timesteps, 1), which comes out to roughly (4200, 60, 1), but that returns an error saying a shape of (None, 4200, 60, 1) is no good. Any thoughts?

maxlen = 60
batch_size = 32

print('Loading data...')
(X_train, y_train), (X_test, y_test) = t.LoadData()
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')
model = Sequential()
model.add(LSTM(128, input_shape=X_train.shape))

model.compile(loss='binary_crossentropy',
              optimizer='sgd',
              class_mode="categorical")

print("Train...")
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=3,
          validation_data=(X_test, y_test), show_accuracy=True)
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size,
                            show_accuracy=True)
print('Test score:', score)
print('Test accuracy:', acc)

Output:

Using Theano backend.
Loading data...
4130 train sequences
1016 test sequences
X_train shape: (4130L, 60L)
X_test shape: (1016L, 60L)
Build model...
Train...
Train on 4130 samples, validate on 1016 samples
Epoch 1/3
Traceback (most recent call last):
  File "main.py", line 52, in <module>
    validation_data=(X_test, y_test), show_accuracy=True)
  File "C:\Miniconda2\lib\site-packages\keras\models.py", line 507, in fit
    shuffle=shuffle, metrics=metrics)
  File "C:\Miniconda2\lib\site-packages\keras\models.py", line 226, in _fit
    outs = f(ins_batch)
  File "C:\Miniconda2\lib\site-packages\keras\backend\theano_backend.py", line 357, in __call__
    return self.function(*inputs)
  File "C:\Miniconda2\lib\site-packages\theano\compile\function_module.py", line 513, in __call__
    allow_downcast=s.allow_downcast)
  File "C:\Miniconda2\lib\site-packages\theano\tensor\type.py", line 169, in filter
    data.shape))
TypeError: ('Bad input argument to theano function with name "C:\Miniconda2\lib\site-packages\keras\backend\theano_backend.py:354" at index 0(0-based)', 'Wrong number of dimensions: expected 3, got 2 with shape (32L, 60L).')

wxs commented 8 years ago

I even tried reshaping the input data to (nb_inputs, timesteps, 1)

You will need to do that, but then you shouldn't pass X_train.shape as the input_shape parameter of your LSTM. The model architecture doesn't care about the total number of training points, so that dimension isn't part of the shape. You should be able to pass input_shape=X_train.shape[1:] instead.
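
Something like this, as a minimal sketch, assuming X_train and X_test are the 2-D (samples, timesteps) arrays from your output:

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM

# add a trailing feature axis: (samples, timesteps) -> (samples, timesteps, 1)
X_train = X_train[:, :, np.newaxis]
X_test = X_test[:, :, np.newaxis]

# input_shape excludes the samples axis, so this is (60, 1)
model = Sequential()
model.add(LSTM(128, input_shape=X_train.shape[1:]))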

DanHenry4 commented 8 years ago

Thank you! Reshaping the arrays and adding .shape[1:] lets it run. May I ask why the input shape needs to be .shape[1:]?

Also, (off topic) the output looks like:

Using Theano backend.
Using gpu device 0: GeForce GTX 970
Loading data...
4262 train sequences
1083 test sequences
X_train shape: (4262L, 60L, 1L)
X_test shape: (1083L, 60L, 1L)
Build model...
Train...
Train on 4262 samples, validate on 1083 samples
Epoch 1/3
4262/4262 [==============================] - 48s - loss: -28.0929 - acc: 1.0000 - val_loss: -64.3520 - val_acc: 1.0000
Epoch 2/3
4262/4262 [==============================] - 48s - loss: -64.6251 - acc: 1.0000 - val_loss: -64.3520 - val_acc: 1.0000
Epoch 3/3
4262/4262 [==============================] - 48s - loss: -64.6251 - acc: 1.0000 - val_loss: -64.3520 - val_acc: 1.0000
1083/1083 [==============================] - 1s
Test score: -64.3520374245
Test accuracy: 1.0

Is there a reason the loss is negative?

wxs commented 8 years ago

So X.shape is (samples, timesteps, dimension), but the model architecture doesn't care how many training examples (samples) you have. Once you've built the model you can feed it a hundred million examples; it doesn't matter, so you don't pass that as a parameter when you build your model. X.shape[1:] is just (timesteps, samples), the two dimensions that matter.

Incidentally, if you're on the Theano backend you also don't need to specify the number of timesteps, but then you need to pass None for that dimension. So instead you would pass in input_shape=(None, X.shape[2]).
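
In code, that variable-length version is a one-line change (a sketch, keeping the 128 units from the earlier example):

model.add(LSTM(128, input_shape=(None, X.shape[2])))  # timesteps left unspecified (Theano backend)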

As to why your score is negative: there's still something a bit fishy with your model. Your LSTM has 128 output dimensions, and you're evaluating binary cross-entropy on that? Is your y target also 128-dimensional? If it's not, you probably meant to put a Dense(1) layer on top, bringing your output down to a single value compatible with y. Also, if you're using a cross-entropy objective you want your output to be a probability distribution, so you probably meant to put some sort of activation on top to normalize the output.

Or else you probably meant to use a different objective function.

Without knowing more about your data (for instance the size of your y matrix) it's hard for me to help further.

DanHenry4 commented 8 years ago

X.shape[1:] is just (timesteps, samples), the two dimensions that matter

I'm guessing you meant (timesteps, dimension)?

That makes sense, though. Thank you for the information. As for the output data, yes, a binary_crossentropy loss function doesn't make much sense, considering the data look like:

[ [5.45, 5.42, ..., 5.26], [5.25, 5.28, ..., 5.30], ... [5.12, 5.15, ..., 5.65] ], [5.13, 5.17, ..., 5.05]

Where the first list contains sequences of input (which are themselves lists), and the output is a single float value.

I've changed the model:

batch_size = 32

print('Loading data...')
(X_train, y_train), (X_test, y_test) = t.LoadData()
print(len(X_train), 'train sequences')
print(len(X_test), 'test sequences')

X_train = np.reshape(X_train, X_train.shape + (1,))
X_test = np.reshape(X_test, X_test.shape + (1,))

print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)

print('Build model...')
model = Sequential()
model.add(LSTM(1, input_shape=X_train.shape[1:]))

model.compile(loss='mse',
              optimizer='sgd',
              class_mode="categorical")

print("Train...")
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=3,
          validation_data=(X_test, y_test), show_accuracy=True)
score, acc = model.evaluate(X_test, y_test,
                            batch_size=batch_size,
                            show_accuracy=True)
print('Test score:', score)
print('Test accuracy:', acc)

And it now produces output closer to the desired result:

Using Theano backend.
Loading data...
4109 train sequences
998 test sequences
X_train shape: (4109L, 60L, 1L)
X_test shape: (998L, 60L, 1L)
Build model...
Train...
Train on 4109 samples, validate on 998 samples
Epoch 1/3
4109/4109 [==============================] - 3s - loss: 26.1860 - acc: 1.0000 - val_loss: 26.4226 - val_acc: 1.0000
Epoch 2/3
4109/4109 [==============================] - 3s - loss: 26.1860 - acc: 1.0000 - val_loss: 26.4226 - val_acc: 1.0000
Epoch 3/3
4109/4109 [==============================] - 3s - loss: 26.1860 - acc: 1.0000 - val_loss: 26.4226 - val_acc: 1.0000
998/998 [==============================] - 0s
Test score: 26.4226496511
Test accuracy: 1.0

I'll keep plugging away. :)

wxs commented 8 years ago

For most applications you would probably want a hidden state with more than 1 dimension on your LSTM! You can put a Dense layer (or TimeDistributedDense) with an output dimension of 1 to project the hidden state down to 1 dimension at the output, while still retaining more than 1 dimension of state. So something like:

model.add(LSTM(128, input_shape=X_train.shape[1:]))
model.add(Dense(1))
model.add(Activation('sigmoid'))
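
Since the targets described above are unbounded floats rather than 0/1 labels, a regression-flavoured variant of the same idea (sketched here as an assumption, not something prescribed earlier in the thread) would keep the wide LSTM but use the default linear output with mse:

model = Sequential()
model.add(LSTM(128, input_shape=X_train.shape[1:]))  # 128-dimensional hidden state
model.add(Dense(1))                                  # project down to one real-valued output; linear by default
model.compile(loss='mse', optimizer='sgd')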

shamsulmasum commented 7 years ago
from pandas import DataFrame
from pandas import Series
from pandas import concat
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from math import sqrt
from matplotlib import pyplot
import numpy as np

# frame a sequence as a supervised learning problem
def timeseries_to_supervised(data, lag=1):
    df = DataFrame(data)
    columns = [df.shift(i) for i in range(1, lag+1)]
    columns.append(df)
    df = concat(columns, axis=1)
    df.fillna(0, inplace=True)
    return df

# create a differenced series
def difference(dataset, interval=1):
    diff = list()
    for i in range(interval, len(dataset)):
        value = dataset[i] - dataset[i - interval]
        diff.append(value)
    return Series(diff)

# invert differenced value
def inverse_difference(history, yhat, interval=1):
    return yhat + history[-interval]

# scale train and test data to [-1, 1]
def scale(train):
    # fit scaler
    scaler = MinMaxScaler(feature_range=(-1, 1))
    scaler = scaler.fit(train)
    # transform train
    train = train.reshape(train.shape[0], train.shape[1])
    train_scaled = scaler.transform(train)

    return scaler, train_scaled

# inverse scaling for a forecasted value
def invert_scale(scaler, X, value):
    new_row = [x for x in X] + [value]
    array = np.array(new_row)
    array = array.reshape(1, len(array))
    inverted = scaler.inverse_transform(array)
    return inverted[0, -1]

def generate_features(x, forecast, window):
    """ Concatenates a time series vector x with forecasts from
        the iterated forecasting strategy.

    Arguments:
    ----------
        x:        Numpy array of length T containing the time series.
        forecast: Scalar containing forecast for time T + 1.
        window:   Autoregressive order of the time series model.
    """
    augmented_time_series = np.hstack((x, forecast))

    return augmented_time_series[-window:].reshape(1, -1)

# fit an LSTM network to training data
def fit_lstm(train, batch_size, nb_epoch, neurons):
    X, y = train[:, 0:-1], train[:, -1]
    X = X.reshape(X.shape[0], 1, X.shape[1])
    model = Sequential()
    model.add(LSTM(neurons, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    for i in range(nb_epoch):
        model.fit(X, y, epochs=1, batch_size=batch_size, verbose=0, shuffle=False)
        model.reset_states()
    return model

def iterative_forecast(model, x, window, H):
    """ Implements the iterative forecasting strategy.

    Arguments:
    ----------
        model:  Model (here a Keras model) that implements a
                predict() method and is trained on some data x.
        x:      Numpy array containing the time series.
        window: Autoregressive order of the time series model.
        H:      Number of time periods for the H-step-ahead
                forecast.
    """
    forecast = np.zeros(H)    
    forecast[0] = model.predict(x.reshape(1, -1))

    for h in range(1, H):
        features = generate_features(x, forecast[:h], window)

        forecast[h] = model.predict(features)

    return forecast

# load dataset
series = read_csv('shampoosales.csv', header=0, index_col=0, squeeze=True)

# transform data to be stationary
raw_values = series.values
diff_values = difference(raw_values, 1)

# transform data to be supervised learning
supervised = timeseries_to_supervised(diff_values, 1)
supervised_values = supervised.values

train = supervised_values[0:-12]
test = supervised_values[-12:]

# transform the scale of the data
scaler, train_scaled = scale(train)
# fit the model
lstm_model = fit_lstm(train_scaled, 1, 3000, 4)

yhat = iterative_forecast(lstm_model, train, 1, 10)
predictions = list()
predictions.append(yhat)

I am trying to implement an iterative forecasting algorithm using an LSTM, but there seems to be something wrong with the code. Would you be kind enough to help?

The error I am getting:

'Error when checking : expected lstm_2_input to have 3 dimensions, but got array with shape (1, 46)'
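
Reading the code above, fit_lstm trains on X reshaped to (samples, 1, features) with batch_size=1, so each predict() call also needs a 3-D array of that shape (and a batch size of 1, since the layer is stateful). The (1, 46) in the error suggests the whole 2-D train array is being flattened into a single row. A minimal sketch of the kind of call the model expects, assuming one scaled lag value as input:

last_window = train_scaled[-1, 0:-1]                     # one lag feature, as in fit_lstm
yhat = lstm_model.predict(last_window.reshape(1, 1, 1),  # (samples, timesteps, features)
                          batch_size=1)                  # must match batch_input_shape=(1, 1, 1)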

stale[bot] commented 6 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

fshah7 commented 6 years ago

Hi,

I have input data with three variables/dimensions and 4080 total samples. I am trying the RNN script below but am getting the error shown. Any help?

model = Sequential()
model.add(GRU(3, return_sequences=True, input_shape=(4080, 3)))
model.add(Dense(1))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, dummy_y_train, nb_epoch=20, batch_size=20, verbose=1)

ERROR: Error when checking input: expected gru_1_input to have 3 dimensions, but got array with shape (4080, 3)
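
This is the same 2-D versus 3-D mismatch discussed earlier in the thread: the GRU expects (samples, timesteps, features), and input_shape should not include the sample count. One possible fix, assuming each of the 4080 samples is treated as a sequence of length 1 (how to window the data depends on the problem):

x_train = x_train.reshape((x_train.shape[0], 1, x_train.shape[1]))  # (4080, 1, 3)

model = Sequential()
model.add(GRU(3, return_sequences=True, input_shape=(1, 3)))  # (timesteps, features), samples excluded
model.add(Dense(1))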

Khalid-Usman commented 6 years ago

@wxs Don't you think there is something fishy in @DanHenry4's work, as he is getting the same loss after each epoch and an accuracy that is always 1 (100%)? That is nearly impossible in most machine learning predictions, especially stock price prediction. I am also getting a loss of 0.0, so I am confused; maybe I did something wrong.

Please reply to me on this. I am using LSTM for the first time and, seeing that accuracy, I am worried I am doing something wrong.

Your guidance will be appreciated. Thanks.

nantomar commented 5 years ago

What can I do if the matrix size is different between the test data and the data the model was trained on? (This is Keras in R.) My results are really poor just because I have to add dummy columns to match the matrix size.