ColasGael / Machine-Learning-for-Solar-Energy-Prediction

Predict the Power Production of a solar panel farm from Weather Measurements using Machine Learning
MIT License
230 stars 105 forks

Just simple question about time step in the LSTM model :) #2

Status: Closed (closed by joonv2 5 years ago)

joonv2 commented 5 years ago

Hi,

I have tested the LSTM model with different numbers of time steps. To change the number of time steps from None to 2, should the LSTM input_shape change from (None, 12) to (2, 12), or to (2, 6)? I think (2, 6) is right, and if I want to set the time step to 2, I think the shape of X_train and X_test should change as well.

So, if the time step is 2, the LSTM input_shape becomes (2, 6), and X_train and X_test are reshaped to (train_data.shape[0], 2, 6).

But it isn't working! :( What is the problem?

P.S. I think you set the LSTM input_shape to (None, 1). Does this mean the full sequence is used as the time steps?

Alekxos commented 5 years ago

Hi joonv2, thanks for reaching out! When the model is trained with input shape (None, num_features), the RNN learns to adapt to input with any number of time steps. The number of time steps used is actually determined by the shape of the input and output train, dev, and test sets. For example, to change the number of time steps from 1 to n, modify the line

X_train = np.reshape(train_data, (train_data.shape[0], 1, train_data.shape[1]))

to:

X_train = np.reshape(train_data, (train_data.shape[0] // n, n, train_data.shape[1]))

The shape of the input data should always follow the pattern [num_samples, time_steps, num_features] when using the LSTM module in Keras. You should also make the same change in the dev and test set lines, so that the same number of time steps is used throughout train, dev, and test.
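As a rough sketch of the reshape described above, the snippet below uses a randomly generated array as a stand-in for the real train_data (the sizes 101 and 12 and the helper name to_sequences are assumptions for illustration, not code from the repository). It trims leftover rows first, because np.reshape raises an error when the number of rows is not a multiple of n.

```python
import numpy as np

def to_sequences(data, n):
    """Group consecutive rows of a 2-D array into sequences of n time steps.

    Hypothetical helper, not part of the repository.
    """
    usable = (data.shape[0] // n) * n   # drop rows that do not fill a full group
    trimmed = data[:usable]
    # Result follows [num_samples, time_steps, num_features]
    return np.reshape(trimmed, (usable // n, n, data.shape[1]))

# Stand-in data: 101 rows (measurements) with 12 features each
train_data = np.random.rand(101, 12)

X_train = to_sequences(train_data, 2)
print(X_train.shape)  # (50, 2, 12)
```

The targets would presumably need the same grouping so that inputs and outputs stay aligned.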

Changing the number of time steps means that the LSTM 'looks' at the data in groups of n. When you changed the shape of the input data from (train_data.shape[0], 1, 12) to (train_data.shape[0], 2, 6), you divided num_features by 2 rather than num_samples. It is num_samples that should be divided by 2, since the RNN 'sees' half as many input samples when it 'looks' at them in pairs.
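To make the difference concrete, here is a small toy example (the 4 x 12 array is made up for illustration). Splitting the features cuts each original row in half, while grouping the rows keeps all 12 features and pairs consecutive rows.

```python
import numpy as np

# Toy data: 4 rows with 12 features; row i holds the values 12*i .. 12*i + 11
data = np.arange(48).reshape(4, 12)

# Splitting the features: still 4 samples, but each row is cut in half
split_features = data.reshape(4, 2, 6)
print(split_features[0])
# [[ 0  1  2  3  4  5]
#  [ 6  7  8  9 10 11]]   <- both "time steps" come from the same original row

# Grouping the rows: 2 samples, each holding two consecutive rows
grouped_rows = data.reshape(2, 2, 12)
print(grouped_rows[0, 0])  # original row 0 with all 12 features
print(grouped_rows[0, 1])  # original row 1 with all 12 features
```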

Did you mean LSTM input_shape (None, 12) or (None, 1)? The shape (None, 1) refers to an intermediate input to a tanh layer, so it is not the original data's shape. The raw data's input_shape (None, 12) to the LSTM layer means that each input sample has 12 features per time step and a variable number of time steps. If you wanted to fix the number of time steps while training, you could also change None to n, but it must match the number of time steps you used in the corresponding X_train and Y_train reshaping (as shown above). You must also change it in both (None, 12) and (None, 1), since the number of time steps must be consistent from layer to layer. However, when you leave None as the value in both the LSTM and tanh layer input_shape, the RNN can automatically adapt to different numbers of time steps.
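One way to see why None works is that a recurrent layer's weights depend on the number of features and hidden units, not on the number of time steps. The sketch below uses a plain tanh recurrent cell written in NumPy rather than a Keras LSTM (a deliberate simplification; every name and size in it is an assumption for illustration) and applies the same weights to sequences of 1 and 5 time steps.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, hidden_units = 12, 4

# Weight shapes depend only on num_features and hidden_units
W_x = rng.normal(size=(num_features, hidden_units))
W_h = rng.normal(size=(hidden_units, hidden_units))
b = np.zeros(hidden_units)

def run_rnn(batch):
    """Apply a simple tanh recurrence to a batch of shape
    [num_samples, time_steps, num_features]; return the last hidden state."""
    h = np.zeros((batch.shape[0], hidden_units))
    for t in range(batch.shape[1]):   # loop length is read from the data
        h = np.tanh(batch[:, t, :] @ W_x + h @ W_h + b)
    return h

one_step = rng.normal(size=(8, 1, num_features))
five_steps = rng.normal(size=(8, 5, num_features))

print(run_rnn(one_step).shape)    # (8, 4)
print(run_rnn(five_steps).shape)  # (8, 4)
```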

Thanks for the feedback. I will add a commit to make changing the number of time steps easier.

These two StackExchange posts may be helpful:

  1. https://stats.stackexchange.com/questions/377091/time-steps-in-keras-lstm
  2. https://stackoverflow.com/questions/38714959/understanding-keras-lstms?rq=1

I also highly recommend the following LSTM tutorials:

  1. http://colah.github.io/posts/2015-08-Understanding-LSTMs/
  2. http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Let me know if you have any more questions.