JayPhate opened this issue 7 years ago
HTH, Udi
you are welcome to make the changes and send a PR
@udibr I am very new to Keras and NNs. Why do we need to use np_utils.to_categorical? I have already converted all the vocab words to indices, so I am training on indices rather than words. I am trying to build a many-to-many sequence labeling model (the fifth model from the left in the image below).
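For context on the to_categorical question: categorical_crossentropy compares the model's softmax output against one-hot targets, so integer indices must be expanded to one-hot vectors first. A minimal NumPy sketch of the same transformation (the vocab size and labels here are made up for illustration):

```python
import numpy as np

def to_one_hot(indices, num_classes):
    """One-hot encode an array of integer class indices,
    mirroring what keras.utils.np_utils.to_categorical does."""
    indices = np.asarray(indices)
    one_hot = np.zeros(indices.shape + (num_classes,))
    # Place a 1 at each label's position along the last axis.
    np.put_along_axis(one_hot, indices[..., None], 1, axis=-1)
    return one_hot

# Toy example: 3 labels over a vocabulary of 5 classes.
labels = [2, 0, 4]
encoded = to_one_hot(labels, num_classes=5)
print(encoded.shape)  # (3, 5)
```

Each row of `encoded` has a single 1 at the position of its index, e.g. the first row is `[0, 0, 1, 0, 0]`.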
My issue is very similar to https://github.com/fchollet/keras/issues/2654, but from that thread I couldn't work out the input and output shapes of the data.
You also suggested adding an extra dimension of size 1 to yTrain; can you elaborate on why? The shape of yTrain is currently (17853, 25). What will the shape be after adding the extra dimension?
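A sketch of the suggested change (the shapes are from the question; the reason is my assumption): adding a trailing dimension of size 1 turns yTrain from (17853, 25) into (17853, 25, 1), the 3-D shape Keras expects for per-timestep integer targets when the loss is sparse_categorical_crossentropy.

```python
import numpy as np

# Dummy targets with the same shape as yTrain in the question.
yTrain = np.zeros((17853, 25), dtype=np.int32)

# Add an extra dimension of size 1 at the end.
yTrain3d = np.expand_dims(yTrain, axis=-1)
print(yTrain3d.shape)  # (17853, 25, 1)
```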
I want to use the Encoder-Decoder model for some other data, and I am trying to understand this code, but I couldn't find the fit method in train.ipynb. After padding the descriptions and headings, how do I use these vectors to train the model? What are the dimensions of X and Y in model.fit? I expect X to be #descriptions x 50 and Y to be #headings x 50, where #descriptions equals #headings.
Below is the command I used to fit the model.
model_fit = model.fit(nxTrain, nyTrain, nb_epoch=1, batch_size=64, verbose=2)
The dimensions of X and Y passed to model.fit:
xTrain.shape
(17853, 50)
yTrain.shape
(17853, 25)
But I got the following error.
Exception: Error when checking model target: expected activation_1 to have 3 dimensions, but got array with shape (17853, 25)
Please check the model summary.
print(model.summary())
Layer (type) Output Shape Param # Connected to
embedding_1 (Embedding) (None, 50, 100) 4000000 embedding_input_1[0][0]
lstm_1 (LSTM) (None, 50, 512) 1255424 embedding_1[0][0]
dropout_1 (Dropout) (None, 50, 512) 0 lstm_1[0][0]
lstm_2 (LSTM) (None, 50, 512) 2099200 dropout_1[0][0]
dropout_2 (Dropout) (None, 50, 512) 0 lstm_2[0][0]
lstm_3 (LSTM) (None, 50, 512) 2099200 dropout_2[0][0]
dropout_3 (Dropout) (None, 50, 512) 0 lstm_3[0][0]
simplecontext_1 (SimpleContext) (None, 25, 944) 0 dropout_3[0][0]
timedistributed_1 (TimeDistribut (None, 25, 40000) 37800000 simplecontext_1[0][0]
activation_1 (Activation) (None, 25, 40000) 0 timedistributed_1[0][0]
Total params: 47253824
None
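As a quick sanity check on the summary above, the parameter counts follow from the standard formulas (embedding: vocab_size x embedding_dim; LSTM: 4 gates, each with weights over the concatenated input and hidden state plus a bias; Dense: inputs plus bias, times outputs):

```python
def lstm_params(input_dim, units):
    # 4 gates, each with a weight matrix over [input; hidden] plus a bias vector.
    return 4 * ((input_dim + units) * units + units)

embedding = 40000 * 100                 # vocab_size * embedding_dim
lstm1 = lstm_params(100, 512)           # input from the embedding
lstm2 = lstm3 = lstm_params(512, 512)   # input from the previous LSTM
dense = (944 + 1) * 40000               # TimeDistributed(Dense): (inputs + bias) * vocab_size

total = embedding + lstm1 + lstm2 + lstm3 + dense
print(total)  # 47253824, matching "Total params" above
```

Each per-layer value (4000000, 1255424, 2099200, 37800000) matches the summary line for line.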
I used the same model as explained in train.ipynb, so I don't understand what's wrong here.
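For what it's worth, the mismatch in the error is that the model's output, (None, 25, 40000), is 3-D while yTrain, (17853, 25), is 2-D. One way to make them agree is to one-hot encode each timestep so the target matches the softmax output shape; a sketch with toy sizes (the real (17853, 25, 40000) array would not fit in memory):

```python
import numpy as np

# Toy sizes standing in for the real ones: 4 samples, 25 timesteps, vocab of 50.
n_samples, maxlenh, vocab_size = 4, 25, 50
yTrain = np.random.randint(0, vocab_size, size=(n_samples, maxlenh))

# One-hot each timestep so the target has shape (samples, timesteps, vocab).
yTrain3d = np.zeros((n_samples, maxlenh, vocab_size), dtype=np.float32)
for i, seq in enumerate(yTrain):
    for t, w in enumerate(seq):
        yTrain3d[i, t, w] = 1.0

print(yTrain3d.shape)  # (4, 25, 50)
```

Because the full one-hot array is far too large at the real vocabulary size, this encoding is usually done per batch inside a data generator rather than up front.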