mathkidsz closed this issue 7 years ago
If I understand correctly, the Dense layer should have 4 units and the activation should be softmax, which is like a sigmoid but for multiple classes: the predicted probabilities for the 4 classes will sum to 1.
The loss function should be categorical_crossentropy, which is the same idea as binary_crossentropy but generalized to multiple classes.
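For intuition, categorical cross-entropy can be computed by hand: with one-hot targets, it reduces to the negative log-probability the model assigned to the true class, averaged over examples. A minimal NumPy sketch with hypothetical predictions:

```python
import numpy as np

# Categorical cross-entropy for one-hot targets:
# -sum(y_true * log(y_pred)), averaged over examples.
# The labels and predictions below are hypothetical.
y_true = np.array([[1, 0, 0, 0],
                   [0, 0, 1, 0]])
y_pred = np.array([[0.7, 0.1, 0.1, 0.1],
                   [0.2, 0.2, 0.5, 0.1]])
loss = -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
print(round(loss, 4))  # -> 0.5249, i.e. -(log 0.7 + log 0.5) / 2
```

The lower the probability the model puts on the correct class, the larger the loss, which is exactly what the Keras loss penalizes during training.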
model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
But this requires your labels to be one-hot encoded: class A would be [1,0,0,0], class B [0,1,0,0], and so on.
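One-hot encoding is a one-liner in NumPy (Keras's `to_categorical` utility does the same thing). A small sketch with hypothetical labels for the four operation classes:

```python
import numpy as np

# Hypothetical integer labels:
# 0=addition, 1=subtraction, 2=multiplication, 3=division.
labels = np.array([0, 1, 2, 3, 1])
num_classes = 4

# Index rows of the identity matrix to get one-hot vectors.
one_hot = np.eye(num_classes)[labels]
print(one_hot[0])  # class A -> [1. 0. 0. 0.]
```

The resulting array has shape (num_examples, 4), which matches the 4-unit softmax output the model will produce.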
Hello,
I am currently working on a sentiment analysis-type problem for classifying math word problems into addition/subtraction/multiplication/division.
I have been trying, to no avail, to get a basic LSTM working for multiple (more than 2) categories of output. My current code for 2 categories, adapted from the IMDB sentiment analysis example, is below.
No matter what I try changing, there is some error or other with the LSTM's output (it always seems to be of shape (num_inputs, 1), so I can't feed it to a 4-neuron dense layer). Can anyone help?
import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.layers import Dropout
from keras.preprocessing import sequence
from LSTMattempt import load_data
import pdb
# fix random seed for reproducibility
numpy.random.seed(7)

# load the dataset but only keep the top n words, zero the rest
top_words = 1000
(X_train, y_train), (X_test, y_test) = load_data(top_words)
print("x_train", X_train)
print("y_train", y_train)
pdb.set_trace()
exit(0)
# truncate and pad input sequences
max_review_length = 50
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)
# create the model
embedding_vector_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vector_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, nb_epoch=35, batch_size=64)
# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
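The Dense(4, activation='softmax') head suggested above replaces the Dense(1, activation='sigmoid') line, so the model emits shape (num_inputs, 4) instead of (num_inputs, 1). A minimal NumPy sketch (with hypothetical scores) of what that softmax output looks like:

```python
import numpy as np

# Softmax turns a row of 4 raw scores into 4 probabilities summing to 1,
# which is what Dense(4, activation='softmax') produces per example.
def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # subtract max for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for two word problems, one row each.
logits = np.array([[2.0, 1.0, 0.1, -1.0],
                   [0.5, 0.5, 0.5, 0.5]])
probs = softmax(logits)
print(probs.shape)        # (2, 4): one probability per class, per example
print(probs.sum(axis=1))  # each row sums to 1
```

These (num_inputs, 4) rows line up with the one-hot labels required by categorical_crossentropy, which resolves the shape mismatch described in the question.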