keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

How to alter LSTM sentiment classification to work with multiple categories? #5853

Closed mathkidsz closed 7 years ago

mathkidsz commented 7 years ago

Hello,

I am currently working on a sentiment analysis-type problem for classifying math word problems into addition/subtraction/multiplication/division.

I have been trying to no avail to get a basic LSTM working for multiple (more than 2) categories of output. My current code is below for 2 categories, adapted from the IMDB sentiment analysis problem.

No matter what I try changing there is some error or another with the LSTM's output (seems to be always of size (num_inputs, 1), so I can't feed it to a 4-neuron dense layer). Can anyone help?

import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.layers import Dropout
from keras.preprocessing import sequence
from LSTMattempt import load_data
import pdb

# fix random seed for reproducibility
numpy.random.seed(7)

# load the dataset but only keep the top n words, zero the rest
top_words = 1000
(X_train, y_train), (X_test, y_test) = load_data(top_words)
print("x_train", X_train)
print("y_train", y_train)

# pdb.set_trace()
# exit(0)

# truncate and pad input sequences
max_review_length = 50
X_train = sequence.pad_sequences(X_train, maxlen=max_review_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_review_length)

# create the model
embedding_vecor_length = 32
model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, nb_epoch=35, batch_size=64)

# Final evaluation of the model
scores = model.evaluate(X_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

atremblay commented 7 years ago

If I understand correctly, the final Dense layer should have 4 units (one per class) and use a softmax activation. Softmax generalizes the sigmoid to more than two classes: the 4 outputs are class probabilities that sum to 1.
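To make the softmax point concrete, here is a minimal pure-Python sketch (not Keras code). The `scores` values are made up for illustration, and `softmax` and `sigmoid` are helper names introduced here. It shows that the outputs sum to 1 and that, with two classes, softmax reduces to the sigmoid.

```python
import math


def softmax(logits):
    """Map a list of raw scores to probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing; the result is unchanged.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function used by the original single-unit output layer."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical raw scores for the four classes
# (addition, subtraction, multiplication, division).
scores = [2.0, 1.0, 0.1, -1.0]
probs = softmax(scores)

# The predicted class is the index with the highest probability.
predicted = probs.index(max(probs))

# With two classes and one score fixed at 0, softmax equals the sigmoid.
two_class = softmax([1.5, 0.0])[0]
```

In the model itself none of this needs to be written by hand; `activation='softmax'` on the last Dense layer performs the same normalization per sample.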

The loss function should then be categorical_crossentropy, which is the multi-class counterpart of binary_crossentropy.

model.add(Dense(4, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

But this requires your labels to be one-hot encoded. Class A would be [1,0,0,0], class B [0,1,0,0] etc.
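As a rough sketch of that encoding, assuming `load_data` returns integer labels 0 to 3 (the thread does not say what it returns), the labels could be converted with NumPy as below. The `CLASSES` list, the `one_hot` helper, and the sample labels are all hypothetical.

```python
import numpy as np

# Assumed class order; the thread does not specify one.
CLASSES = ["addition", "subtraction", "multiplication", "division"]


def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows of length num_classes."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[labels]


# Hypothetical integer labels for four training examples.
y_int = [0, 1, 3, 2]
y_onehot = one_hot(y_int, len(CLASSES))

# Each row now matches the 4-unit softmax output, e.g. class 0 -> [1, 0, 0, 0].
print(y_onehot)
```

Keras also ships a helper for this, `keras.utils.to_categorical`. Alternatively, keeping the integer labels and compiling with `loss='sparse_categorical_crossentropy'` should avoid the conversion altogether.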

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.