AnonymousYou commented 6 years ago

I am facing issues while training and testing. Is the code below up to the mark for training and testing? Also, the predictions are bad. As I am new to DL, I would appreciate it if you could take a look.

import sys
import gensim
from dl_text import dl
from keras.models import Model, model_from_json
from keras.layers import Input, Dense, Dropout, Concatenate, Conv1D, Lambda, Flatten, MaxPooling1D

reload(sys)
sys.setdefaultencoding('utf8')

def model_dnn(dimx, embedding_matrix):
    inpx = Input(shape=(dimx,), dtype='int32', name='inpx')
    embed = dl.word2vec_embedding_layer(embedding_matrix)(inpx)
    flat_embed = Flatten()(embed)
    nnet_h = Dense(units=10, activation='sigmoid')(flat_embed)
    nnet_out = Dense(units=1, activation='sigmoid')(nnet_h)
    model = Model([inpx], nnet_out)
    model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
    return model

raw_data = ['The voice response excellent', ' Very low volume', ' It works like a magic. Very good as personal assistant.', ' It Pretty Good', ' the only problem I saw was It down not answer all questions to us', ' Very very good I like much it WORK very good', ' Still a bit Buggy', ' This is my first device of this type It looks great', ' I can not hear her say anything it is not in mute or the volumn is low on the second day what happen, I will rate one or two for now but if it is fine again i will rate five', ' Excellent and intelligent product', ' This is so sad, Alexa play song', ' Apart from "Alexa" on many occasions it gets activated by hearing other words too. Not too expressive like siri.', ' For some reasons i have been facing so many issues with amazon. ', ' Happy to have been one of the early adopters post launch in India', ' Superb sound', ' Excellent voice recognition', ' It is somewhat ok product. I am not getting answers properly for some questions. ', ' My 1.5 years old daughter loves it.', ' Sound quality very bad', ' Lovely little device', ' Gets disconnected Everytime, unable to connect it back again', ' Also connecting to a big speaker via the 3.5 mm jack is a very nice feature', ' As an adult user i think echo has a long way to go', ' Not such a great product', ' It is not working on my new WiFi setup in Mumbai. Please arrange a expert advise', ' it’s only marketing strategy', ' No new features are being added to Alexa', ' Good investment and connect to the existing audio system. Better option than Amazon Echo', ' Unable to use. bad sound quality and poor voice recognition', ' Of no use. Waste of money if amazon is will to take back will return', ' Echo Dot is Excellent it is doing it is job perfectly', ' It cannot keep up the conversation', ' Device looks very small and sound is bit little lower', ' Still an amateur product from Amazon.']

labels = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
data = []
for sent in raw_data:
    data.append(dl.clean(sent))

print data
data_inp = dl.process_data(sent_l=data, dimx=15)
print(data_inp)

wordVec_model = gensim.models.KeyedVectors.load_word2vec_format( '/home/shiva/Documents/Sentiment/GoogleNews-vectors-negative300.bin.gz', binary=True)

data_inp, embedding_matrix = dl.process_data(sent_l=data, wordVec_model=wordVec_model, dimx=10)
model = model_dnn(dimx=10, embedding_matrix=embedding_matrix)
print(model)
model.fit(data_inp, labels, epochs=100)
print(model.predict(data_inp))

print data_inp

model.save('/home/shiva/Documents/Sentiment/sentiment/my_model.h5')

scores = model.evaluate(data_inp, labels, verbose=0)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))

# serialize model to JSON
model_json = model.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)

# serialize weights to HDF5
model.save_weights("model.h5")
print("Saved model to disk")

# later...

# load json and create model
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)

# load weights into new model
loaded_model.load_weights("model.h5")
print("Loaded model from disk")

# evaluate loaded model on test data
loaded_model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
data = ['the voice is very buggy', ' device is excellent']
data_inp = dl.process_data(sent_l=data, dimx=10)
print(data_inp)
print(loaded_model.predict(data_inp))
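
One thing that may be worth checking in the code above: the test sentences go through a separate `dl.process_data` call from the training sentences. Whether that call reuses the training vocabulary is not shown in this thread, so the following is only a sketch under the assumption that it might not. If word indices are rebuilt per call, the same word can map to a different embedding row at test time. The helpers `build_vocab` and `encode` are hypothetical illustrations written with the standard library only. They are not part of `dl_text`.

```python
def build_vocab(sentences):
    """Map each distinct token to an integer index; 0 is kept for padding/unknown."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def encode(sentence, vocab, dimx):
    """Turn a sentence into exactly dimx indices using an existing vocabulary."""
    ids = [vocab.get(token, 0) for token in sentence.lower().split()]
    ids = ids[:dimx]                      # truncate long sentences
    return ids + [0] * (dimx - len(ids))  # pad short sentences with 0


# Hypothetical toy data: the vocabulary is built once, from training text only.
train_sentences = ["the voice response excellent", "very low volume"]
vocab = build_vocab(train_sentences)

# Test text is encoded with the SAME vocabulary; unseen words fall back to 0.
test_ids = encode("the voice is very buggy", vocab, dimx=10)
print(test_ids)
```

The point of the sketch is the design choice rather than the helpers themselves: whatever mapping was used to build the embedding matrix during training should also be the one applied to new sentences.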

GauravBh1010tt commented 6 years ago

Can you post the issue?

AnonymousYou commented 6 years ago

I ran the above code and it is overfitting: I get 99% accuracy during training but only 30-60% accuracy during testing.

GauravBh1010tt commented 6 years ago

What is the size of your dataset? You need a sufficiently large dataset for your model to learn anything.
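
To see how much a model is memorising rather than generalising, it can help to hold back part of the labelled data and compare accuracy on the two parts. Below is a minimal sketch of such a split using only the standard library. The names `train_val_split` and `accuracy` are hypothetical helpers for illustration, and the 20% validation fraction is an arbitrary assumption.

```python
import random


def train_val_split(samples, labels, val_fraction=0.2, seed=0):
    """Shuffle paired samples/labels and hold out a fraction for validation."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_val = max(1, int(round(len(indices) * val_fraction)))
    val_idx, train_idx = indices[:n_val], indices[n_val:]
    train = ([samples[i] for i in train_idx], [labels[i] for i in train_idx])
    val = ([samples[i] for i in val_idx], [labels[i] for i in val_idx])
    return train, val


def accuracy(predicted, expected):
    """Fraction of predictions that match the expected labels."""
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / float(len(expected))
```

A large gap between accuracy on the training part and accuracy on the held-out part is the usual sign of overfitting. Keras can also do a similar split itself through the `validation_split` or `validation_data` arguments of `model.fit`.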

AnonymousYou commented 6 years ago

It is merely 35 sentences. I get it. Could you suggest some datasets for training?

GauravBh1010tt commented 6 years ago

Have a look at the following links:

https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews
https://www.kaggle.com/bittlingmayer/amazonreviews
https://datascience.stackexchange.com/questions/11220/training-dataset-for-sentiment-analysis-of-movie-reviews

GauravBh1010tt / DL-text

some problem with training and testing. #7
