Akshayc1 / named-entity-recognition

Named Entity Recognition using Python and Keras

Testing the model #1

Open sirisha-8 opened 4 years ago

sirisha-8 commented 4 years ago

Hi, I have some questions about testing the model. Could you please help me test the sentence below? Ex: Narendra Modi is the 15th prime minister of India from 26 May 2014 to 26 May 2019.

In which format should the above sentence be passed to model.predict? It would be really helpful if you could share a snippet for testing this sentence. @Akshayc1

Akshayc1 commented 4 years ago

Hello, you can apply the same preprocessing that was done for X_test, but only to your custom input sentence. I have run a prediction on one such sentence from the test data in the second-to-last cell of the ipynb notebook. You can refer to that cell.

sirisha-8 commented 4 years ago

Hi Akshay,

Thanks for the input. I tried the same preprocessing steps that you used. Could you please verify the snippet below?

sen = 'Narendra Modi is the 15th prime minister of India from 26 May 2014 to 26 May 2019.'
tokens = []
for x in sen.split():
    if x not in tokens:
        tokens.append(x)
word2indx = {w: i + 2 for i, w in enumerate(tokens)}
word2indx["UNK"] = 1
word2indx["PAD"] = 0
idx2word = {i: w for w, i in word2indx.items()}
X_1 = [[word2indx[w] for w in sen.split()]]
X_1 = pad_sequences(maxlen=max_len, sequences=X_1, padding="post", value=word2indx["PAD"])

Here is the output for the above snippet :

Unique words in the sentence : ['Narendra', 'Modi', 'is', 'the', '15th', 'prime', 'minister', 'of', 'India', 'from', '26', 'May', '2014', 'to', '2019.']

Word to index dictionary : {'prime': 7, '26': 12, 'UNK': 1, 'from': 11, 'May': 13, 'of': 9, 'is': 4, 'India': 10, 'Narendra': 2, 'to': 15, 'PAD': 0, 'minister': 8, '2014': 14, 'Modi': 3, 'the': 5, '2019.': 16, '15th': 6}

Index to word dictionary : {0: 'PAD', 1: 'UNK', 2: 'Narendra', 3: 'Modi', 4: 'is', 5: 'the', 6: '15th', 7: 'prime', 8: 'minister', 9: 'of', 10: 'India', 11: 'from', 12: '26', 13: 'May', 14: '2014', 15: 'to', 16: '2019.'}

X_test array given to predict : [[ 2 3 4 5 6 7 8 9 10 11 12 13 14 15 12 13 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

I assume the last output (the X_test array given to predict) is the format to pass to model.predict. Please correct me if I am wrong. @Akshayc1

Thanks.
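
One caveat about the snippet above: it builds word2indx from the custom sentence itself, so the indices will generally not match the ones the model saw during training. A minimal sketch of the alternative follows, assuming the word-to-index dictionary built from the training data is still available. The function name encode_sentence, the toy vocabulary, and the regex tokenizer (which splits trailing punctuation such as "2019." into "2019" and ".") are illustrative assumptions, not code from the notebook.

```python
import re

def encode_sentence(sentence, word2idx, max_len, pad_token="PAD", unk_token="UNK"):
    """Encode one sentence with an existing (training-time) vocabulary.

    Hypothetical helper: words missing from word2idx map to the UNK index,
    and the sequence is truncated or post-padded to max_len.
    """
    # Assumed tokenization: runs of word characters, or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)[:max_len]
    ids = [word2idx.get(tok, word2idx[unk_token]) for tok in tokens]
    # Post-padding, mirroring padding="post" in the snippet above.
    ids = ids + [word2idx[pad_token]] * (max_len - len(ids))
    # Wrap in an outer list so the result is a batch containing one sentence.
    return tokens, [ids]

# Toy vocabulary standing in for the one built from the training data.
word2idx = {"PAD": 0, "UNK": 1, "is": 2, "the": 3, "of": 4, "India": 5, ".": 6}

tokens, X_custom = encode_sentence(
    "Modi is the prime minister of India.", word2idx, max_len=10
)
print(tokens)
print(X_custom)
```

With the real vocabulary and max_len from the notebook, X_custom (converted to a NumPy array) would have the same shape as a single row of X_test, which is the kind of input model.predict expects.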

happy-mammal commented 2 years ago

Hi @sirisha-8 @Akshayc1, did you find a way to pass a custom sentence in the required format to model.predict()? I am currently working on my TE project and got stuck on this issue: my project guide asked me to predict on a custom sentence rather than on the test set. If you have found a solution, please do help. Thanks in advance :)
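
Once a custom sentence has been encoded and passed to model.predict(), the output still has to be mapped back to tag names. The sketch below assumes the model returns one score per tag for every position, i.e. an array of shape (1, max_len, n_tags), and that an index-to-tag dictionary from training is available. The helper decode_predictions, the idx2tag contents, and the hard-coded scores are hypothetical stand-ins for illustration only.

```python
def decode_predictions(probs, tokens, idx2tag):
    """Pair each token with its highest-scoring tag.

    probs: per-position tag scores for ONE sentence, shape (max_len, n_tags),
           e.g. model.predict(X_custom)[0] under the assumption stated above.
    """
    pairs = []
    # zip stops after the last real token, so padded positions are ignored.
    for token, scores in zip(tokens, probs):
        best = max(range(len(scores)), key=lambda i: scores[i])  # argmax
        pairs.append((token, idx2tag[best]))
    return pairs

# Toy data standing in for real model output (3 tokens, padded to length 4).
idx2tag = {0: "O", 1: "B-per", 2: "B-geo"}
tokens = ["Modi", "visited", "India"]
probs = [
    [0.10, 0.80, 0.10],
    [0.90, 0.05, 0.05],
    [0.20, 0.10, 0.70],
    [0.90, 0.05, 0.05],  # padding position
]

result = decode_predictions(probs, tokens, idx2tag)
print(result)
```

If the model in use ends in a different output layer (for example a CRF), the decoding step would differ, so this should be checked against the notebook's model definition.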