Closed: SrijanSriv211 closed this issue 8 months ago
I would like to know how large your dataset is. Maybe you placed the input patterns incorrectly, like
{"patterns": ["Hi,", "how", "are", "you"]}
Instead, patterns should be full sentences, like the responses:
{"patterns": ["Hi, how are you!"]}
Tokenization should do the rest by itself, such as segmenting the pattern sentences into words.
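For example, here is a minimal sketch (not the project's actual code) of what a correctly formatted entry and its tokenization could look like; the tag, response, and tokenizer are just illustrative:

# illustrative intent entry: patterns are whole sentences, not pre-split words
import re

intent = {
    "tag": "greeting",                       # hypothetical tag
    "patterns": ["Hi, how are you!"],        # one full sentence per string
    "responses": ["Hello! How can I help?"]  # hypothetical response
}

def simple_tokenize(sentence):
    # naive word/punctuation splitter; a real setup might use nltk.word_tokenize
    return re.findall(r"\w+|[^\w\s]", sentence.lower())

for sentence in intent["patterns"]:
    print(simple_tokenize(sentence))  # ['hi', ',', 'how', 'are', 'you', '!']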
Large in the sense that each string in the list has around 15-30 characters, each "patterns" list holds 10-15 strings, and there are around 30-35 patterns overall. No matter how much I tweaked the hyperparameters, I couldn't get better performance.
Also, since I'm replying to your query about two weeks later, the update is that I fixed this issue by replacing the simple feed-forward network with a simple RNN (roughly along the lines of the sketch below).
Thank you for your response BTW 👍
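For reference, a minimal sketch, assuming PyTorch, of what swapping the feed-forward classifier for a simple RNN over word embeddings could look like; the class name, layer sizes, and vocabulary size are made up for illustration:

import torch
import torch.nn as nn

class RNNIntentClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_tags):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_tags)

    def forward(self, token_ids):          # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)
        _, hidden = self.rnn(embedded)     # hidden: (1, batch, hidden_dim)
        return self.fc(hidden.squeeze(0))  # logits over intent tags

model = RNNIntentClassifier(vocab_size=500, embed_dim=64, hidden_dim=128, num_tags=35)
logits = model(torch.randint(0, 500, (1, 6)))  # one tokenized sentence of 6 ids
print(logits.shape)  # torch.Size([1, 35])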
I am building a simple text classification system that takes an input, classifies it with a tag, and then outputs the response as a command to be executed. It is more like a text-to-automation system. The problem is that I have a very big dataset JSON file, and no matter how much I play with the hyperparameters or how long I train the model, it never classifies the test sentences properly.
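To make the tag-to-command step concrete, here is a rough sketch; the tag names, commands, and confidence threshold are placeholders, not the actual mapping used in the project:

import subprocess

TAG_TO_COMMAND = {
    "open_browser": ["firefox"],   # hypothetical tag -> command mappings
    "show_files":   ["ls", "-l"],
}

def run_for_tag(tag, confidence, threshold=0.75):
    # only execute a command when the classifier is reasonably confident
    if confidence < threshold or tag not in TAG_TO_COMMAND:
        print("No command executed for tag:", tag)
        return
    subprocess.run(TAG_TO_COMMAND[tag])

run_for_tag("show_files", confidence=0.9)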