sloria / TextBlob

Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
https://textblob.readthedocs.io/
MIT License
9.09k stars 1.13k forks

train the classifier with words instead of sentences #140

Open jnkboy1 opened 7 years ago

jnkboy1 commented 7 years ago

Sir, with due respect, I would like to ask: instead of training the classifier with sentences, is it possible to train it with words? For example:

train = [
    ('rest, motion, length, mass, time, space, inertia, moment, momentum, impulse, torque', 'physics'),
    ('heart, liver, kidney, germ, disease, brain, stomach, leg, palpitation, cardiac, blood', 'physics'),
]

ceased-ebc commented 7 years ago

Interested to learn also.

nakuldahiwade commented 7 years ago

I think this is similar to how sentences are currently used in the classifier: the sentences are tokenized into words, and each word's probability is calculated from its occurrence in the training set. Hence each word already has its probability calculated separately. Something more interesting would be if we could assign prior weighted probabilities to words.
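
To illustrate the point in the comment above, here is a minimal standard-library sketch of a word-presence naive Bayes classifier trained on comma-separated word lists. It is not TextBlob's implementation: the helper names `tokenize`, `train`, and `classify`, the smoothing constant `alpha`, and the `'biology'` label are all assumptions made for illustration (the sketch assumes the second word group in the question was meant to carry a different label from the first).

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Split a comma- or space-separated string into lowercase word tokens."""
    return [w.lower() for w in text.replace(",", " ").split()]


def train(train_set, alpha=1.0):
    """Count, per label, how many training documents contain each word.

    alpha is a hypothetical uniform pseudo-count (Laplace smoothing).
    """
    doc_counts = Counter()              # label -> number of documents
    word_counts = defaultdict(Counter)  # label -> word -> documents containing it
    vocab = set()
    for text, label in train_set:
        words = set(tokenize(text))
        doc_counts[label] += 1
        word_counts[label].update(words)
        vocab |= words
    return doc_counts, word_counts, vocab, alpha


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    doc_counts, word_counts, vocab, alpha = model
    tokens = set(tokenize(text))
    total_docs = sum(doc_counts.values())
    scores = {}
    for label, n_docs in doc_counts.items():
        score = math.log(n_docs / total_docs)  # class prior
        for word in vocab:
            # Smoothed probability that a document of this label contains the word.
            p = (word_counts[label][word] + alpha) / (n_docs + 2 * alpha)
            score += math.log(p if word in tokens else 1.0 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Word lists in the style of the question; 'biology' is an assumed label.
train_set = [
    ("rest, motion, length, mass, time, space, inertia, moment, momentum, impulse, torque", "physics"),
    ("heart, liver, kidney, germ, disease, brain, stomach, leg, palpitation, cardiac, blood", "biology"),
]
model = train(train_set)
print(classify(model, "momentum and torque"))  # physics
```

Because each word is scored independently, a list of words and a sentence containing the same words are treated the same way here. In this sketch the uniform `alpha` is the place where a per-word prior weight could be substituted, which is one possible reading of the suggestion above.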