bwbaugh / infertweet

Infer information from Tweets. Useful for human-centered computing tasks, such as sentiment analysis, location prediction, authorship profiling and more!
http://infertweet.bwbaugh.com/

Use counts of the features' conditional probabilities #12

Open bwbaugh opened 11 years ago

bwbaugh commented 11 years ago

Instead of having the Naive Bayes classifier generate a score by multiplying the conditional probabilities together, I read somewhere (I can't remember where at the moment) that some approaches count the number of features (words) that are highly confident for a particular class. For example:

8 highly positive features minus 3 highly negative features gives a result of +5, so the label is positive.

If the value falls within some range around 0, then the document could be labeled as neutral. We could also consider moderately confident features (or even low-confidence ones), but in any case we would need to determine the thresholds for pigeonholing the conditional probabilities into confidence levels.
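A rough sketch of what this could look like, assuming we already have per-feature conditional probabilities from a trained model. The function name, thresholds, label names, and the `cond_prob` dictionary layout here are all hypothetical, not the classifier's actual API:

```python
HIGH_CONFIDENCE = 0.8  # Hypothetical threshold for a "highly confident" feature.
NEUTRAL_BAND = 2       # Hypothetical +/- range around 0 that maps to neutral.


def count_based_label(features, cond_prob):
    """Label a document by counting highly confident features per class.

    ``features`` is an iterable of tokens (words) from the document.
    ``cond_prob`` maps ``(feature, label)`` to the conditional probability
    of that feature given the label.
    """
    positive = sum(
        1 for f in features
        if cond_prob.get((f, 'positive'), 0.0) >= HIGH_CONFIDENCE
    )
    negative = sum(
        1 for f in features
        if cond_prob.get((f, 'negative'), 0.0) >= HIGH_CONFIDENCE
    )
    score = positive - negative
    # Example from above: 8 highly positive minus 3 highly negative
    # features gives +5, which is outside the neutral band, so positive.
    if abs(score) <= NEUTRAL_BAND:
        return 'neutral', score
    return ('positive' if score > 0 else 'negative'), score
```

The main open questions are where the confidence threshold(s) and the neutral band should sit, which would probably need to be tuned on held-out data.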

Idea was originally recorded 3/12/13 at 23:01.