Sotera / watchman

Watchman: An open-source social-media event-detection system
GNU General Public License v2.0

Improved sentiment clusters #70

Open drJAGartner opened 7 years ago

drJAGartner commented 7 years ago

The current method of scoring sentiment by the sum of the whole paragraph merits more scrutiny. My hope is to create better sentiment clusters by breaking out sentences into multiple terms, with each node being a term and its synonyms. We will also look to create likelihood scores based on the same technique employed by comedian.
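
A minimal sketch of the term-node idea, for discussion only and not the Watchman implementation. It assumes a small hand-written synonym table and sentiment lexicon (`SYNONYMS`, `LEXICON`, and both function names are hypothetical). Each node is a canonical term plus its synonyms, and each node gets its own mean score instead of one sum over the whole paragraph.

```python
from collections import defaultdict

# Hypothetical synonym table: canonical term -> the term and its synonyms.
SYNONYMS = {
    "happy": {"happy", "glad", "joyful"},
    "angry": {"angry", "mad", "furious"},
}

# Hypothetical per-word sentiment lexicon (positive > 0, negative < 0).
LEXICON = {
    "happy": 1.0, "glad": 0.8, "joyful": 0.9,
    "angry": -1.0, "mad": -0.7, "furious": -0.9,
}


def build_term_index(synonyms):
    """Map every synonym to the canonical term of its node."""
    index = {}
    for canonical, words in synonyms.items():
        for word in words:
            index[word] = canonical
    return index


def term_node_scores(text, synonyms=SYNONYMS, lexicon=LEXICON):
    """Return the mean sentiment per term node found in the text."""
    index = build_term_index(synonyms)
    nodes = defaultdict(list)
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'()")  # drop surrounding punctuation
        node = index.get(word)
        if node is not None and word in lexicon:
            nodes[node].append(lexicon[word])
    # Words outside the synonym table are ignored in this sketch.
    return {node: sum(s) / len(s) for node, s in nodes.items()}
```

Calling `term_node_scores("I was glad, then furious and mad!")` yields one score for the `happy` node and a separate one for the `angry` node, where a whole-paragraph sum would collapse both into a single number.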

lukewendling commented 7 years ago

@drJAGartner i'd be interested in reviewing with you the techniques we use to normalize the data, as we did in comedian. for instance, what overhead do we incur by making REST calls to fetch historical data? would we be better served by developing 'aggregation' endpoints that reduce the number of queries to transactional endpoints like socialmediaposts, postsclusters, etc.? maybe not, i'd just like to spend a minute on it.
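
A rough sketch of what such an 'aggregation' step could return, assuming each record carries hypothetical `cluster_id` and `day` fields (the real schema of postsclusters may differ). It collapses many transactional records into one per-day summary, so a client would need a single response rather than one query per record.

```python
from collections import Counter


def aggregate_by_day(records):
    """Collapse individual records into a {day: record_count} summary.

    `records` is an iterable of dicts with a hypothetical 'day' key.
    """
    return dict(Counter(record["day"] for record in records))


# Illustrative records only; field names and values are made up.
sample = [
    {"cluster_id": "a", "day": "2017-01-01"},
    {"cluster_id": "b", "day": "2017-01-01"},
    {"cluster_id": "c", "day": "2017-01-02"},
]
summary = aggregate_by_day(sample)
```

Whether this is worth building depends on the measured overhead of the existing REST calls, which this sketch does not address.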

drJAGartner commented 7 years ago

I'm happy to review; the bulk of this improvement is focused on the internals of the model itself, and the back review is a bit of an afterthought.