marcdotson / counting-cockroaches

Using social media to assess the severity of service failures.
MIT License

Clustering tweets into topic groups #9

Closed AdrielC closed 5 years ago

AdrielC commented 5 years ago

We need a method for assigning a topic or category to each tweet based on its text. We can start with static rules such as substring matching (e.g., the tweet text contains "delay"), but something more generalizable would be better, since people often misspell and shorten words in tweets.
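For illustration, here is a minimal sketch of the two ideas above: a static substring rule, and a slightly more forgiving variant that tolerates misspellings by comparing each word against the keywords with `difflib` from the Python standard library. The topic names, keyword lists, and the 0.8 similarity threshold are all assumptions made up for this example, not something taken from the project.

```python
import re
from difflib import SequenceMatcher

# Hypothetical topics and keywords, chosen only for illustration.
TOPIC_KEYWORDS = {
    "delay": ["delay", "delayed", "late"],
    "baggage": ["baggage", "luggage", "bag"],
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def rule_based_topic(tweet_text):
    """Static rule: return the first topic whose keyword is a substring."""
    text = tweet_text.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


def fuzzy_topic(tweet_text, threshold=0.8):
    """Return the topic of the keyword most similar to any token.

    Similarity is difflib's ratio in [0, 1]; a token must reach the
    (assumed) threshold to count as a match.
    """
    best_topic, best_score = None, threshold
    for token in tokenize(tweet_text):
        for topic, keywords in TOPIC_KEYWORDS.items():
            for keyword in keywords:
                score = SequenceMatcher(None, token, keyword).ratio()
                if score >= best_score:
                    best_topic, best_score = topic, score
    return best_topic


if __name__ == "__main__":
    for tweet in ["Two hour delay, again", "flight dealyed again"]:
        print(tweet, "|", rule_based_topic(tweet), "|", fuzzy_topic(tweet))
```

The substring rule misses "dealyed", while the fuzzy variant still maps it to the delay topic. Neither handles abbreviations or context well, which is the argument for a learned classifier.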

AdrielC commented 5 years ago

I think a good way to go about this would be to use a CNN, as in this notebook: https://github.com/hundredblocks/concrete_NLP_tutorial/blob/master/NLP_notebook.ipynb. Instead of topic assignment, we would do sentence classification.

marcdotson commented 5 years ago

Agreed. Merging this into a new issue.