Sotera / watchman

Watchman: An open-source social-media event-detection system
GNU General Public License v2.0

Create "Topic" service #68

Open drJAGartner opened 7 years ago

drJAGartner commented 7 years ago

When we don't eliminate high-volume persistent hashtags, the result actually serves as a good tool for daily topic modeling, which is how our events are perceived and used at this time. As we move to a more granular event model, it would be good to reproduce this capability as a once-a-day topic modeling service.

lukewendling commented 7 years ago

Idea: create a separate, non-persistent postscluster model. Mongo has an index type called a TTL index that expires records after X seconds, which is good for rolling daily logs. It is similar to capped collections.
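
A minimal sketch of the TTL idea, assuming a pymongo-style collection object. The field name `created`, the helper names, and the document shape are hypothetical and not taken from the watchman codebase; the only library call relied on is `create_index` with the `expireAfterSeconds` option.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 24 * 60 * 60


def ensure_daily_ttl(collection, field="created", ttl_seconds=SECONDS_PER_DAY):
    """Create a TTL index so documents expire ttl_seconds after `field`.

    `collection` is assumed to behave like a pymongo Collection, whose
    create_index accepts expireAfterSeconds as an index option.
    """
    return collection.create_index(field, expireAfterSeconds=ttl_seconds)


def new_cluster_doc(hashtag, post_ids, now=None):
    """Build a hypothetical non-persistent posts-cluster document.

    The TTL field must hold a date value for MongoDB to expire the document.
    """
    now = now or datetime.now(timezone.utc)
    return {"hashtag": hashtag, "post_ids": list(post_ids), "created": now}
```

With pymongo installed, this could be applied as `ensure_daily_ttl(MongoClient()["watchman"]["daily_postsclusters"])`, where the database and collection names are placeholders. Note that MongoDB removes expired documents in a background pass, so records may outlive the TTL by a short interval.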

drJAGartner commented 7 years ago

@justinlueders - Please review branch 68-topic-model. The extra structure associated with creating topic vectors needs to be added.