Open drJAGartner opened 7 years ago
Idea: create a separate non-persistent postscluster model. Mongo has an index type called a TTL index that expires records X seconds after the date stored in the indexed field. Good for rolling daily logs, and similar in purpose to capped collections.
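A rough sketch of what the TTL idea could look like with pymongo, in case it helps the discussion. The field name `created`, the collection name `postscluster_daily`, the database name, and both helper functions are made up for illustration and are not from the codebase. The only real API relied on is `Collection.create_index` with the `expireAfterSeconds` option.

```python
from datetime import datetime, timezone

ONE_DAY_SECONDS = 24 * 60 * 60


def ensure_ttl_index(collection, field="created", ttl_seconds=ONE_DAY_SECONDS):
    """Ask MongoDB to delete documents ttl_seconds after the date in `field`.

    `collection` is expected to be a pymongo Collection (hypothetical helper).
    """
    # expireAfterSeconds is what turns a single-field index into a TTL index.
    return collection.create_index(field, expireAfterSeconds=ttl_seconds)


def new_cluster_doc(payload):
    """Build a document carrying the date field that the TTL index reads."""
    # The indexed field must hold a date value; pymongo stores datetime as one.
    return {"created": datetime.now(timezone.utc), **payload}


# Usage against a live server (assumed names, not executed here):
#   from pymongo import MongoClient
#   coll = MongoClient()["hypothetical_db"]["postscluster_daily"]
#   ensure_ttl_index(coll)
#   coll.insert_one(new_cluster_doc({"hashtag": "example"}))
```

Note that Mongo removes expired documents with a background task that runs about once a minute, so records can outlive the TTL slightly. Unlike a capped collection, this bounds the data by age rather than by size.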
@justinlueders - Please review branch 68-topic-model. The extra structure associated with creating topic vectors needs to be added.
When we don't eliminate high-volume persistent hashtags, it actually serves as a good tool for daily topic modeling, which is how our events are perceived and used at this time. As we move to a more granular event model, it would be good if we could reproduce this as a once-a-day topic modeling service.