uvacw / inca


Create analysis to find highest tf-idf scoring terms per time span #401

Closed damian0604 closed 6 years ago

damian0604 commented 6 years ago

Similar to what's already present in the hype-detector branch, but more bottom-up: instead of specifying a search term, find the highest-scoring terms per time span. For instance, concatenate the texts within each time span and treat each concatenation as a single document (labeled by its date), then compute tf-idf across those documents.

See also https://stackoverflow.com/questions/34232190/scikit-learn-tfidfvectorizer-how-to-get-top-n-terms-with-highest-tf-idf-score
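
A rough sketch of the idea, to make the proposal concrete. This is not INCA code: the function name `top_tfidf_terms_per_span`, the `(span_label, text)` input format, and the example texts are all made up for illustration. It uses only the standard library with a plain `tf * log(N / df)` weighting; scikit-learn's `TfidfVectorizer` (as in the linked question) would probably be the more practical choice in a real analysis and uses a slightly different, smoothed formula.

```python
import math
import re
from collections import Counter, defaultdict


def top_tfidf_terms_per_span(docs, n=3):
    """Hypothetical helper: docs is an iterable of (span_label, text) pairs.

    Returns {span_label: [(term, score), ...]} with the n best terms per span.
    """
    # Concatenate all texts that share a time span into one pseudo-document.
    tokens_per_span = defaultdict(list)
    for span, text in docs:
        tokens_per_span[span].extend(re.findall(r"[a-z]+", text.lower()))

    # Document frequency: in how many time spans does each term occur?
    df = Counter()
    for tokens in tokens_per_span.values():
        df.update(set(tokens))

    n_spans = len(tokens_per_span)
    result = {}
    for span, tokens in tokens_per_span.items():
        tf = Counter(tokens)
        total = len(tokens)
        # Unsmoothed tf-idf: relative term frequency times log(N / df).
        scores = {
            term: (count / total) * math.log(n_spans / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for a stable order.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[span] = ranked[:n]
    return result


# Made-up example data: texts labeled by month.
example_docs = [
    ("2018-01", "election debate election poll"),
    ("2018-01", "election results"),
    ("2018-02", "storm flood storm warning"),
    ("2018-02", "debate about flood damage"),
]

for span, terms in top_tfidf_terms_per_span(example_docs, n=2).items():
    print(span, [term for term, _ in terms])
```

Terms that occur in every time span (here "debate") get a score of zero with this weighting, so only terms specific to a span rise to the top, which is the bottom-up behaviour described above.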

mariekevh commented 6 years ago

Created in PR #418