BIDS-projects / topic-modeling

Categorization of various data science institutions into several different topics
Apache License 2.0

Too many duplicate topics #29

Open chewisinho opened 8 years ago

chewisinho commented 8 years ago

We're at the point now where there are too many duplicate topic words.

I played around with unigrams plus bigrams (an n-gram range of 1-2), and the results weren't bad. I think trigrams are a bad idea right now because they encourage duplicates even more.
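To make the trigram concern concrete, here is a minimal sketch in plain Python. It is not the project's actual vectorizer; the `ngrams` and `terms_containing` helpers and the sample phrase are made up for illustration. It counts how many candidate terms contain the same word under each n-gram range.

```python
def ngrams(tokens, n_min, n_max):
    """Return all contiguous n-grams of `tokens` with n_min <= n <= n_max."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def terms_containing(word, terms):
    """Return the terms that include `word` as one of their tokens."""
    return [t for t in terms if word in t.split()]


# Hypothetical sample phrase, used only for illustration.
tokens = "berkeley institute for data science".split()

one_two = ngrams(tokens, 1, 2)    # unigrams + bigrams
one_three = ngrams(tokens, 1, 3)  # unigrams + bigrams + trigrams

print(len(one_two), terms_containing("data", one_two))
print(len(one_three), terms_containing("data", one_three))
```

In this toy case, going from 1-2 to 1-3 adds more vocabulary entries that all contain "data", so more near-identical terms compete for the top slots in a topic.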

Thoughts on how to fix?
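One possible direction, as a rough sketch only: post-filter each topic's ranked terms and drop any term that shares a word with a term already kept. The `dedupe_terms` helper and the sample list below are hypothetical, and this heuristic is deliberately aggressive (it would also collapse distinct phrases such as "data science" and "data mining"), so it would need tuning.

```python
def dedupe_terms(ranked_terms):
    """Greedily keep terms, skipping any that share a word with a kept term.

    `ranked_terms` is assumed to be ordered from most to least important.
    """
    kept = []
    seen_words = set()
    for term in ranked_terms:
        words = set(term.split())
        if words & seen_words:
            # Overlaps with a higher-ranked term, so treat it as a duplicate.
            continue
        kept.append(term)
        seen_words |= words
    return kept


# Hypothetical top terms for one topic, highest weight first.
topic_terms = [
    "data science", "data", "science",
    "machine learning", "learning", "research",
]
print(dedupe_terms(topic_terms))
```

Because the filter is greedy, the result depends on the ranking: whichever overlapping term is ranked highest is the one that survives.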