gregversteeg / corex_topic

Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx
Apache License 2.0

Providing exclusion words for topics #52

Open rogerci91 opened 2 years ago

rogerci91 commented 2 years ago

Hi, first of all, thank you for the great work! I recently came across CorEx and found it much better than traditional topic modelling approaches like LDA, especially the anchor words, which make it possible to produce more meaningful and reasonable topics based on domain knowledge. Now I'm thinking of the opposite case: say that, based on domain knowledge, I know certain words shouldn't belong to a certain topic. It would be good to be able to provide some so-called "exclusion words" for a topic, on top of the anchor words. Do you think this feature could be added to the model? Thank you!