vi3k6i5 / GuidedLDA

semi supervised guided topic model with custom guidedLDA
Mozilla Public License 2.0
497 stars 109 forks

Am I right that we can't seed multiple topics with the same word? #18

Open MaxKHK opened 6 years ago

MaxKHK commented 6 years ago

Hi!

Am I right that this implementation does not allow seeding several topics with the same word? Is there a theoretical reason for this limitation? If not, could you please include this functionality in a future release?

To justify: imagine you have a document corpus with a set of topics, three of which are 'Aircraft equipment', 'Military equipment', and 'Aircraft models'. You want the first two topics to get a boost from the word 'equipment' in the text, so that they are not confused with something else, for example so that topic one is not confused with topic three.

lhorvath commented 5 years ago

Any updates on this? We were wondering about the same thing. In practice, the model runs without errors when a seed word is repeated across topics, but it may be running despite the repeated seed words rather than actually using them.
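
One possible explanation, sketched under the assumption that the seeds are built the way the project README shows (a dict mapping each word id to a single topic id, which is then passed as `seed_topics` to `fit`): a repeated word would not raise an error, it would simply overwrite the earlier assignment, so only the last topic that lists the word ends up seeded with it. The vocabulary, seed lists, and helper names below are hypothetical, and the snippet does not call GuidedLDA itself.

```python
# Hypothetical toy vocabulary and seed lists, for illustration only.
vocab = ["aircraft", "equipment", "military", "model", "engine"]
word2id = {word: idx for idx, word in enumerate(vocab)}

seed_topic_list = [
    ["aircraft", "equipment"],  # topic 0: 'Aircraft equipment'
    ["military", "equipment"],  # topic 1: 'Military equipment'
    ["aircraft", "model"],      # topic 2: 'Aircraft models'
]


def build_seed_topics(seed_topic_list, word2id):
    """Build a word-id -> topic-id dict (assumed README-style seeding).

    A dict holds one value per key, so a word listed under several
    topics keeps only the last topic id assigned to it.
    """
    seed_topics = {}
    for topic_id, words in enumerate(seed_topic_list):
        for word in words:
            seed_topics[word2id[word]] = topic_id
    return seed_topics


def find_shared_seeds(seed_topic_list):
    """Return the seed words that appear under more than one topic."""
    topics_by_word = {}
    for topic_id, words in enumerate(seed_topic_list):
        for word in words:
            topics_by_word.setdefault(word, []).append(topic_id)
    return {w: ids for w, ids in topics_by_word.items() if len(ids) > 1}


seed_topics = build_seed_topics(seed_topic_list, word2id)
shared = find_shared_seeds(seed_topic_list)

print(seed_topics)  # {0: 2, 1: 1, 2: 1, 3: 2}
print(shared)       # {'aircraft': [0, 2], 'equipment': [0, 1]}
```

Here 'equipment' (id 1) ends up mapped only to topic 1, so under this assumption topic 0 would get no boost from it. Running a check like `find_shared_seeds` before fitting at least makes the silent overwrite visible.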