koheiw / seededlda

LDA for semisupervised topic modeling
https://koheiw.github.io/seededlda/

Argument uniform #64

Closed: Jenny060399 closed this issue 1 year ago

Jenny060399 commented 1 year ago

Dear Mr. Watanabe,

Could you please explain what the argument "uniform" means? The documentation says that it should be set to false to make the total amount of seed words the same in all topics. Does this mean that, in this case, the weight of the seed words is determined independently of their term frequency in the corpus (i.e. unlike the formula in your paper)? Or are the seed words in each seed topic given the same weight in this case, i.e. the same pseudo-counts are added for each topic? But then the seed words would no longer help to distinguish the different seed topics, right?

I would be very grateful for an answer! Many thanks in advance!

koheiw commented 1 year ago

Please do not use the argument because I am planning to remove it in the next version. I added it for research purposes, but it overcomplicates the weighting of seed words. I admit that I should have removed the argument before v1.0!

koheiw commented 1 year ago

uniform was removed in #65.