Open srimalj opened 8 years ago
Dear Srimal, once the model is trained, you can compute the topic proportions by running the e-step function on the new documents:

    # get list of word ids and counts
    unseen_docs = create_doc_count_lists(unseen_data)
    # compute the topic proportions
    gamma, stats = ldainstance.e_step(unseen_docs)
    theta = gamma.T / gamma.sum(1)

In this case theta is a K x D matrix (where K is the number of topics and D the number of unseen documents).
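To make the shapes concrete, here is a minimal NumPy sketch of just the normalization step. It assumes gamma is a D x K array of positive values (one row per document), which is what the K x D shape of theta suggests; the numbers are made up for illustration and do not come from the library.

```python
import numpy as np

# Made-up variational parameters for D = 3 documents and K = 2 topics.
# Assumption: in practice gamma would come from ldainstance.e_step(unseen_docs).
gamma = np.array([[2.0, 6.0],
                  [5.0, 5.0],
                  [9.0, 1.0]])

# Same normalization as above: gamma.T is K x D and gamma.sum(1) has length D,
# so broadcasting divides each document's column by that document's total.
theta = gamma.T / gamma.sum(1)

# Equivalent D x K layout, with one row of topic proportions per document.
theta_rows = gamma / gamma.sum(axis=1, keepdims=True)

print(theta.shape)        # (K, D)
print(theta.sum(axis=0))  # each document's proportions sum to 1
```

If you prefer one row per document, as in the (n_samples, n_components) output of scikit-learn's transform(), theta_rows (or simply theta.T) gives that layout.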
Hope this helps. Mirwaes
Thank you.
Hi Mirwaes
I’m using the Python code on GitHub: scLDA - Fast variational Bayes inference for Latent Dirichlet Allocation
I am fairly new to topic models and am trying to figure out which method or attributes I could use to get the topic proportions for a given document x (say, a new unseen document) once the LDA model is trained?
Basically I would like to do something similar to the transform() method in the scikit implementation at http://scikit-learn.org/dev/modules/generated/sklearn.decomposition.LatentDirichletAllocation.html#sklearn.decomposition.LatentDirichletAllocation.transform
Any pointers would be much appreciated.
Thanks.
Srimal.