dongwookim-ml / python-topic-model

Implementation of various topic models
Apache License 2.0
369 stars, 172 forks

In the CTM code, what are the different attributes? #5

Closed mbahacker closed 8 years ago

mbahacker commented 8 years ago

doc_cnt: what is it, and what is its structure? doc_ids: is it a 1-D array of all the doc ids? n_voca: what is it?

n_topics: I believe items are the actual documents. So if I have 1000 documents, n_topics will be equal to 1000.

dongwookim-ml commented 8 years ago

doc_cnt and doc_ids have the same structure as in the online LDA code by Matthew D. Hoffman (they correspond to wordcts and wordids in his code): https://github.com/blei-lab/onlineldavb/blob/master/onlineldavb.py

n_voca is the size of the vocabulary.

n_topics is the number of topics, as generally used in topic models. It is not the number of documents.
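
For illustration, here is a minimal sketch of what that structure looks like, assuming the wordids/wordcts layout from the online LDA code: one list per document, where doc_ids[d] holds the vocabulary indices of the unique words in document d and doc_cnt[d] holds the matching counts. The helper `build_doc_arrays`, the toy vocabulary, and the choice to sort the ids are assumptions made for this example, not part of the repository.

```python
from collections import Counter


def build_doc_arrays(docs, vocab):
    """Hypothetical helper: turn raw documents into (doc_ids, doc_cnt).

    doc_ids[d] -> vocabulary indices of the unique words in document d
    doc_cnt[d] -> how often each of those words occurs in document d
    Words that are not in the vocabulary are skipped.
    """
    word_to_id = {word: idx for idx, word in enumerate(vocab)}
    doc_ids, doc_cnt = [], []
    for doc in docs:
        counts = Counter(word_to_id[w] for w in doc.split() if w in word_to_id)
        ids = sorted(counts)  # sorting is a choice made here for readability
        doc_ids.append(ids)
        doc_cnt.append([counts[i] for i in ids])
    return doc_ids, doc_cnt


# Toy data, assumed for illustration only.
vocab = ["topic", "model", "word", "document"]
docs = ["topic model topic", "word document word word", "model"]

doc_ids, doc_cnt = build_doc_arrays(docs, vocab)

n_voca = len(vocab)  # size of the vocabulary
n_doc = len(docs)    # number of documents
n_topics = 2         # chosen by the user, independent of n_doc

print(doc_ids)  # [[0, 1], [2, 3], [1]]
print(doc_cnt)  # [[2, 1], [3, 1], [1]]
```

So doc_ids is a list of per-document lists rather than a single 1-D array of document ids, and with 1000 documents it would have 1000 entries while n_topics stays whatever value you pick (for example 10 or 50).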