maximtrp / bitermplus

Biterm Topic Model (BTM): modeling topics in short texts
https://bitermplus.readthedocs.io/en/stable/
MIT License

Using `biterm.perplexity()` for Calculating Perplexity of Other Topic Models #33

Closed Zay-Ben closed 1 year ago

Zay-Ben commented 1 year ago

From my understanding, biterm.perplexity() takes three inputs: p_wz, the topics vs. words probabilities matrix (T x W); p_zd, the documents vs. topics probabilities matrix (D x T); and T, the number of topics. Other topic models often produce these same matrices as output.

May I ask if it is possible to use biterm.perplexity() to calculate the perplexity by Heinrich (2005) of other topic models?

Thank you!

maximtrp commented 1 year ago

Hello! Yes, it is possible to use it with other models. Just be careful with the shapes of these matrices. We are using this method with the tomotopy LDA model in our lab.
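To make the point about shapes concrete, here is a minimal NumPy sketch of a Heinrich (2005) style perplexity computed from any model's matrices. It is not the library's implementation: the function name `heinrich_perplexity` is hypothetical, and the document-word count matrix `n_dw` (D x W) is an assumption not mentioned above, included because the formula needs the observed word counts per document.

```python
import numpy as np


def heinrich_perplexity(p_wz, p_zd, n_dw):
    """Perplexity = exp(-sum_d log p(w_d) / sum_d N_d).

    Hypothetical helper, not part of bitermplus.
    p_wz: topics vs. words probabilities (T x W)
    p_zd: documents vs. topics probabilities (D x T)
    n_dw: documents vs. words counts (D x W), an assumed input
    """
    p_wz = np.asarray(p_wz, dtype=float)
    p_zd = np.asarray(p_zd, dtype=float)
    n_dw = np.asarray(n_dw, dtype=float)

    # Fail early on the most common mistake: a transposed matrix.
    T, W = p_wz.shape
    D = p_zd.shape[0]
    if p_zd.shape != (D, T):
        raise ValueError(f"p_zd must be D x T with T={T}, got {p_zd.shape}")
    if n_dw.shape != (D, W):
        raise ValueError(f"n_dw must be ({D}, {W}), got {n_dw.shape}")

    # p(w | d) = sum_z p(w | z) * p(z | d), a D x W matrix.
    p_wd = p_zd @ p_wz

    # Only words that actually occur in a document contribute.
    mask = n_dw > 0
    log_likelihood = np.sum(n_dw[mask] * np.log(p_wd[mask]))
    return float(np.exp(-log_likelihood / n_dw.sum()))


# Toy check: a model that is uniform over 4 words has perplexity of about 4.
p_wz = np.full((2, 4), 0.25)
p_zd = np.array([[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]])
n_dw = np.array([[1, 0, 2, 0], [0, 3, 0, 1], [1, 1, 1, 1]])
print(heinrich_perplexity(p_wz, p_zd, n_dw))
```

If another library returns its matrices in the transposed layout (for example words x topics), transpose them before passing them in; the shape checks above are there to catch that case.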