blei-lab / lda-c

This is a C implementation of variational EM for latent Dirichlet allocation (LDA), a topic model for text or other discrete data.
GNU Lesser General Public License v2.1

How to compute perplexity? #9

Open jagdeeppani opened 5 years ago

jagdeeppani commented 5 years ago

Hi, I am trying to estimate perplexity on a test set (an unseen set of documents). After running the inference step on the test set, the likelihood file contains large negative numbers (e.g. -1800). What exactly are these numbers? If they are log-likelihood estimates, should we compute the perplexity by simply taking the exponential of the average of these values?

Looking forward to an answer. Thanks, Jagdeep