joewandy / hlda

Gibbs sampler for the Hierarchical Latent Dirichlet Allocation topic model
GNU General Public License v3.0

Hierarchical Latent Dirichlet Allocation

Note: this repository should only be used for educational purposes. For production use, I'd recommend https://github.com/bab2min/tomotopy, which is more production-ready.


Hierarchical Latent Dirichlet Allocation (hLDA) addresses the problem of learning topic hierarchies from data. The model relies on a non-parametric prior called the nested Chinese restaurant process, which allows for arbitrarily large branching factors and readily accommodates growing data collections. The hLDA model combines this prior with a likelihood that is based on a hierarchical variant of latent Dirichlet allocation.

Hierarchical Topic Models and the Nested Chinese Restaurant Process

The Nested Chinese Restaurant Process and Bayesian Nonparametric Inference of Topic Hierarchies
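
To make the nested Chinese restaurant process prior described above more concrete, here is a minimal sketch of how a document's path through the topic tree could be drawn from the prior alone. It is not the Gibbs sampler in this repository: the names `Node`, `sample_path`, `depth`, and `gamma` are hypothetical and chosen for illustration, and the tree depth is assumed to be fixed. At each level, an existing child is picked in proportion to how many documents already passed through it, and a new child is created with weight `gamma`, which is what allows the branching factor to grow with the data.

```python
import random


class Node:
    """A node (topic) in the nCRP tree. Hypothetical helper for this sketch."""

    def __init__(self):
        self.customers = 0   # number of documents whose path passes through this node
        self.children = []   # child nodes one level deeper


def sample_path(root, depth, gamma, rng):
    """Draw one root-to-leaf path with `depth` nodes from the nCRP prior.

    Assumes gamma > 0 and depth >= 1. Updates the customer counts in place.
    """
    node = root
    node.customers += 1
    path = [node]
    for _ in range(depth - 1):
        # Existing child k has weight n_k (documents already routed through it);
        # a brand-new child has weight gamma.
        weights = [child.customers for child in node.children] + [gamma]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(node.children):
            node.children.append(Node())  # open a new branch
        node = node.children[choice]
        node.customers += 1
        path.append(node)
    return path


# Usage: route 20 documents through a tree of depth 3.
rng = random.Random(0)
root = Node()
paths = [sample_path(root, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
print("children of root:", len(root.children))
```

Larger values of `gamma` make new branches more likely, so the tree becomes wider; smaller values concentrate documents on a few shared paths. In the full model these prior weights would be combined with the likelihood of each document's words, which this sketch leaves out.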

Implementation

Installation