mimno / Mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.

Hyperparameter optimization when training Labeled LDA? #194

Open jonaschn opened 3 years ago

jonaschn commented 3 years ago

Labeled LDA does not currently support optimization of the alpha and beta hyperparameters. How could this be implemented?

mimno commented 3 years ago

Good question. In Labeled LDA each label corresponds to a topic, and a document's labels act as a "mask" that zeroes out every topic not among them, so there is no single Dirichlet distribution shared by all documents that we could optimize. Maybe some kind of zero-inflated model, like https://www.birs.ca/workshops/2019/19w5128/files/slides_ZIGDM.pdf? I'd be open to pull requests.
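
To make the masking point concrete, here is a minimal sketch in plain Java. It is not MALLET code: the class `MaskedDirichletSketch` and its methods are hypothetical names invented for illustration, and the alpha values are made up. It assumes one topic per label and shows that two documents with different label sets see different effective Dirichlet parameters, even though they share one underlying alpha vector.

```java
import java.util.Arrays;

// Hypothetical illustration only; this class does not exist in MALLET.
final class MaskedDirichletSketch {

    // Returns the Dirichlet parameters a document actually uses:
    // alpha[k] for topics whose label is present, 0 for masked-out topics.
    static double[] effectiveAlpha(double[] alpha, boolean[] labelMask) {
        if (alpha.length != labelMask.length) {
            throw new IllegalArgumentException("alpha and mask must have equal length");
        }
        double[] masked = new double[alpha.length];
        for (int k = 0; k < alpha.length; k++) {
            masked[k] = labelMask[k] ? alpha[k] : 0.0;
        }
        return masked;
    }

    // Prior mean of the document's topic proportions under the masked
    // Dirichlet: alpha[k] divided by the sum of alpha over allowed topics.
    static double[] priorMean(double[] alpha, boolean[] labelMask) {
        double[] masked = effectiveAlpha(alpha, labelMask);
        double sum = Arrays.stream(masked).sum();
        if (sum <= 0.0) {
            throw new IllegalArgumentException("document has no allowed topics");
        }
        return Arrays.stream(masked).map(a -> a / sum).toArray();
    }
}
```

For example, with `alpha = {0.5, 0.5, 1.0, 2.0}`, a document labeled with topics 0 and 1 has prior mean `{0.5, 0.5, 0, 0}`, while a document labeled with topics 2 and 3 has prior mean `{0, 0, 1/3, 2/3}`. Any optimizer for alpha would have to account for each document's topic proportions being drawn over a different support, which the standard single-Dirichlet update does not do.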