mimno / Mallet

MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
https://mimno.github.io/Mallet/

Hyperparameter optimization when training Labeled LDA? #194

Open jonaschn opened 3 years ago

jonaschn commented 3 years ago

Labeled LDA does not support alpha and beta hyperparameter optimization at this time. How can it be implemented?

mimno commented 3 years ago

Good question. In Labeled LDA each label corresponds to a topic, and a document's labels act as a "mask" that zeroes out every other topic, so there's no single Dirichlet distribution shared across all documents that we could optimize. Maybe some kind of zero-inflated model like https://www.birs.ca/workshops/2019/19w5128/files/slides_ZIGDM.pdf ? I'd be open to pull requests.
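
To make the masking issue concrete, here is a rough Python sketch (not MALLET code, and not the zero-inflated model from the slides). It assumes a hypothetical workaround: keep one global alpha vector, restrict it per document to the topics that document's labels allow, and run a Minka-style fixed-point step in which each document contributes only to its allowed topics. The function names and the `(labels, counts)` data layout are made up for illustration.

```python
def digamma_diff(alpha, n):
    """psi(alpha + n) - psi(alpha), exact for a non-negative integer n."""
    return sum(1.0 / (alpha + i) for i in range(n))


def masked_alpha(alpha, labels):
    """Restrict the global alpha vector to the topics a document's labels allow."""
    return {k: alpha[k] for k in labels}


def masked_fixed_point_step(alpha, docs):
    """One Minka-style fixed-point update over label-masked documents.

    docs is a list of (labels, counts) pairs: labels is a set of allowed
    topic indices, counts maps topic index -> tokens assigned to that topic.
    """
    numer = [0.0] * len(alpha)
    denom = [0.0] * len(alpha)
    for labels, counts in docs:
        # The Dirichlet normaliser differs per document because each
        # document sees a different subset of topics.
        alpha_sum = sum(alpha[k] for k in labels)
        doc_len = sum(counts.get(k, 0) for k in labels)
        doc_term = digamma_diff(alpha_sum, doc_len)
        for k in labels:
            numer[k] += digamma_diff(alpha[k], counts.get(k, 0))
            denom[k] += doc_term
    # Assumption: topics with no usable statistics keep their old value
    # instead of collapsing to zero.
    return [a * numer[k] / denom[k] if numer[k] > 0 and denom[k] > 0 else a
            for k, a in enumerate(alpha)]


if __name__ == "__main__":
    alpha = [0.5, 0.5, 0.5]
    docs = [({0, 1}, {0: 3, 1: 1}), ({1, 2}, {1: 2, 2: 2})]
    print(masked_alpha(alpha, {0, 1}))
    print(masked_fixed_point_step(alpha, docs))
```

Whether this heuristic behaves well in practice is untested here. It only shows that every document ends up with its own effective Dirichlet, which is why the usual single-Dirichlet optimization does not carry over directly.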