Description
For evaluating and understanding an LDA topic model, phi and theta are essential. The documentation really should explain how these can be acquired.
Link to Documentation Page
https://github.com/VowpalWabbit/vowpal_wabbit/wiki/Latent-Dirichlet-Allocation
Besides explaining how phi and theta can be acquired or approximated, the documentation could better describe what the numbers in the human-readable model represent. For now it says "columns 2-n represent the per-word topic distributions". I think that to be useful these numbers need to be normalised, I guess so that each row sums to 1, in which case each number represents p(t|w), i.e. the probability of the topic given the word.
Another statistic one needs in order to understand a topic model is p(w|t), i.e. the probability of the term given the topic. I believe this is phi.
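To make the two normalisations concrete, here is a rough sketch of what I have in mind. It assumes the topic columns of the readable model have already been parsed into a non-negative matrix with one row per word and one column per topic; the `weights` values are made up for illustration, and I am only guessing that the results correspond to p(t|w) and p(w|t).

```python
import numpy as np

# Hypothetical toy data standing in for the topic columns of the readable model:
# one row per word, one column per topic. These values are invented.
weights = np.array([
    [0.1, 2.9],
    [4.0, 1.0],
    [0.9, 0.1],
])

def normalize_rows(m):
    """Scale each row to sum to 1 (my guess at p(t|w))."""
    return m / m.sum(axis=1, keepdims=True)

def normalize_columns(m):
    """Scale each column to sum to 1 (my guess at p(w|t), i.e. phi)."""
    return m / m.sum(axis=0, keepdims=True)

p_topic_given_word = normalize_rows(weights)
p_word_given_topic = normalize_columns(weights)

print(p_topic_given_word)
print(p_word_given_topic)
```

If the documentation confirmed (or corrected) which of these normalisations is valid, that would already answer most of this issue.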
The Stack Overflow question at https://stackoverflow.com/questions/65727712/can-ldavis-analyse-the-results-of-vowpal-wabbit-lda is essentially about this problem too.