inpho / topic-explorer

System for building, visualizing, and working with LDA topic models
https://www.hypershelf.org/

Changing Topic Distribution #321

Closed oguzhanalasehir closed 6 years ago

oguzhanalasehir commented 6 years ago

Hello,

What is the reason for producing different keyword distributions over topics even though I use the same data source and apply exactly the same steps? In other words, each time I run topicexplorer under the same conditions, the keywords assigned to each topic change.

Best regards,

JaimieMurdock commented 6 years ago

Hi,

The underlying reason is the random process involved in Gibbs sampling, which generates the word-topic assignments at each iteration.

To ensure the same results from each run, you will also need to freeze the random seed. This can be done with the --seed flag, which takes an integer as a parameter. If you use the same seed, each run should produce the same results. For example, all with the -q quiet flag for automating:

topicexplorer init corpus -q
topicexplorer prep corpus --high-percent 30 --low-percent 20 -q
topicexplorer train corpus --seed 894721 --iter 200 -k 20 40 60 -q
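As a toy illustration of why freezing the seed gives identical results (this is a minimal sketch of seeded random draws, not topicexplorer's actual Gibbs sampler):

```python
import random

def sample_topic_assignments(n_words, n_topics, seed):
    # Each Gibbs sweep draws a topic for every word token at random;
    # seeding the generator makes the whole draw sequence reproducible.
    rng = random.Random(seed)
    return [rng.randrange(n_topics) for _ in range(n_words)]

# Same seed -> identical word-topic assignments across runs
run_a = sample_topic_assignments(n_words=100, n_topics=20, seed=894721)
run_b = sample_topic_assignments(n_words=100, n_topics=20, seed=894721)
assert run_a == run_b

# A different seed generally yields different assignments,
# which is why unseeded runs produce different topic keywords.
run_c = sample_topic_assignments(n_words=100, n_topics=20, seed=12345)
```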

The variance in topic models remains an open area of research, and I can dig up a few relevant papers on request. First, though, I wanted to make sure the inconsistency you saw in topicexplorer was addressed!