Closed: wanggaohang closed this issue 8 years ago
@wanggaohang Sorry for the ambiguous output file names. The dumped model is what is stored in the parameter server. Table 0 is the word-topic table, and table 1 is the summary row: a [# of topics]-dimensional vector containing the occurrence count of each topic.
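To make the relationship between the two tables concrete, here is a minimal sketch of how the word-topic counts (table 0) and the summary row (table 1) combine into the standard smoothed topic-word estimate phi[w][k] = (n_wk + beta) / (n_k + V*beta). The sparse line format assumed here (a row id followed by `topic:count` pairs) is a guess for illustration, not the documented dump format, and `parse_sparse_row` / `topic_word_prob` are hypothetical helpers:

```python
def parse_sparse_row(line):
    """Parse a guessed dump line 'row_id t1:c1 t2:c2 ...'
    into (row_id, {topic: count})."""
    parts = line.split()
    row_id = int(parts[0])
    counts = {}
    for pair in parts[1:]:
        t, c = pair.split(":")
        counts[int(t)] = int(c)
    return row_id, counts

def topic_word_prob(word_counts, summary, beta, vocab_size):
    """Smoothed estimate phi[k] = (n_wk + beta) / (n_k + V * beta),
    where summary[k] (table 1) is the total count n_k of topic k."""
    return {k: (word_counts.get(k, 0) + beta) / (n_k + vocab_size * beta)
            for k, n_k in enumerate(summary)}

# Toy example: one word's counts over 2 topics, vocabulary of 3 words.
word_id, counts = parse_sparse_row("7 0:4 1:1")
summary = [10, 5]  # table 1: per-topic occurrence counts
phi = topic_word_prob(counts, summary, beta=0.01, vocab_size=3)
```

This is why table 1 has only one line: it is a single vector of per-topic totals, used as the normalizer for every row of table 0.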
As for the doc-topic distribution, sorry, the current version doesn't provide it. Our usual workflow is to train the LDA model to get the word-topic table, and then use that model to infer topics for other documents in downstream applications.
It would be easy to output the doc-topic distribution if users find it useful. Contributions are also warmly welcome.
Indeed, I think outputting both the doc-topic distribution and the topic-word distribution would be helpful.
@wanggaohang @boche
LightLDA can now dump doc-topic statistics when training finishes. Thanks.
@feiga For the word-topic table, if the value is bigger, does that mean the word has a higher weight for that topic?
I see the output files are server_0_table_0.model and server_0_table_1.model. server_0_table_0.model is the distribution of topics over terms, but server_0_table_1.model has only one line. Could I get the distribution of topics for every doc id?