When LightLDA dumps a binary file using a .dict built from only part of the corpus, the parameter word_num can be smaller than the true maximum word ID in the corpus. As a result, dump_binary does not write the words whose ID is greater than word_num to the output file. The skipped words are probably treated as topic 0, which causes issue56.
I changed the following code in my local environment; it works fine and resolves the issue.
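To illustrate the idea only (this is a minimal sketch, not the actual patch and not LightLDA code): one way to avoid dropping words is to size the vocabulary from the larger of the dict-derived word_num and the largest word ID actually seen in the corpus. The helper name `effective_word_num` and its parameters are hypothetical, and zero-based word IDs are assumed.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of LightLDA.
// dict_word_num:   vocabulary size derived from the .dict file
//                  (may be too small if the dict covers only part of the corpus).
// corpus_word_ids: every word ID encountered while reading the corpus.
// Returns a vocabulary size large enough to cover all IDs in the corpus.
int32_t effective_word_num(int32_t dict_word_num,
                           const std::vector<int32_t>& corpus_word_ids) {
    int32_t max_id = -1;  // stays -1 when the corpus is empty
    for (int32_t id : corpus_word_ids) {
        max_id = std::max(max_id, id);
    }
    // Assumption: IDs are zero-based, so covering max_id needs max_id + 1 slots.
    return std::max(dict_word_num, max_id + 1);
}
```

With a dict reporting 3 words and a corpus containing the IDs 0, 1 and 5, this returns 6, so word 5 would no longer fall outside the range that gets written.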