Closed by sashafrey 10 years ago
Fixed by https://github.com/sashafrey/topicmod/commit/2b1cd2062a4c30b4f9a87298d5ac394b576d912f. The problem was that the input docword file contained some tokens with 0 occurrences. BigARTM now handles this robustly: tokens with 0 occurrences are ignored during parsing, and for old batches they are ignored in Processor and in the Perplexity calculation.
The script can be downloaded from here: https://drive.google.com/folderview?id=0BywMvWOrZXR3M3V1aUNseUdnZGc&usp=gmail
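For anyone hitting the same issue with their own data, the idea behind the parsing part of the fix can be sketched in Python as follows. This is only an illustration, not the code from the commit: it assumes the docword file follows the UCI bag-of-words layout (three header lines, then `docID wordID count` triples), and the function name `parse_docword` and the sample data are made up for the example.

```python
import io


def parse_docword(lines):
    """Parse docword lines (assumed UCI bag-of-words layout).

    Hypothetical helper for illustration; entries whose count is 0
    (or negative) are skipped instead of being stored.
    """
    it = iter(lines)
    # Header: number of documents, vocabulary size, number of entries.
    # Read and discarded here; a real parser might validate against them.
    for _ in range(3):
        int(next(it))

    docs = {}      # doc_id -> {word_id: count}
    skipped = 0    # how many zero-occurrence entries were ignored
    for line in it:
        parts = line.split()
        if not parts:
            continue  # tolerate blank lines
        doc_id, word_id, count = (int(p) for p in parts)
        if count <= 0:
            skipped += 1
            continue
        docs.setdefault(doc_id, {})[word_id] = count
    return docs, skipped


# Made-up sample: the entry "1 2 0" has 0 occurrences and is dropped.
sample = """2
3
4
1 1 2
1 2 0
2 2 1
2 3 5
"""
docs, skipped = parse_docword(io.StringIO(sample))
print(docs, skipped)
```

Counting the skipped entries rather than dropping them silently makes it easier to spot a malformed input file early.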