kzhai / InfVocLDA

Online Latent Dirichlet Allocation with Infinite Vocabulary using Variational Inference
https://github.com/kzhai/InfVocLDA
Apache License 2.0
74 stars, 19 forks

AttributeError: 'FreqDist' object has no attribute 'inc' #1

Closed by andyyuan78 9 years ago

andyyuan78 commented 9 years ago

ubgpu@ubgpu:~/github/InfVocLDA/src$ python -m fixvoc.launch --input_directory=../input/ --output_directory=../output/ --corpus_name=20-news --number_of_topics=10 --number_of_documents=18600 --batch_size=100
successfully load all training documents...
successfully load all the words from ../input/20-news/voc.dat...
========== ========== ========== ========== ==========
output_directory=../output/20-news/15Jun17-223315-fixvoc-D18600-K10-I10-B100-O186-t64-k0.6-at0.1-ae1.22546e-05-False-False/
input_directory=../input/20-news
corpus_name=20-news
dictionary_file=../input/20-news/voc.dat
number_of_documents=18600
number_of_topics=10
snapshot_interval=10
batch_size=100
online_iterations=186
tau=64.0
kappa=0.6
alpha_theta=0.1
alpha_eta=1.22546016029e-05
hybrid_mode=False
hash_oov_words=False
========== ========== ========== ========== ==========
Traceback (most recent call last):
  File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/ubgpu/github/InfVocLDA/src/fixvoc/launch.py", line 222, in <module>
    main()
  File "/home/ubgpu/github/InfVocLDA/src/fixvoc/launch.py", line 189, in main
    olda.export_beta(os.path.join(output_directory, 'exp_beta-0'), 50);
  File "fixvoc/inferencer.py", line 89, in export_beta
    freqdist.inc(word, self._exp_E_log_beta[k, self._vocab[word]]);
AttributeError: 'FreqDist' object has no attribute 'inc'
ubgpu@ubgpu:~/github/InfVocLDA/src$

kzhai commented 9 years ago

Hi, Andy,

The model was originally implemented with NLTK 2.x. NLTK 3.0 changed the FreqDist interface and removed the inc() method. As a quick fix, you can either downgrade NLTK to a 2.x version, or go through the code and change every call of "freqdist.inc(sample, count)" to "freqdist[sample] += count".

If you do the latter, please create a pull request, and I will merge it in.

Best, Ke
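The replacement described above can be sketched roughly as follows. This is a minimal illustration, not code from the repository: the `weights` dictionary and the `top_words` variable are hypothetical stand-ins for the per-topic word weights that `export_beta` accumulates, and the fallback to `collections.Counter` is only there so the snippet runs without NLTK installed (in NLTK 3.x, `FreqDist` is a `Counter` subclass).

```python
try:
    # NLTK 3.x: FreqDist subclasses collections.Counter and has no inc() method.
    from nltk.probability import FreqDist
except ImportError:
    # Assumption for illustration only: Counter supports the same item
    # assignment and most_common() used below.
    from collections import Counter as FreqDist

# Hypothetical word weights standing in for one topic's beta values.
weights = {"topic": 0.5, "model": 0.25, "word": 0.125}

freqdist = FreqDist()
for word, weight in weights.items():
    # NLTK 2.x style was: freqdist.inc(word, weight)
    freqdist[word] += weight

# Adding to an existing sample accumulates, as inc() did.
freqdist["topic"] += 0.25

# The two highest-weighted words, highest first.
top_words = [word for word, _ in freqdist.most_common(2)]
print(top_words)
```

Missing samples count as zero, so `+=` works on a word that has not been seen yet.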

andyyuan78 commented 9 years ago

It works. I will open a PR later.

andyyuan78 commented 9 years ago

To skip the tedious fork-and-pull process, here is the diff; it changes only two lines:

ubgpu@ubgpu:~/github/InfVocLDA$ git diff
diff --git a/src/fixvoc/inferencer.py b/src/fixvoc/inferencer.py
index 2f39cae..4c91052 100755
--- a/src/fixvoc/inferencer.py
+++ b/src/fixvoc/inferencer.py
@@ -86,11 +86,11 @@ class Inferencer:
         freqdist.clear();

         for word in self._vocab.keys():

kzhai commented 9 years ago

I think I have changed every inc() call to be compatible with NLTK 3.x. Please let me know if you still run into any problems.

zcc973784075 commented 8 years ago

Thanks a lot

gauravkoradiya commented 5 years ago

Problem resolved.