Closed andyyuan78 closed 9 years ago
Thanks, Andy. I just corrected the README.md file and added full instructions on getting the dataset ready. I think you need to extract the ap dataset into the "input" directory. Try following the instructions in the README.md file and see if that helps.
Best, Ke
To make the example run, we should:
1. use --training_iterations, not --number_of_iterations
2. extract ap.tar.gz and copy the files to the repository root, like:
ubgpu@ubgpu:~/github/PyLDA$ tree
.
├── doc.dat
├── input
│   ├── ap
│   └── ap.tar.gz
├── output
├── raw
│   └── ap.tar.gz
├── README.md
├── src
│   └── lda
│       ├── hybrid.py
│       ├── inferencer.py
│       ├── __init__.py
│       ├── __init__.pyc
│       ├── launch_test.py
│       ├── launch_train.py
│       ├── monte_carlo.py
│       └── variational_bayes.py
├── test.dat
├── train.dat
└── voc.dat

6 directories, 15 files
ubgpu@ubgpu:~/github/PyLDA$ cd src/
ubgpu@ubgpu:~/github/PyLDA/src$ python -m lda.launch_train --input_directory=../ input/ap --output_directory=../output/ --number_of_topics=10 --training_iterations=100 --inference_mode=0
successfully load all training docs from /home/ubgpu/github/PyLDA/train.dat...
successfully load all the words from /home/ubgpu/github/PyLDA/voc.dat...
========== ========== ========== ========== ==========
output_directory=../output/../150730-225540-lda-I100-S10-K10-aa0.100000-ab0.000147-im0/
input_directory=..
corpus_name=..
training_iterations=100
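
For anyone who wants to script the extraction step instead of unpacking by hand, here is a minimal sketch using Python's standard `tarfile` module. The helper name `extract_dataset` is hypothetical and not part of PyLDA. The paths in the usage comment come from the tree listing above. Whether the archive unpacks into an `ap/` subdirectory is an assumption, so check the result before training.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path, target_dir):
    """Unpack a .tar.gz archive into target_dir and return the member names.

    Hypothetical helper for illustration; it is not part of PyLDA.
    """
    target = Path(target_dir)
    # Create the destination directory if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        names = archive.getnames()
        archive.extractall(target)
    return names


# Example call from the repository root (paths assumed from the tree above):
#   extract_dataset("raw/ap.tar.gz", "input")
```

The returned list of member names makes it easy to confirm where files such as train.dat and voc.dat ended up before pointing --input_directory at them.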