microsoft / LightLDA

Scalable, fast, and lightweight system for large-scale topic modeling
http://www.dmtk.io
MIT License

Very Big dataset, Bad Alloc caught: failed memory allocation for documents_buffer in DataBlock #73

Open danyang-liu opened 6 years ago

danyang-liu commented 6 years ago

I have a very large dataset. When I run the following command:

bin/lightlda -num_vocabs 200000 -num_topics 2000 -num_iterations 100 -alpha 0.1 -beta 0.01 -mh steps 2 -num_local_workers 6 -num_blocks 1 -max_num_document 15000000 -input_dir /mnt/data -data_capacity 192000

I get this error: Bad Alloc caught: failed memory allocation for documents_buffer in DataBlock.

I think this is because my server doesn't have enough memory. How can I deal with it?
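
For a rough sense of scale, here is a minimal sketch of how much memory the -data_capacity value above would request. It assumes the flag is given in MB and that 1 MB means 2**20 bytes; both are assumptions that should be checked against the LightLDA option help. The helper names and the 64 GiB machine size are hypothetical and chosen only for illustration.

```python
MIB = 1024 * 1024   # assumed size of "1 MB" in bytes
GIB = 1024 ** 3


def capacity_bytes(capacity_mb: int) -> int:
    """Bytes requested for a capacity value given in MB (assumed unit)."""
    return capacity_mb * MIB


def fits_in_memory(capacity_mb: int, available_bytes: int) -> bool:
    """True if the requested buffer is no larger than the available memory."""
    return capacity_bytes(capacity_mb) <= available_bytes


# Value taken from the command in this issue.
requested = capacity_bytes(192000)
print(f"requested: {requested / GIB:.1f} GiB")

# Hypothetical server with 64 GiB of RAM (not stated in the issue).
print("fits in 64 GiB:", fits_in_memory(192000, 64 * GIB))
```

Under these assumptions, a single allocation of that size can only succeed on a machine with more than that much free memory, before counting anything else the process allocates.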