Closed weixliu closed 7 years ago
Should I split the data into six parts, run dump_binary on each part to generate six sets of files, and put one set on each machine? Or should I just put the same data on every machine?
See the issue below!
Yes, I see the same error [Invalid topic assignment 148893078 from word proposal], but I just want to know how to set up the data in distributed mode. If you know something, I would appreciate any information.
Split the libsvm data into blocks, run dump_binary on each block to produce block.0, vocab.0, and vocab.0.txt, then copy each block's files to its own node. Please see #38 for more information.
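The splitting step could be sketched roughly like this in Python. This is only a minimal sketch, not something taken from the LightLDA repo: it assumes one document per line in the libsvm file, and the function names and the `part.K.libsvm` output names are made up for illustration. Each shard would still need to go through dump_binary (and whatever generates the word dictionary) separately before being copied to its node.

```python
from pathlib import Path


def split_into_blocks(lines, num_blocks):
    """Distribute lines round-robin into num_blocks lists."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    blocks = [[] for _ in range(num_blocks)]
    for i, line in enumerate(lines):
        blocks[i % num_blocks].append(line)
    return blocks


def write_blocks(input_path, output_dir, num_blocks):
    """Write one libsvm shard per machine and return the shard paths."""
    input_path = Path(input_path)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    with input_path.open() as f:
        # Assumption: one document per line; blank lines are skipped.
        docs = [line for line in f if line.strip()]
    paths = []
    for k, block in enumerate(split_into_blocks(docs, num_blocks)):
        path = output_dir / f"part.{k}.libsvm"  # hypothetical file name
        path.write_text("".join(block))
        paths.append(path)
    return paths
```

For six machines you would call something like `write_blocks("docs.libsvm", "shards", 6)` (both paths are placeholders) and then process each shard on its own. Round-robin keeps the shards within one document of each other in size, which seems reasonable when every node has the same -max_num_document limit.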
Thanks for the information, it's very useful. I have to split the data across the machines first.
I am running the LightLDA example in distributed mode; the command is below:
mpiexec -machinefile $root/machine.list $bin/lightlda -num_vocabs 111400 -num_topics 1000 -num_iterations 100 -alpha 0.1 -beta 0.01 -mh_steps 2 -num_servers 6 -num_local_workers 1 -num_blocks 1 -max_num_document 300000 -input_dir $dir -data_capacity 800
I added the -machinefile and -num_servers parameters; all the other parameters are the same as in nytimes.sh. When I execute the command I get the log (error) below, and I don't know why. I just copied the same data to the same location on all 6 machines. nytimes.sh runs fine on a single machine, so I want to know how to use LightLDA in distributed mode. Thanks~