echohenry2006 / psvm

Automatically exported from code.google.com/p/psvm

Speedup Issues and Large Problems #2

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Hello,
   I am noticing that the speedup is very poor as the number of processors increases beyond 4. It is easy to see why: the serial Cholesky factorization (not the PICF) used in the IPM always takes the same amount of time for a fixed problem size, independent of the number of processors. If this step were done in parallel, the speedup might improve.
Here are the results I got for a cluster of 8-core Xeon nodes on an InfiniBand network.
| CPUs | Time   | Train  | IO    | ICF   | IPM   | B est. | Out   | Eff.  | Speedup |
|------|--------|--------|-------|-------|-------|--------|-------|-------|---------|
| 2    | 16.507 | 13.684 | 1.547 | 3.529 | 3.344 | 6.810  | 1.277 | 1.000 | 2.000   |
| 4    | 13.955 | 12.398 | 0.755 | 3.197 | 2.749 | 6.451  | 0.803 | 0.591 | 2.366   |
| 8    | 12.015 | 10.929 | 0.515 | 2.466 | 2.490 | 5.974  | 0.570 | 0.343 | 2.748   |
| 16   | 6.425  | 5.437  | 0.529 | 1.137 | 2.356 | 1.944  | 0.459 | 0.321 | 5.138   |
| 24   | 5.082  | 4.181  | 0.163 | 0.512 | 2.396 | 1.272  | 0.739 | 0.271 | 6.496   |
| 32   | 5.684  | 4.269  | 1.050 | 0.968 | 2.307 | 0.994  | 0.365 | 0.182 | 5.808   |
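The constant serial term puts an Amdahl's-law style cap on the achievable speedup. A minimal sketch of that effect, assuming (purely for illustration) that the IPM time of the 2-CPU run is entirely serial and the rest of the work scales perfectly, neither of which is exactly true:

```python
# Rough Amdahl-style model of the table above (illustrative assumptions only:
# the 2-CPU IPM time is treated as entirely serial, everything else as
# perfectly parallel).
def predicted_time(cpus, base_cpus=2, base_total=16.507, serial=3.344):
    """Scale the non-serial part of the 2-CPU run linearly with CPU count."""
    parallel = base_total - serial
    return serial + parallel * base_cpus / cpus

for p in (2, 4, 8, 16, 32):
    t = predicted_time(p)
    # CPUs, predicted time, predicted speedup relative to the 2-CPU run
    print(p, round(t, 3), round(predicted_time(2) / t, 2))
```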

Also, there seem to be some issues when the input file gets too large: the program never gets through the PICF.

Thanks, 
  Patrick Nichols

Original issue reported on code.google.com by PatJNichols@gmail.com on 5 Aug 2008 at 5:44

GoogleCodeExporter commented 8 years ago
Hi Patrick,

What is your dataset size, and what value did you use for -rank_ratio?
For now, the Cholesky factorization is serial because it usually works on a small matrix. In our experiments on the RCV 800k dataset, we set rank_ratio to 0.001 so that the matrix the CF works on is 800 x 800. I suspect you set rank_ratio to a large value, which can cause poor speedup. You could decrease rank_ratio and try again.
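A minimal sketch of that relationship, using the numbers quoted above (the arithmetic is only illustrative; it is not code from psvm):

```python
# How -rank_ratio sets the size of the matrix the serial Cholesky
# factorization inside the IPM works on (values from the RCV 800k
# example above; the cost estimate is illustrative).
n = 800_000               # number of training examples
rank_ratio = 0.001        # -rank_ratio used in the experiment above
p = int(n * rank_ratio)   # rank of the ICF approximation -> 800

# The IPM factors a p x p matrix serially, at roughly p**3 operations,
# regardless of how many MPI processes are running.  Keeping p small
# (here 800) keeps this serial term negligible.
print(p)        # 800
print(p ** 3)   # 512_000_000, i.e. ~5e8 operations per factorization
```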

In fact, we did consider a parallel Cholesky factorization, but it would be even slower on distributed machines because it requires a lot of communication. For most problems, the matrix the CF works on is kept small through rank_ratio.

Original comment by baihong...@gmail.com on 19 Aug 2008 at 8:22

GoogleCodeExporter commented 8 years ago
Thanks,
   That seems to help the speedup drastically. One quick question: I noticed that the resulting threshold/bias for my training data set seems to change with different rank_ratio parameters. My naive impulse is to assume that this is bad. Is this true?

Pat

Original comment by PatJNichols@gmail.com on 23 Aug 2008 at 10:34

GoogleCodeExporter commented 8 years ago
For the Interior Point Method, we have to approximate the kernel matrix to make the problem solvable, and -rank_ratio controls this approximation. Generally, the larger rank_ratio is, the better the result, but there is a trade-off between time and accuracy. Making #number_of_data * #rank_ratio = 1000 is generally enough.
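For example, by this rule of thumb a dataset of 800,000 examples would call for rank_ratio ≈ 1000 / 800,000 ≈ 0.00125, i.e. an ICF rank of about 1000.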

Original comment by baihong...@gmail.com on 24 Aug 2008 at 7:31