zhangyuc / splash

Splash Project for parallel stochastic learning

Does the Gradient class support mini-batches? #2

Open · bobye opened 9 years ago

bobye commented 9 years ago

If I understand correctly from reading the report (http://arxiv.org/pdf/1506.07552v1.pdf), the new algorithm is not a batch-wise one. The philosophy behind it is that a batch-wise approach makes less progress than a fully sequential update.

Yet from an implementation perspective, processing a batch can be faster than processing the same number of points one by one (thanks to dense matrix multiplication). I suspect this leaves at least some room to speed up the optimization procedure.

Would adding an extra interface to support mini-batches be beneficial for further speed-ups?

Jianbo

zhangyuc commented 9 years ago

It is our intention to implement a native batch-processing API. Until that is available, you may do it yourself: make every RDD element a mini-batch of samples, so that in each iteration the processing function is fed a mini-batch instead of a single sample.
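A minimal sketch of that packing step using the plain Spark RDD API (the input path `data.txt`, the parsing logic, and the `batchSize` of 64 are illustrative assumptions; the Splash processing-function signature is omitted since only the packing matters here):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object MiniBatchPacking {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("minibatch-demo"))

    // Hypothetical dataset: one sample per line of space-separated features,
    // so one sample per RDD element.
    val samples: RDD[Array[Double]] =
      sc.textFile("data.txt").map(_.split(' ').map(_.toDouble))

    // Pack consecutive samples within each partition into groups of
    // (at most) batchSize. Each RDD element is now a Seq of samples,
    // i.e. one mini-batch.
    val batchSize = 64
    val miniBatches: RDD[Seq[Array[Double]]] =
      samples.mapPartitions(_.grouped(batchSize))

    // A processing function applied to miniBatches then receives a whole
    // mini-batch per invocation and can use dense matrix operations over
    // the batch instead of per-sample updates.
    sc.stop()
  }
}
```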

bobye commented 9 years ago

Thanks for your reply!