douban / paracel

Distributed training framework with parameter server
http://paracel.io

How does Paracel solve the 'last-reducer' problem of iterative tasks? #55

Open tuhangdi opened 7 years ago

tuhangdi commented 7 years ago

Could you tell me where in the Paracel code this problem is handled?

xunzhang commented 7 years ago

Paracel uses SSP (Stale Synchronous Parallel) to solve this kind of problem. Follow the logistic regression example section in this tutorial to mimic a simple SSP example.
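To make the SSP idea concrete, here is a small standalone simulation (not Paracel's actual implementation): each worker keeps a logical clock, and a worker may only start its next iteration if it is at most `s` iterations ahead of the slowest worker, where `s` plays the role of Paracel's `limit_s`. With `s = 0` this degenerates to BSP, and the fast worker blocks on the straggler every round — the 'last-reducer' problem.

```python
def can_advance(worker_clock, all_clocks, s):
    """A worker may start its next iteration only if it is at most
    `s` iterations ahead of the slowest worker (bounded staleness)."""
    return worker_clock - min(all_clocks) <= s

def run_ssp(num_workers, num_iters, s, speeds):
    """Simulate SSP scheduling in discrete time slots.

    speeds[i] is how many slots worker i needs per iteration.
    Returns how many slots fast workers spent blocked on stragglers.
    """
    clocks = [0] * num_workers        # completed iterations per worker
    progress = [0] * num_workers      # slots spent on current iteration
    waits = 0
    while min(clocks) < num_iters:
        for w in range(num_workers):
            if clocks[w] >= num_iters:
                continue              # this worker is already done
            if not can_advance(clocks[w], clocks, s):
                waits += 1            # blocked by the staleness bound
                continue
            progress[w] += 1
            if progress[w] == speeds[w]:
                progress[w] = 0
                clocks[w] += 1        # finished one iteration
    return waits

# Worker 0 is 3x faster than worker 1. Under BSP (s = 0) it blocks
# every round; with a staleness bound of 3 it can run ahead and most
# of the waiting disappears.
bsp_waits = run_ssp(num_workers=2, num_iters=10, s=0, speeds=[1, 3])
ssp_waits = run_ssp(num_workers=2, num_iters=10, s=3, speeds=[1, 3])
print(bsp_waits, ssp_waits)
```

The trade-off is that a larger `s` hides stragglers better but lets workers compute on staler parameters, which is why convergence becomes the concern discussed below.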

The main entry point for setting up ssp_switch is here. You can follow the guide above and trace the call stack, or watch the invocation process, to see the implementation; it is spread across many places.

BTW, the SSP idea is a generalization of BSP, but so far SSP has only been proven convergent for convex optimization problems. For non-convex objectives such as deep neural networks, it only applies to Downpour SGD or other SGD variants; in my experiments it does not converge with mini-batch backpropagation. In other words, you need to tune the limit_s parameter to verify convergence.
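A toy illustration of why bounded staleness can still converge on a convex objective (again, not Paracel code): run gradient descent on f(x) = (x - 3)^2, but apply gradients computed from an s-step-old iterate, mimicking an SSP worker reading stale parameters. For a convex quadratic, small staleness only slows convergence; a large staleness bound combined with a large step size can make the delayed recursion unstable, which is the intuition behind having to tune limit_s.

```python
def stale_gd(s, lr=0.05, steps=300):
    """Gradient descent on f(x) = (x - 3)^2, but each step uses the
    gradient evaluated at the iterate from `s` steps ago (staleness)."""
    xs = [0.0]                                  # iterate history
    for _ in range(steps):
        stale_x = xs[max(0, len(xs) - 1 - s)]   # read an s-step-old value
        grad = 2.0 * (stale_x - 3.0)            # gradient of (x - 3)^2
        xs.append(xs[-1] - lr * grad)
    return xs[-1]

# All of these converge to the optimum x = 3, just more slowly (and
# less smoothly) as the staleness bound grows.
for s in (0, 2, 4):
    print(s, stale_gd(s))
```

The same delayed-gradient recursion with a non-convex loss has no such guarantee, which matches the observation above that SSP with mini-batch backprop may fail to converge unless limit_s is kept small.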