yahoo / CaffeOnSpark

Distributed deep learning on Hadoop and Spark clusters.
Apache License 2.0

On the timing of parameter synchronization #274

Open jacklonghui opened 7 years ago

jacklonghui commented 7 years ago

@junshi15 Hi, I have a question about the timing of parameter synchronization. Please help me. When the current iteration ends, the weights are updated before the next iteration begins. During this update, does each node send its computed gradients to the other nodes as soon as they are available, or does it wait for all nodes to finish their computations and then exchange everything at once? And do all nodes finish updating their weights before proceeding to the next iteration?

junshi15 commented 7 years ago

Gradients are sent as soon as they are available; however, all nodes wait for the updated weights before proceeding to the next iteration.
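
For readers wondering what this pattern looks like in code: below is a minimal sketch of the behavior junshi15 describes (non-blocking per-layer gradient exchange started as soon as each gradient is ready, followed by a synchronization point before the next iteration). It uses mpi4py and NumPy purely for illustration and is not CaffeOnSpark's actual transport; the layer names, shapes, gradient values, and learning rate are made up.

```python
# Illustrative sketch only -- NOT CaffeOnSpark's implementation.
# Pattern: start a non-blocking all-reduce per layer as its gradient
# becomes available, then wait on all of them before the next iteration.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
size = comm.Get_size()

# Hypothetical per-layer weights (two made-up layers).
layers = {"conv1": np.zeros(64, dtype=np.float64),
          "fc1":   np.zeros(32, dtype=np.float64)}
lr = 0.01  # made-up learning rate

for iteration in range(10):
    requests, summed = [], {}
    # Backward pass: the moment a layer's gradient is computed, kick off
    # a non-blocking all-reduce instead of waiting for the other layers.
    for name, weights in layers.items():
        grad = np.random.rand(weights.size)      # stand-in for a real gradient
        summed[name] = np.empty_like(grad)
        requests.append(comm.Iallreduce(grad, summed[name], op=MPI.SUM))

    # Synchronization point: every node blocks here until all reductions
    # finish, so all replicas hold identical weights before iterating again.
    MPI.Request.Waitall(requests)
    for name in layers:
        layers[name] -= lr * summed[name] / size  # averaged-gradient step
```

The design point this illustrates: sending each gradient as soon as it is ready overlaps communication with the remaining backward computation, while the final wait acts as a barrier that keeps the model replicas consistent, so training remains bulk-synchronous across iterations.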