rjagerman / glint

Glint: High performance scala parameter server
MIT License

Can glint support SSP mode #47

Closed cstur4 closed 8 years ago

cstur4 commented 8 years ago

In general, SSP mode gives better performance. Will Glint support SSP mode? Thanks.

rjagerman commented 8 years ago

There are no plans to include this in the immediate future. To limit the scope of the project, we want to focus solely on asynchronously storing distributed parameters. I should definitely clarify this in the documentation to prevent misunderstanding.

It is always possible to use Glint's asynchronous methods and impose blocks or waits to make it behave synchronously or bounded-synchronously (SSP). However, this is quite difficult to get right: without proper care, such blocks can deadlock the Scala execution context.
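As a rough sketch of what "imposing a wait" could look like, the snippet below blocks on a future with `Await.result`. It does not use Glint's API: `fakePull`, `pullSync` and the dedicated thread pool are hypothetical stand-ins for an asynchronous pull, and the timeout value is an arbitrary assumption.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SyncPullSketch {
  // Dedicated pool for the asynchronous work (an assumption of this sketch).
  // Keeping it separate from the thread that blocks avoids the case where
  // every thread of one fixed-size pool waits on work that the same pool
  // still has to run.
  private val pool = Executors.newFixedThreadPool(2)
  private val asyncEc: ExecutionContext = ExecutionContext.fromExecutorService(pool)

  // Hypothetical stand-in for an asynchronous pull: returns key * 0.5 per key.
  def fakePull(keys: Array[Long]): Future[Array[Double]] =
    Future(keys.map(_ * 0.5))(asyncEc)

  // Synchronous wrapper: the calling thread waits until the future completes
  // or the timeout expires (in which case Await.result throws).
  def pullSync(keys: Array[Long], timeout: FiniteDuration = 5.seconds): Array[Double] =
    Await.result(fakePull(keys), timeout)

  // The pool uses non-daemon threads, so shut it down when finished.
  def shutdown(): Unit = pool.shutdown()
}
```

Calling `SyncPullSketch.pullSync(Array(2L, 4L))` from a thread outside that pool waits for the result, and `SyncPullSketch.shutdown()` should be called afterwards. The deadlock risk appears when the blocking call itself runs on the pool that has to complete the future.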

In general, if the dataset RDD is partitioned sufficiently well, Spark should already load balance the tasks based on data locality (and use a fallback timeout mechanism in case of stragglers). I have found that Spark's own methods for dealing with stragglers work surprisingly well.

cstur4 commented 8 years ago

Could I add iteration information to the pull message, have the servers record the latest iteration number, and send a stop or start signal back to the workers?

rjagerman commented 8 years ago

It is possible but not easy. It would require rewriting some of the core code base of Glint. In particular, you'd have to change the information that is sent over the network, which requires modifying the very low-level serialization routines. Recording iteration numbers and sending start/stop signals would be completely new functionality that would need to be implemented.
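To illustrate only the bookkeeping part (not the network or serialization changes), here is a minimal in-memory sketch of a bounded-staleness check. `StalenessClock`, `tick` and `mayProceed` are hypothetical names that do not exist in Glint, and the rule used (a worker may run at most `staleness` iterations ahead of the slowest worker) is the usual SSP condition, assumed here rather than taken from Glint.

```scala
// Hypothetical server-side bookkeeping for bounded staleness (SSP).
// Not part of Glint; all names here are illustrative assumptions.
final class StalenessClock(numWorkers: Int, staleness: Int) {
  require(numWorkers > 0, "need at least one worker")
  require(staleness >= 0, "staleness bound must be non-negative")

  // Number of completed iterations recorded for each worker.
  private val clocks = Array.fill(numWorkers)(0)

  // Record that `worker` has finished one more iteration.
  def tick(worker: Int): Unit = synchronized {
    clocks(worker) += 1
  }

  // "Start" signal (true) if the worker is at most `staleness` iterations
  // ahead of the slowest worker, otherwise "stop" (false).
  def mayProceed(worker: Int): Boolean = synchronized {
    clocks(worker) - clocks.min <= staleness
  }
}
```

With two workers and a staleness bound of 1, worker 0 is stopped once it has completed two iterations while worker 1 has completed none, and is allowed to continue after worker 1 completes one.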

I am not planning on doing that soon, but if you wish to try that, I'd be happy to help with any questions or problems that arise.