rjagerman opened this issue 8 years ago
First work-in-progress done in https://github.com/rjagerman/glint/tree/spark-integration. This branch facilitates easier testing with Spark (not necessarily on localhost). For example, when running within a Spark shell (or anywhere with access to the Spark context `sc`), we can now run:
```scala
import glint.Client
val client = Client.runOnSpark(sc)
```
This will boot up a parameter server on each of the Spark workers reachable from the Spark context (`sc`). This should make it much easier to experiment, since it is no longer necessary to manually start separate Glint master and Glint server nodes.
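As a rough sketch of what you could then do with the returned client, using Glint's distributed vector API (the exact method signatures may differ from what is shown here):

```scala
import scala.concurrent.ExecutionContext.Implicits.global

// Construct a distributed vector, partitioned across the parameter servers
// that were started on the Spark workers. (Assumes Glint's BigVector API.)
val vector = client.vector[Double](10000)

// Push and pull a few entries asynchronously; both return Futures.
val push = vector.push(Array(0L, 1L, 2L), Array(0.1, 0.2, 0.3))
val pull = vector.pull(Array(0L, 1L, 2L))
```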
Since the parameter servers run in the same JVM as Spark, there is an increased risk of failure: if Spark does something bad and the JVM crashes, the parameter server crashes with it. This convenience therefore comes at the cost of reduced fault isolation.
In order to facilitate an easier setup for localhost testing, it would also be nice to be able to spawn a Glint subsystem from within an application that uses the framework.
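A hypothetical API for this could look as follows (the method name `runLocally`, its parameter, and `stop()` are just a sketch of the idea, not part of the current codebase):

```scala
import glint.Client

// Hypothetical: spawn a master and the given number of parameter
// servers inside (or alongside) the current JVM for local testing.
val client = Client.runLocally(numServers = 2)

// ... use the client as usual ...

// Shut down the locally spawned master and servers when done.
client.stop()
```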
It should then run a master node and the specified number of parameter server nodes. We could run all of them within the same JVM process, although the overhead might be a bit much.
This will make it easier to run localhost tests without having to start a master and server in separate terminal windows. The current automated tests do something similar already.