This follows the MemoryBackend pattern of introducing an Op type that we plan onto. In the Spark backend, an Op is essentially a function that takes a SparkContext and an ExecutionContext and produces a Future of an RDD.
This has the nice property that we don't need the SparkContext at planning time, only at run time.
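A minimal sketch of what such an Op type could look like (the trait name, `run` method, and `map` combinator here are illustrative assumptions, not the actual code in this PR):

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.reflect.ClassTag
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Hypothetical sketch: an Op describes a deferred Spark computation.
// Plans are built purely out of Ops; the SparkContext is only
// supplied when the plan is actually run.
trait Op[A] { self =>
  def run(sc: SparkContext)(implicit ec: ExecutionContext): Future[RDD[A]]

  // Composition stays lazy: map just wraps run, so no Spark
  // resources are touched while planning.
  def map[B: ClassTag](fn: A => B): Op[B] =
    new Op[B] {
      def run(sc: SparkContext)(implicit ec: ExecutionContext): Future[RDD[B]] =
        self.run(sc).map(_.map(fn))
    }
}
```

Because `run` is the only place a SparkContext appears, the planner can translate the whole graph into Ops without ever holding a live context.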
Secondly, I filled in the other missing pieces: the SparkWriter, which manages writes as we evaluate Executions, and the mapping of sources and sinks.
I think we can finish this in 2-3 follow-up PRs:

- implement the writer
- finish the planner
Note that the writer and planner implementation work can go on in parallel, so I can just fork myself and finish faster.