dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

PySpark StepLauncher for GCP Dataproc #1820

Open natekupp opened 5 years ago

natekupp commented 5 years ago

Right now, IIRC, EMR and Dataproc are factory solids. We probably want these to be provisioned as resources so that you can seamlessly move workloads from local to remote Spark clusters.
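
A rough sketch of the idea, in plain Python rather than Dagster's actual API. Every name here (`SparkRunner`, `LocalSparkRunner`, `DataprocSparkRunner`, `cluster_name`, `word_count_step`) is hypothetical, and no real Spark or GCP call is made. It only illustrates a step that depends on a "spark" resource, not on where Spark runs:

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class SparkRunner(Protocol):
    """Hypothetical interface: anything that can run a named Spark step."""

    def run(self, step_name: str) -> str: ...


@dataclass
class LocalSparkRunner:
    """Stand-in for running a step in a local Spark session."""

    def run(self, step_name: str) -> str:
        return f"ran {step_name} locally"


@dataclass
class DataprocSparkRunner:
    """Stand-in for submitting a step to a remote Dataproc cluster.

    cluster_name is a hypothetical config field; nothing is sent to GCP.
    """

    cluster_name: str

    def run(self, step_name: str) -> str:
        return f"submitted {step_name} to Dataproc cluster {self.cluster_name}"


def word_count_step(resources: Dict[str, SparkRunner]) -> str:
    # The step only looks up the "spark" resource; it does not know
    # whether that resource is local or remote.
    return resources["spark"].run("word_count")


# Moving between environments means swapping the resource, not the step.
local_result = word_count_step({"spark": LocalSparkRunner()})
remote_result = word_count_step(
    {"spark": DataprocSparkRunner(cluster_name="my-cluster")}
)
print(local_result)
print(remote_result)
```

With factory solids, the cluster choice is baked into the solid itself; with something like the above, the same step body could run against either backend depending on configuration.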

natekupp commented 4 years ago

cc @sryza - at some point we should revisit the GCP Dataproc stuff to use the new step launcher machinery

sryza commented 4 years ago

Renamed this issue to make it more specific to the remaining work

sryza commented 4 years ago

Probably makes sense to have an issue per platform, e.g. a separate issue if we want Azure HDInsight integration

natekupp commented 4 years ago

cool, sounds good on all of the above!

wonjae-2352 commented 1 year ago

Wonder if there is any update on this request?