dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0

Bring together Dask Client/Cluster implementations for the Dask executor and resource #2901

Open kinghuang opened 4 years ago

kinghuang commented 4 years ago

Summary

#2811 will introduce a Dask resource for use with solids. However, its Dask client/cluster configuration and setup are implemented separately from the existing Dask executor. The two differ in capabilities (the executor can't connect to an existing cluster), and the setup code is duplicated.

Improvements

Refactor the Dask resource and the Dask executor to share a common config schema and a common implementation for instantiating Dask clients/clusters. Dask configs should be interchangeable between the two, and there should be only one code path for setting up Dask.
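
As a rough illustration of what a single code path might look like, here is a minimal sketch. The config shape (`cluster` with exactly one of `existing` or `local`), the function names `resolve_cluster_config` and `create_dask_client`, and the injectable factories are all hypothetical and are not taken from #2811 or from the current executor. The only real library calls assumed are `dask.distributed.Client` and `dask.distributed.LocalCluster`.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Tuple


def resolve_cluster_config(config: Mapping[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return (cluster_type, options) from a hypothetical shared config.

    Assumed shape: {"cluster": {"<type>": {...options...}}} with exactly one
    type. A missing "cluster" key falls back to a local cluster.
    """
    cluster = config.get("cluster") or {"local": {}}
    if len(cluster) != 1:
        raise ValueError(
            "Configure exactly one cluster type, got: %s" % sorted(cluster)
        )
    cluster_type, options = next(iter(cluster.items()))
    return cluster_type, dict(options or {})


def create_dask_client(
    config: Mapping[str, Any],
    client_factory: Optional[Callable[[Any], Any]] = None,
    cluster_factories: Optional[Mapping[str, Callable[..., Any]]] = None,
) -> Any:
    """Single entry point that both the executor and the resource could call."""
    if client_factory is None or cluster_factories is None:
        # Imported lazily so the config handling can be tested without Dask.
        from dask.distributed import Client, LocalCluster

        client_factory = client_factory or Client
        cluster_factories = cluster_factories or {"local": LocalCluster}

    cluster_type, options = resolve_cluster_config(config)

    if cluster_type == "existing":
        # Connect to a scheduler that is already running.
        return client_factory(options["address"])

    if cluster_type not in cluster_factories:
        raise ValueError("Unsupported cluster type: %s" % cluster_type)

    # Start a new cluster and attach a client to it.
    cluster = cluster_factories[cluster_type](**options)
    return client_factory(cluster)
```

Under this sketch, the executor and the resource would each pass their own config block to `create_dask_client`, so a config such as `{"cluster": {"existing": {"address": "tcp://..."}}}` would behave the same in both places. Other cluster types could be supported by adding entries to `cluster_factories`.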

Risks

There may be some minor config schema changes.

Involved components

Links

kinghuang commented 3 years ago

I'm working on this.