Open iostreamdoth opened 3 years ago
Feel free to make a PR!
config generator
The ClusterGenerator class is deprecated, and we are not adding new features to it. I think we can extend DataprocCreateClusterOperator to accept a file path in the cluster_config parameter, like the Cloud Build operator does.
https://github.com/apache/airflow/blob/d268016a7a6ff4b65079f1dea080ead02aea99bb/airflow/providers/google/cloud/operators/cloud_build.py#L172-L175
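For illustration, the idea of letting cluster_config be either a dict or a file path could look roughly like the sketch below. This is not the operator's actual code: resolve_cluster_config is a hypothetical helper name, choosing the parser by file extension is an assumption, and the YAML branch assumes PyYAML is installed.

```python
import json
from pathlib import Path
from typing import Any, Dict, Union


def resolve_cluster_config(cluster_config: Union[Dict[str, Any], str]) -> Dict[str, Any]:
    """Hypothetical helper: return a dict whether given a dict or a file path."""
    # A dict is already a usable config; return a shallow copy.
    if not isinstance(cluster_config, str):
        return dict(cluster_config)

    # A string is treated as a path; the parser is picked by extension.
    path = Path(cluster_config)
    suffix = path.suffix.lower()
    if suffix == ".json":
        loaded = json.loads(path.read_text())
    elif suffix in (".yaml", ".yml"):
        import yaml  # PyYAML (third-party), imported lazily so JSON works without it

        loaded = yaml.safe_load(path.read_text())
    else:
        raise ValueError(f"Unsupported cluster config file type: {path.name}")

    if not isinstance(loaded, dict):
        raise ValueError("Cluster config file must contain a mapping at the top level")
    return loaded
```

An operator could call a helper like this before building the create-cluster request, so existing DAGs that pass a dict would keep working unchanged.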
Do you mean airflow.providers.google.cloud.operators.dataproc.DataprocCreateClusterOperator or airflow.contrib.operators.dataproc_operator.DataprocClusterCreateOperator? I thought this was going to be in the providers package.
@mik-laj
@iostreamdoth ClusterGenerator from providers is a legacy/deprecated method of generating Dataproc cluster configuration. You should use the cluster_config parameter. For details, see: https://github.com/apache/airflow/blob/026ffe65d4738674512f691a56b922e82d0a2309/airflow/providers/google/cloud/operators/dataproc.py#L553-L559
Got it, I was confused by the operator name earlier.
Thanks @mik-laj
@iostreamdoth do you plan to finish the stale PR?
Description
Add cluster config generation from a YAML file generated using the gcloud dataproc clusters export command.
Use case / motivation
We create quite a lot of clusters using a cluster config that keeps changing from project to project. I wrote a custom operator that reads a YAML config, converts it to a dict, and then uses that dict to create the cluster. I am wondering whether it would be useful for the cluster config generator operator to accept a YAML config, or whether a new operator should be created for this.
Are you willing to submit a PR?
I do have working code for converting YAML stored on GCS to a dict.
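As a rough sketch of that conversion (not the code mentioned above), the snippet below parses YAML text and pulls out the part that would be passed as cluster_config. Both function names are hypothetical. It assumes the exported file has a top-level config key, that the text has already been downloaded from GCS by some other means, and that PyYAML is available. It does not check whether the exported key names are accepted unchanged by the operator.

```python
from typing import Any, Dict, Mapping


def cluster_config_from_export(exported: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: pick the cluster config out of an exported cluster mapping."""
    # Assumption: the export nests the cluster settings under a "config" key.
    config = exported.get("config")
    if not isinstance(config, Mapping):
        raise ValueError("Expected a top-level 'config' mapping in the exported cluster")
    return dict(config)


def cluster_config_from_yaml_text(text: str) -> Dict[str, Any]:
    """Hypothetical helper: parse exported YAML text and return the config dict."""
    import yaml  # PyYAML (third-party), imported lazily

    parsed = yaml.safe_load(text)
    if not isinstance(parsed, Mapping):
        raise ValueError("Exported cluster YAML must be a mapping at the top level")
    return cluster_config_from_export(parsed)
```

The returned dict could then be handed to DataprocCreateClusterOperator through its cluster_config parameter.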
Related Issues
There is no related issue as such; this is a feature that would let users specify the cluster config as a YAML file.