How to run jobs on an existing databricks cluster?

databrickslabs / cicd-templates

Manage your Databricks deployments and CI with code.


How to run jobs on an existing databricks cluster? #10

Closed judithliatsf closed 4 years ago

judithliatsf commented 4 years ago

In the demo workflow, it seems a new Databricks cluster gets created. How can I assign the job to run on an existing Databricks cluster? Should I change dev_cicd_pipeline.py to specify the cluster ID?

Btw, it seems the source code of dev_cicd_pipeline.py is not available, although it is listed in README.md:

Deployment
│ ├── __init__.py
│ ├── deployment.py
│ ├── dev_cicd_pipeline.py
│ └── release_cicd_pipeline.py

mshtelma commented 4 years ago

Why do you want to use an existing cluster? Is it to test your jobs from an IDE? In that case, you can use run_now.py:

./run_now.py pipelines pipeline1 databricks_cluster_id

This command runs pipeline1 from the pipelines folder on the Databricks cluster with ID databricks_cluster_id.

It's not possible at the moment to run a CI/CD job using an existing cluster. I'd like to understand more about your use case and why it's important to run a CI/CD job using an existing cluster.
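
For context, and independent of this template: at the Databricks Jobs REST API level, a one-time run targets either an existing cluster (via existing_cluster_id) or a freshly created one (via new_cluster). The sketch below only builds the request body for POST /api/2.0/jobs/runs/submit and sends nothing. The helper name build_submit_payload, the run name, and the DBFS path are hypothetical placeholders, and this is not a description of how run_now.py or the CI/CD pipelines are implemented.

```python
import json
from typing import Optional


def build_submit_payload(
    python_file: str,
    existing_cluster_id: Optional[str] = None,
    new_cluster: Optional[dict] = None,
    run_name: str = "pipeline-run",  # hypothetical default name
) -> dict:
    """Build a body for POST /api/2.0/jobs/runs/submit (illustrative helper)."""
    # A run targets exactly one of: an existing cluster or a new cluster spec.
    if (existing_cluster_id is None) == (new_cluster is None):
        raise ValueError("pass exactly one of existing_cluster_id or new_cluster")

    payload = {
        "run_name": run_name,
        "spark_python_task": {"python_file": python_file},
    }
    if existing_cluster_id is not None:
        payload["existing_cluster_id"] = existing_cluster_id
    else:
        payload["new_cluster"] = new_cluster
    return payload


# Placeholder path and cluster ID, assumed for illustration only.
payload = build_submit_payload(
    "dbfs:/pipelines/pipeline1/pipeline_runner.py",
    existing_cluster_id="databricks_cluster_id",
)
print(json.dumps(payload, indent=2))
```

Sending this body with an authenticated HTTP client would start the run on the named cluster; passing new_cluster instead of existing_cluster_id requests a cluster created just for that run.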