databricks / cli

Databricks CLI

[Feature] Destroying jobs, pipelines etc. part of a bundle #1661

Open pernilak opened 1 month ago

pernilak commented 1 month ago

With the two commands:

There is still no command for destroying specific jobs, pipelines, etc. within a bundle.

Suggestion

Having a command like:

databricks bundle deploy --clean-up

This would list the jobs to delete and ask for approval, possibly with an --auto-approve option so that it can be incorporated into a deployment step. That way it will be possible to remove jobs, pipelines etc. no longer referenced in the bundle.

andrewnester commented 1 month ago

Hi! Thanks for the feature request

That way it will be possible to remove jobs, pipelines etc. no longer referenced in the bundle.

If any resource (job, pipeline, etc.) is removed from the bundle config, the next deploy is going to remove it from the workspace.

There is still no command for destroying specific jobs, pipelines, etc. within a bundle.

Not sure if that's something we want to support. The reason is that it contradicts the core idea of bundles: the bundle configuration always matches whatever is deployed in the workspace. If some resources are removed selectively, the bundle config and the deployed resources won't match anymore.

If for whatever reason you need to remove specific jobs, you can just use CLI commands such as databricks jobs delete ...

@pietern what are your thoughts on this?

dinjazelena commented 1 month ago

I mean, what is the point of having an option to destroy a specific job but no option to deploy a specific job? And the whole idea that the bundle matches whatever is deployed in the workspace is so limited and makes no sense. As Databricks always writes, MLOps is about ASYNC deployment of models, data, and code. You have jobs for all of them, and you don't do them in sync.

pietern commented 1 month ago

@pernilak Can you comment on whether @andrewnester's explanation clarifies this?

The CLI does not print a message that resources are deleted when they are removed from the configuration. We should change this to make it clearer what is happening under the hood. As @andrewnester comments, every time you run deploy, the command will make the actual state match the configuration, and that can involve creating, updating, and deleting individual resources.
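To illustrate the "make the actual state match the configuration" idea, here is a minimal Python sketch of how a deploy step could derive create, update, and delete actions from two sets of resources. It is not the CLI's actual implementation; `Plan`, `plan_deploy`, the resource keys, and the settings dictionaries are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Actions needed to make the deployed state match the configuration."""
    create: list[str] = field(default_factory=list)
    update: list[str] = field(default_factory=list)
    delete: list[str] = field(default_factory=list)


def plan_deploy(configured: dict[str, dict], deployed: dict[str, dict]) -> Plan:
    """Compare configured resources with deployed ones (hypothetical helper)."""
    plan = Plan()
    for key, settings in configured.items():
        if key not in deployed:
            plan.create.append(key)   # in config, not yet in workspace
        elif deployed[key] != settings:
            plan.update.append(key)   # present in both, settings differ
    # Anything deployed but no longer in the config is removed.
    plan.delete = [key for key in deployed if key not in configured]
    return plan


# Made-up example data: job_b was removed from the config, job_c was added.
configured = {"job_a": {"schedule": "daily"}, "job_c": {"schedule": "hourly"}}
deployed = {"job_a": {"schedule": "weekly"}, "job_b": {"schedule": "daily"}}

plan = plan_deploy(configured, deployed)
for action in ("create", "update", "delete"):
    for key in getattr(plan, action):
        print(f"{action}: {key}")
```

Printing the plan like this is one way the deletions could be made visible before they happen, which is the gap described above.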

pietern commented 1 month ago

@dinjazelena Can you clarify what you mean? I don't think I understand.

A bundle involves code and resources. Running deploy will always update the code. Say you have 2 jobs referring to this code and you make a change in the expected parameters. Then deploying only job 1 and not job 2 could even break job 2 if it passes stale parameters to the code.

At best it would be a performance improvement, at worst it could lead to unexpected behavior.
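The stale-parameter risk can be sketched in a few lines of Python. This is only an illustration under assumed names: `shared_task`, `run_job`, and the parameter dictionaries are hypothetical and stand in for the bundle's shared code and the two job definitions.

```python
def shared_task(table: str, run_date: str) -> str:
    """Shared code after the change: it now also expects run_date."""
    return f"processing {table} for {run_date}"


# Job 1 was redeployed together with the code, so its parameters are current.
job_1_params = {"table": "sales", "run_date": "2024-01-01"}
# Job 2 was not redeployed and still passes the old parameter set.
job_2_params = {"table": "returns"}


def run_job(params: dict) -> str:
    """Call the shared code with a job's parameters and report the outcome."""
    try:
        return shared_task(**params)
    except TypeError as exc:
        # Missing or unexpected parameters surface as a TypeError.
        return f"failed: {exc}"


print("job 1:", run_job(job_1_params))
print("job 2:", run_job(job_2_params))
```

Job 1 runs, while job 2 fails because it was left behind with parameters that the updated code no longer accepts.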