databricks / cli

Databricks CLI

Create UC schema from DAB without development mode name prefix #1779

Open jamuska opened 2 months ago

jamuska commented 2 months ago

When deploying UC schemas with mode: development, the schema name gets the same dev prefix as workflow or pipeline names.

In our case, we are already using separate dev catalogs, so the prefixed schema name does not bring any additional value. Rather, it makes the schema list harder to read and annoys the developers, who have to write a long schema name. I would like to have the option to deploy the schema without the prefix.

shreyas-goenka commented 1 month ago

Thanks for reaching out @jamuska! DABs have a configurable setting, presets.name_prefix, that you can set to "" to turn off prefixes for all resources in your DAB.

That, however, will not solve your problem of turning off the prefix for UC schemas only. One workaround I can recommend for now is to add a separate target to your DAB where presets.name_prefix is set to the empty string. Developers can then deploy both targets to avoid the prefix.
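For illustration, the workaround might look roughly like the sketch below. The bundle name, target names, catalog, and schema are hypothetical, and the second target deliberately omits mode: development, on the assumption that development mode could otherwise re-apply its default prefix; that behavior should be verified against the CLI version in use.

```yaml
# databricks.yml (sketch only; all names below are hypothetical)
bundle:
  name: my_bundle

resources:
  schemas:
    my_schema:
      catalog_name: dev_catalog   # assumed per-developer or per-environment dev catalog
      name: my_schema

targets:
  # Regular development target: resource names receive the dev prefix.
  dev:
    mode: development
    default: true

  # Separate target with the name prefix turned off via presets.
  dev_unprefixed:
    presets:
      name_prefix: ""
```

The unprefixed target would then be deployed with `databricks bundle deploy -t dev_unprefixed`. Note that resources declared at the top level, as here, belong to every target, so this sketch on its own does not restrict the unprefixed target to schemas.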

As for a native solution to allow selective prefixing, it's something we will have to evaluate. One of the intentions behind mode: development is to provide isolated namespacing for users, and I'm not sure whether we want to introduce yet another knob here just yet.

I'll keep this issue open to receive feedback in case other users also run into this.

jamuska commented 1 month ago

Thanks for the input @shreyas-goenka

I get your point about namespacing with development mode. I really like the feature for workflows and pipelines, where most of the time I'm just clicking on them in a list and there is no other native way of separating them into "collections". With UC resources the situation is, in my opinion, a bit different, since UC natively offers different layers for organizing objects. Including the user name in each of the resources (which I suppose is required to avoid duplicate object names for different developers) makes the object names very long and cumbersome to use in queries.

Correct me if I'm wrong, but using a separate target without name_prefix to deploy the schemas would require me to define all the shared resources separately for every other target to avoid deploying several copies of workflows and pipelines with duplicate names. Or is there some way I could exclude resources for a single target?