sidharth-shridhar closed 9 months ago
@sidharth-shridhar are you referring to these (https://docs.databricks.com/api/workspace/clusters/create):
data_security_mode string Enum: "NONE" "SINGLE_USER" "USER_ISOLATION" "LEGACY_TABLE_ACL" "LEGACY_PASSTHROUGH" "LEGACY_SINGLE_USER"
Data security mode decides what data governance model to use when accessing data from a cluster.
NONE: No security isolation for multiple users sharing the cluster. Data governance features are not available in this mode.
SINGLE_USER: A secure cluster that can only be exclusively used by a single user specified in single_user_name. Most programming languages, cluster features and data governance features are available in this mode.
USER_ISOLATION: A secure cluster that can be shared by multiple users. Cluster users are fully isolated so that they cannot see each other's data and credentials. Most data governance features are supported in this mode. But programming languages and cluster features might be limited.
LEGACY_TABLE_ACL: This mode is for users migrating from legacy Table ACL clusters.
LEGACY_PASSTHROUGH: This mode is for users migrating from legacy Passthrough on high concurrency clusters.
LEGACY_SINGLE_USER: This mode is for users migrating from legacy Passthrough on standard clusters.
@stikkireddy Yes, indeed. For certain workspaces that are not Unity Catalog-enabled, the job cluster data_security_mode should be set to "LEGACY_SINGLE_USER_STANDARD" in order to save datasets to hive_metastore. ref: issue
Currently, https://github.com/Nike-Inc/brickflow/blob/main/brickflow/bundles/model.py has a type check on data_security_mode that only allows 'SINGLE_USER', 'USER_ISOLATION', and 'NONE'.
Can we add the others, including LEGACY_SINGLE_USER_STANDARD?
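To make the request concrete, here is a minimal sketch of what widening the allowed values could look like. It uses only the standard library and is not brickflow's actual code: the names CurrentMode, WidenedMode, is_allowed and validate_mode are hypothetical. The widened list is assumed to be the six values from the Databricks docs quoted above plus LEGACY_SINGLE_USER_STANDARD.

```python
from typing import Literal, get_args

# The three values the type check accepts today, per the comment above.
CurrentMode = Literal["SINGLE_USER", "USER_ISOLATION", "NONE"]

# Hypothetical widened type (an assumption, not brickflow's real definition):
# the enum values quoted from the Databricks docs plus the requested
# LEGACY_SINGLE_USER_STANDARD.
WidenedMode = Literal[
    "NONE",
    "SINGLE_USER",
    "USER_ISOLATION",
    "LEGACY_TABLE_ACL",
    "LEGACY_PASSTHROUGH",
    "LEGACY_SINGLE_USER",
    "LEGACY_SINGLE_USER_STANDARD",
]


def is_allowed(mode: str, allowed=WidenedMode) -> bool:
    """Return True if `mode` is one of the literal values in `allowed`."""
    return mode in get_args(allowed)


def validate_mode(mode: str, allowed=WidenedMode) -> str:
    """Return `mode` unchanged, or raise ValueError if it is not allowed."""
    if not is_allowed(mode, allowed):
        raise ValueError(
            f"data_security_mode={mode!r} is not one of {get_args(allowed)}"
        )
    return mode


if __name__ == "__main__":
    # Rejected by the narrow type, accepted by the widened one.
    print(is_allowed("LEGACY_SINGLE_USER_STANDARD", CurrentMode))
    print(is_allowed("LEGACY_SINGLE_USER_STANDARD", WidenedMode))
```

With the narrow type the legacy value is rejected before anything is sent to Databricks. With the widened type it passes through, and the workspace decides whether the mode is valid.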
Describe the bug
Unable to set property data_security_mode to LEGACY_SINGLE_USER_STANDARD while creating a new cluster. Getting the following error:
This policy is required to be set for legacy workspaces; otherwise one is not able to save the datasets to hive_metastore.
To Reproduce Steps to reproduce the behavior:
Create a standard workflow using brickflow
Define a job_cluster as below:
4. Try to deploy the workflow to the desired Databricks workspace:
bf projects synth --project <project_name>
See error:
Expected behavior One should be able to deploy workflows without any issues with cluster having data_security_mode=LEGACY_SINGLE_USER_STANDARD.
Cloud Information
Desktop (please complete the following information):