databricks / bundle-examples

Examples of Databricks Asset Bundles

Setting policy_name/id in conf #31

Open tapajyotiD opened 4 weeks ago

tapajyotiD commented 4 weeks ago

I am trying to set a cluster policy in the job.yml file. By default the policy seems to be unrestricted, so I get an unauthorized error and the pipeline fails. To make it work I have to disconnect the job from source in the Databricks UI and change the policy manually. What is the procedure for setting the cluster policy in the bundle configuration?

```yaml
custom:
  basic-cluster-props: &basic-cluster-props
    spark_version: "14.3.x-scala2.12"
    spark_conf:
      spark.databricks.delta.preview.enabled: "true"
      spark.driver.extraJavaOptions: "-Dlog4j2.formatMsgNoLookups=true"
      spark.executor.extraJavaOptions: "-Dlog4j2.formatMsgNoLookups=true"

    init_scripts:
      - workspace:
          destination: "/init_script.sh"

    autoscale:
      min_workers: 2
      max_workers: 4

  basic-static-cluster: &basic-static-cluster
    new_cluster:
      <<: *basic-cluster-props
      node_type_id: "Standard_L8s"
      policy_name: "DefaultAllPurpose"

resources:
  jobs:
    dab_job:
      name: dab_job

      email_notifications:
        on_failure:
          - abc@corp.com

      permissions:
        - user_name: "abc@corp.com"
          level: "IS OWNER"
        - user_name: "xyz@corp.net"
          level: "CAN_MANAGE"
        - user_name: "asf@yahoo.net"
          level: "CAN_MANAGE"

      job_clusters:
        - job_cluster_key: "default"
          <<: *basic-static-cluster

      tasks:
        - task_key: notebook_task
          job_cluster_key: "default"
          notebook_task:
            notebook_path: ../src/notebook.ipynb

        - task_key: main_task
          depends_on:
            - task_key: notebook_task

          job_cluster_key: "default"
          python_wheel_task:
            package_name: dab
            entry_point: main
          libraries:
            # By default we just include the .whl file generated for the dab package.
            # See https://docs.databricks.com/dev-tools/bundles/library-dependencies.html
            # for more information on how to add other libraries.
            - whl: ../dist/*.whl
```

I also tried changing the keyword from policy_name to policy_id, with the value "cluster-policy://DefaultAllPurpose", but that throws an error as well: cannot update job: 'cluster_policy://DefaultAllPurpose' is not a valid cluster policy ID.
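One approach that may work here (a sketch, not taken from this repo's examples): `new_cluster` expects `policy_id`, which is the policy's opaque ID rather than its name, so neither `policy_name` nor a `cluster-policy://...` value is accepted. Bundles can resolve a policy name to its ID with a variable lookup, roughly as below; the variable name `cluster_policy_id` is a placeholder, `"DefaultAllPurpose"` is the policy name from the question, and the variable declaration usually lives at the bundle's top level (e.g. in databricks.yml).

```yaml
# Sketch: declare a lookup variable that resolves the policy name to its ID,
# then reference it from the job cluster via policy_id.
variables:
  cluster_policy_id:                        # placeholder name
    description: "Cluster policy to apply to the job clusters"
    lookup:
      cluster_policy: "DefaultAllPurpose"   # resolved to the policy's ID at deploy time

custom:
  basic-static-cluster: &basic-static-cluster
    new_cluster:
      <<: *basic-cluster-props              # rest of the cluster props as above
      node_type_id: "Standard_L8s"
      policy_id: ${var.cluster_policy_id}
```

Alternatively, the raw policy ID can be copied from the policy's page in the workspace UI (or listed with `databricks cluster-policies list`) and set directly as `policy_id`; the `cluster-policy://` prefix is not part of the ID.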