Nike-Inc / brickflow

Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
https://engineering.nike.com/brickflow/
Apache License 2.0

[BUG] Missing common task parameters when task type is Notebook #99

Open sidharth-shridhar opened 5 months ago

sidharth-shridhar commented 5 months ago

**Describe the bug**
Notebook tasks do not get the common task parameters injected, even when those parameters are defined in the workflow config.

**Package Info**

**To Reproduce**
Steps to reproduce the behavior:

1. Create a workflow with the following configuration:

   ```python
   from datetime import timedelta

   from brickflow import (
       ctx,
       Cluster,
       BrickflowTriggerRule,
       TaskSettings,
       EmailNotifications,
       Workflow,
       WorkflowPermissions,
       User,
       NotebookTask,
   )
   from brickflow.engine.task import PypiTaskLibrary

   wf = Workflow(
       "brickflow-demo",
       # replace with your cluster id
       default_cluster=Cluster.from_existing_cluster("<all-purpose-cluster-id>"),
       tags={
           "product_id": "brickflow_demo",
           "slack_channel": "YOUR_SLACK_CHANNEL",
       },
       # COMMON TASK PARAMS
       common_task_parameters={
           "catalog": "some_val",
           "database": "some_db",
       },
       # replace <emails> with existing users' email on databricks
       permissions=WorkflowPermissions(
           can_manage_run=[User("abc@gmail.com"), User("xyz@gmail.com")],
           can_view=[User("def@gmail.com")],
           can_manage=[User("ghi@gmail.com")],
       ),
       libraries=[PypiTaskLibrary(package="snowflake==0.5.1")],
   )


   @wf.notebook_task
   def example_notebook():
       return NotebookTask(notebook_path="notebooks/example_notebook.py")
   ```


2. Install the required packages, such as brickflow (version 0.11.2) and databricks-cli.
3. Initialize the project setup as described in the [quick start guide](https://engineering.nike.com/brickflow/v0.11.2/bundles-quickstart/).
4. Deploy the workflow to the desired workspace using:
    `brickflow projects deploy --project brickflow-demo -e test`
5. In the Databricks UI, open the job's configuration page and inspect the task's parameters. The common task parameters are missing.
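
The check in the last step can also be done against the deployed job's settings instead of the UI. Below is a minimal sketch, assuming the settings have been fetched as JSON and follow the Databricks Jobs API layout, where each entry in `tasks` has a `task_key` and notebook tasks carry their parameters under `notebook_task.base_parameters`. The helper name `missing_common_params` and the sample payload are hypothetical and only mirror the workflow in this report.

```python
from typing import Any, Dict, List


def missing_common_params(
    job_settings: Dict[str, Any], common: Dict[str, str]
) -> Dict[str, List[str]]:
    """Map each notebook task key to the common parameter names it lacks."""
    missing: Dict[str, List[str]] = {}
    for task in job_settings.get("tasks", []):
        notebook = task.get("notebook_task")
        if notebook is None:
            continue  # only notebook tasks are relevant to this bug
        base = notebook.get("base_parameters") or {}
        absent = [name for name in common if name not in base]
        if absent:
            missing[task["task_key"]] = absent
    return missing


# Hypothetical payload shaped like the deployed job described above:
# the notebook task has no base_parameters at all.
sample_settings = {
    "tasks": [
        {
            "task_key": "example_notebook",
            "notebook_task": {"notebook_path": "notebooks/example_notebook.py"},
        }
    ]
}
common_task_parameters = {"catalog": "some_val", "database": "some_db"}

print(missing_common_params(sample_settings, common_task_parameters))
```

An empty result would mean every notebook task received all common parameters.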

**Expected behavior**
If common task parameters are defined on the workflow object, notebook tasks should also receive them, in addition to any base parameters supplied on the task.
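
A rough sketch of the merge this implies, not brickflow's actual implementation. The function name `merge_notebook_parameters` is hypothetical, and letting task-level base parameters win on a key conflict is an assumption rather than documented behavior.

```python
from typing import Dict, Optional


def merge_notebook_parameters(
    common: Dict[str, str], base: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Combine workflow-level common parameters with task-level base parameters."""
    merged = dict(common)  # copy so the workflow-level dict is not mutated
    # Assumption: a task-level value overrides a common value with the same key.
    merged.update(base or {})
    return merged


common_task_parameters = {"catalog": "some_val", "database": "some_db"}

# Notebook task without base parameters: only the common ones are injected.
print(merge_notebook_parameters(common_task_parameters))

# Notebook task with its own base parameters: both sets are present.
print(merge_notebook_parameters(common_task_parameters, {"run_date": "2024-02-29"}))
```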

**Screenshots**

Missing params: **catalog** and **database**

<img width="530" alt="Screenshot 2024-02-29 at 10 01 57 PM" src="https://github.com/Nike-Inc/brickflow/assets/25319738/d067d1b0-d42e-43c0-ae8a-887772d83e75">

**Cloud Information**
- [x] AWS
- [ ] Azure
- [ ] GCP
- [ ] Other

**Desktop (please complete the following information):**
 - OS: macOS 14.3 Sonoma
 - Browser: Chrome
