Nike-Inc / brickflow

Pythonic Programming Framework to orchestrate jobs in Databricks Workflow
https://engineering.nike.com/brickflow/
Apache License 2.0

[BUG] Unable to add airflow library #66

Closed: kranthikiran closed this issue 9 months ago

kranthikiran commented 9 months ago

Describe the bug: I'm having an issue with the airflow library in Databricks. I've followed all the steps in the documentation, but it seems the 'enable plugin' parameter is not being passed. I also tried passing the library manually, as PypiTaskLibrary(package="apache-airflow==2.6.3") in the libraries, but no luck.

[Screenshots: two images]

BrendBraeckmans commented 9 months ago

Hi,

On your local machine you do need to install airflow yourself. I have the following in my pyproject.toml:

[tool.poetry.group.dev.dependencies]
apache-airflow = "2.6.3"
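
To confirm that the local install actually took effect, one option is to ask Python which version it sees. This is only a minimal sketch using the standard library; the helper name installed_version is made up for illustration and is not part of brickflow or airflow.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing.

    `installed_version` is a hypothetical helper for illustration only.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("apache-airflow")
    if version is None:
        print("apache-airflow is not installed in this environment")
    else:
        print(f"apache-airflow {version} is installed")
```

Run it with the same interpreter your project uses (for example through poetry run), otherwise it checks a different environment.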

On Databricks that isn't necessary: airflow is installed by default when you specify enable_plugins=True in your entrypoint.py, as described at the links below:
https://engineering.nike.com/brickflow/v0.10.3/faq/faq/?h=enable#how-do-i-enable-airflow-features
https://engineering.nike.com/brickflow/v0.10.3/upgrades/upgrade-pre-0-10-0-to-0-10-0/#upgrade-checklist

However, I noticed that you sometimes also need to set enable_plugins: true manually in your .brickflow-project-root.yml; a colleague ran into the same issue you describe. Mine looks like this:

# DO NOT MODIFY THIS FILE - IT IS AUTO GENERATED BY BRICKFLOW AND RESERVED FOR FUTURE USAGE
projects:
  my-project:
    brickflow_version: 0.10.3
    deployment_mode: bundle
    enable_plugins: true
    name: my-project
    path_from_repo_root_to_project_root: .
    path_project_root_to_workflows_dir: src/workflows
version: v1
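
If you want to check the flag without opening the file by hand, something like the following could work. It is a rough sketch that assumes the exact layout shown above (project names indented two spaces under projects:, settings indented four spaces) and only reads the text; plugins_enabled and SAMPLE are names made up for this example, and a real YAML parser would be more robust.

```python
import re
from typing import Dict, Optional

# Sample text mirroring the layout of the file shown above (assumption).
SAMPLE = """\
# DO NOT MODIFY THIS FILE - IT IS AUTO GENERATED BY BRICKFLOW AND RESERVED FOR FUTURE USAGE
projects:
  my-project:
    brickflow_version: 0.10.3
    deployment_mode: bundle
    enable_plugins: true
    name: my-project
version: v1
"""


def plugins_enabled(yaml_text: str) -> Dict[str, bool]:
    """Map each project name to whether `enable_plugins: true` is set under it."""
    result: Dict[str, bool] = {}
    in_projects = False
    current: Optional[str] = None
    for raw in yaml_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop comments and trailing spaces
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:
            # A new top-level key starts; only `projects:` is of interest.
            in_projects = line.strip() == "projects:"
            current = None
        elif in_projects and indent == 2 and line.endswith(":"):
            # A project name; assume plugins are off until the flag is seen.
            current = line.strip()[:-1]
            result[current] = False
        elif in_projects and current is not None:
            match = re.fullmatch(r"enable_plugins:\s*(\S+)", line.strip())
            if match:
                result[current] = match.group(1).lower() == "true"
    return result


if __name__ == "__main__":
    print(plugins_enabled(SAMPLE))
```

To check a real project, read .brickflow-project-root.yml into a string and pass it in; any project that maps to False is missing the flag.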

kranthikiran commented 9 months ago

I added a pyproject.toml file with the required packages. Now I can see the airflow package in brickflow.