kedro-org / kedro-plugins

First-party plugins maintained by the Kedro team.
Apache License 2.0

Make Kedro Compatible with Airflow 2.4.2 #73

Closed: rxm7706 closed this issue 1 year ago

rxm7706 commented 1 year ago

Description

Airflow is one of the primary deployment solutions for Kedro. Airflow 2.4.2 depends on attrs>=22.1.0, while Kedro's requirements pin attrs~=21.3: https://github.com/kedro-org/kedro/blob/c40873a5d4adfbc7b5970cb8b0c1133d53463bbf/dependency/requirements.txt#L2
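To illustrate why the two pins cannot be satisfied together, here is a minimal sketch in plain Python. It assumes simple numeric release versions only and is not how pip or conda resolve dependencies; the helper names `parse` and `compatible_release_bounds` are made up for this example. Under PEP 440, `~=21.3` means `>=21.3, ==21.*`, so its exclusive upper bound is 22, which is below Airflow's minimum of 22.1.0.

```python
# Sketch only: handles plain numeric versions such as "21.3" or "22.1.0".
# Helper names are hypothetical and not part of pip, conda, or Kedro.

def parse(version: str, width: int = 3) -> tuple:
    """Turn '22.1' into (22, 1, 0), padding with zeros for comparison."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))

def compatible_release_bounds(version: str) -> tuple:
    """Expand '~=X.Y' into (inclusive lower, exclusive upper) bounds."""
    segments = version.split(".")
    if len(segments) < 2:
        raise ValueError("~= needs at least two release segments")
    prefix = [int(p) for p in segments[:-1]]  # drop the last segment
    prefix[-1] += 1                           # bump the new last segment
    upper = ".".join(str(p) for p in prefix)
    return parse(version), parse(upper)

# Kedro: attrs~=21.3  ->  >=21.3 and <22
kedro_lower, kedro_upper = compatible_release_bounds("21.3")
# Airflow 2.4.2: attrs>=22.1.0 (no upper bound)
airflow_lower = parse("22.1.0")

# The ranges overlap only if the higher lower bound is below Kedro's upper bound.
overlap = max(kedro_lower, airflow_lower) < kedro_upper
print("Kedro range:", kedro_lower, "to", kedro_upper, "| overlap:", overlap)
```

Because Airflow's lower bound sits above Kedro's upper bound, no single attrs release can satisfy both, which matches the solver error in the steps below.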

Context

Unable to install Kedro & Airflow using conda in the same environment

Steps to Reproduce

  1. mamba create -n kedro-airflow "python=3.7.12" "airflow=2.4.2" "kedro>=0.18.3"

     Encountered problems while solving:
     • package airflow-2.4.2-py310h51d5547_0 requires attrs >=22.1.0, but none of the providers can be installed

Expected Result

With a single change to the attrs pin, from attrs~=21.3 to "attrs >=22.1.0,<22.2", Kedro, Airflow and MLFlow are compatible:

mamba create -n kedro-airflow "python=3.7.12" "anyconfig >=0.10.0,<0.11.0" "attrs >=21.3,<22.2" "cachetools >=4.1,<5.0" "click <9.0" "cookiecutter >=2.1.1,<3.0" "dynaconf >=3.1.2, <4.0" "fsspec >=2021.4,<=2022.5.0" "gitpython >=3.0,<4.0" "importlib_metadata >=3.6" "importlib_resources >=1.3" "jmespath >=0.9.5,<1.0" "jupyter_client >=5.1,<7.0" "pip-tools >=6.5,<7.0" "pluggy >=1.0,<1.1" "python >=3.7,<3.11" "python-json-logger >=2.0.0,<3.0.0" "pyyaml >=4.2,<7.0" "rich >=12.0,<13.0" "rope >=0.21.0,<0.22.0" "setuptools >=38.0" "toml >=0.10,<0.11" "toposort >=1.5,<2.0" "airflow=2.4.2" "mlflow >=1.0.0,<2.0.0"
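As a rough sanity check of the relaxed pin, the sketch below compares the original Kedro range (`~=21.3`, that is `>=21.3,<22`) and the range used in the command above (`>=21.3,<22.2`) against Airflow's `>=22.1.0`. The candidate version list is illustrative only, and the helpers `parse` and `in_range` are hypothetical; a real solver considers every published release and many more packages.

```python
# Sketch only: plain numeric versions, hypothetical helper names.

def parse(version: str, width: int = 3) -> tuple:
    """Turn '22.2' into (22, 2, 0), padding with zeros for comparison."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))

def in_range(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)

AIRFLOW_MIN = "22.1.0"            # Airflow 2.4.2: attrs>=22.1.0
OLD_KEDRO = ("21.3", "22")        # attrs~=21.3
RELAXED_KEDRO = ("21.3", "22.2")  # attrs >=21.3,<22.2 (from the command above)

# Illustrative candidates, not a complete list of attrs releases.
candidates = ["21.4.0", "22.1.0", "22.2.0"]

def satisfies_both(version: str, kedro_range: tuple) -> bool:
    """True if the version fits the Kedro range and Airflow's minimum."""
    return in_range(version, *kedro_range) and parse(version) >= parse(AIRFLOW_MIN)

old_ok = [v for v in candidates if satisfies_both(v, OLD_KEDRO)]
relaxed_ok = [v for v in candidates if satisfies_both(v, RELAXED_KEDRO)]

print("old pin:", old_ok)
print("relaxed pin:", relaxed_ok)
```

With the old pin none of the candidates works for both packages, while the relaxed range leaves 22.1.0 available.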

Actual Result

Draft Pull for conda-feedstock shows build is possible - Restoring compatibility with Airflow & MLFlow https://github.com/conda-forge/kedro-feedstock/pull/25

rxm7706 commented 1 year ago

@merelcht @deepyaman I've created a PR that seems to pass all required tests for Kedro.

Once merged, it appears that https://github.com/kedro-org/kedro/pull/2030 and https://github.com/kedro-org/kedro-plugins/pull/52 together will address all Airflow / Kedro compatibility issues.

noklam commented 1 year ago

I am curious whether this happens with pip too. Does it stop you completely from installing the packages?

I expect the older version of attrs will still work, but it happens that Kedro pinned an older version.

rxm7706 commented 1 year ago

@noklam It should give the same error with pip. A quick test gives the following error:

INFO: pip is looking at multiple versions of to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of apache-airflow to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install apache-airflow==2.4.2 and kedro==0.18.3 because these package versions have conflicting dependencies.

The conflict is caused by:
    apache-airflow 2.4.2 depends on attrs>=22.1.0
    kedro 0.18.3 depends on attrs~=21.3

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts