Closed wpromatt closed 2 years ago
This is really weird. I noticed that importlib-metadata was downgraded in some of the constraints files but not in others, and I do not know why. I will keep this issue open until I find out the reason. In the meantime I manually updated the constraints to get rid of the conflict, so it should work now.
Thanks for reporting it @wpromatt !
Thanks for handling the constraints so quickly!
Hi all, I get a similar issue when installing v. 2.1.4 with python 3.8:
pip install "apache-airflow[all]==2.1.4" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.4/constraints-3.8.txt"
Leads to
ERROR: Cannot install apache-airflow[all]==2.1.4 because these package versions have conflicting dependencies.
The conflict is caused by:
apache-airflow[all] 2.1.4 depends on google-ads<8.0.0 and >=4.0.0; extra == "all"
The user requested (constraint) google-ads==14.0.0
But this is not the only conflict between requirements.txt and the setup.py. Installing the "all_dbs" extras shows a different conflict:
Running
pip install "apache-airflow[all_dbs]==2.1.4" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.4/constraints-3.8.txt"
Yields:
The conflict is caused by:
apache-airflow[all-dbs] 2.1.4 depends on mysql-connector-python<=8.0.22 and >=8.0.11; extra == "all_dbs"
The user requested (constraint) mysql-connector-python==8.0.26
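Both errors above have the same shape: the version pinned in the constraints file falls outside the range that the extra declares. A minimal sketch of that check follows, assuming plain dotted integer versions only. The helper names (`parse_version`, `satisfies`) are made up for illustration, and this is not how pip resolves dependencies internally.

```python
import re

# Comparison operators supported by this simplified checker.
OPS = {
    "<": lambda a, b: a < b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    ">=": lambda a, b: a >= b,
    "==": lambda a, b: a == b,
    "!=": lambda a, b: a != b,
}


def parse_version(text):
    """Turn '8.0.26' into (8, 0, 26). Handles dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def pad(a, b):
    """Right-pad the shorter tuple with zeros so '4.3' compares like '4.3.0'."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def satisfies(pinned, specifier):
    """Return True if `pinned` meets every comma-separated clause in `specifier`."""
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|!=|<|>)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, bound = match.groups()
        left, right = pad(parse_version(pinned), parse_version(bound))
        if not OPS[op](left, right):
            return False
    return True


# The two pins from the error messages above, checked against the ranges
# declared by the "all" and "all_dbs" extras of apache-airflow 2.1.4.
print(satisfies("14.0.0", ">=4.0.0,<8.0.0"))     # google-ads
print(satisfies("8.0.26", ">=8.0.11,<=8.0.22"))  # mysql-connector-python
```

Both calls report that the pin is outside the declared range, which is the conflict pip complains about. For real use, the `packaging` library's `SpecifierSet` handles the full version grammar (pre-releases, epochs, and so on), which this sketch deliberately ignores.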
Hmm, indeed, something is wrong there. I will take a closer look.
Quick update, in case you did not notice already: Same issue arises with the latest version, 2.2.1:
pip install "apache-airflow[all]"==2.2.1 --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.1/constraints-3.8.txt"
ERROR: Cannot install apache-airflow[all]==2.2.1 because these package versions have conflicting dependencies.
The conflict is caused by:
apache-airflow[all] 2.2.1 depends on azure-cosmos<5 and >=4.0.0; extra == "all"
The user requested (constraint) azure-cosmos==3.2.0
Hmm. I thought about it and I think it's pretty much expected behaviour (though we might simply want to remove the bundle extras from released Airflow, as they make very little sense there).
You are not supposed to use the "all" and "all_dbs" extras when you are installing Airflow from PyPI. Those are development-only extras which work a bit differently than the "provider" extras.
I think the problem is different - we should simply remove them from the "installable" version of airflow in PyPI because having them there is simply misleading. I will do it as a follow up of this, when I am back at home (travelling now).
Airflow has several different types of extras (https://airflow.apache.org/docs/apache-airflow/stable/extra-packages-ref.html)
Core extras - those install deps needed by some "core features" of Airflow that are not enabled by default.
Provider extras - those extras can be used to install "provider" packages (you can also install those provider packages manually as packages). In the "PyPI/packaged" version of Airflow those extras do not carry the "specific" requirements. For example, "airflow[microsoft.azure]" introduces a dependency on "apache-airflow-providers-microsoft-azure" but it does not have an "azure-cosmos" dependency of its own (that one comes transitively from the provider package).
Bundle extras - "all_dbs" and "all" belong to this group.
The problem is that, unlike "provider" extras, the bundle extras contain the "transitive" dependencies that were valid at the time of the package release. With "provider" extras the dependencies are transitive, coming from the providers that are actually installed. But those are often different from the ones in the constraints. The constraints we generate include the dependencies of the providers that were RELEASED at the time of preparing a given version. In the meantime the dependencies could have changed in "main", and it is those changed dependencies that go into "all" and "all_dbs". So in fact "all" and "all_dbs" are really only useful when you are installing Airflow in "development" mode from sources, not when you are installing Airflow from PyPI.
You could check it yourself: if instead of [all_dbs] you specify [apache.cassandra,apache.drill,apache.druid,apache.hdfs,apache.hive,apache.pinot,cloudant,exasol,influxdb,microsoft.mssql,mongo,mysql,neo4j,postgres,presto,trino,vertica], the installation should work just fine.
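As a rough illustration of that suggestion, the sketch below assembles the pip command with the individual provider extras in place of the bundle. The function names are hypothetical. The extras list is the one given in this comment, and the constraints URL pattern is the one used in the commands earlier in the thread.

```python
import shlex

# Provider extras listed in this thread as the replacement for [all_dbs].
ALL_DBS_PROVIDERS = [
    "apache.cassandra", "apache.drill", "apache.druid", "apache.hdfs",
    "apache.hive", "apache.pinot", "cloudant", "exasol", "influxdb",
    "microsoft.mssql", "mongo", "mysql", "neo4j", "postgres", "presto",
    "trino", "vertica",
]


def constraints_url(airflow_version, python_version):
    """Build the constraints URL in the form used by the commands in this thread."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_command(airflow_version, python_version, extras):
    """Return the pip invocation as an argument list (nothing is executed)."""
    requirement = f"apache-airflow[{','.join(extras)}]=={airflow_version}"
    return [
        "pip", "install", requirement,
        "--constraint", constraints_url(airflow_version, python_version),
    ]


cmd = pip_install_command("2.1.4", "3.8", ALL_DBS_PROVIDERS)
print(shlex.join(cmd))  # shell-quoted command line, ready to copy
```

Keeping the command as a list makes it easy to pass to `subprocess.run` without shell quoting problems; `shlex.join` is only used here to print a copyable line.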
I think I will simply make sure to document this, clarify the behaviour of bundle extras, and remove the bundle extras from the next release of Airflow. The bundle extras make very little sense for a PyPI installation.
WDYT?
Yep, your explanation does make sense. And indeed, if I use the extras individually as you suggested, the install does indeed work. Thanks for the clarification!
I have two thoughts from a user perspective: a) The documentation does imply that the [all] and [all_dbs] bundle extras are for production use, since the [all] bundle is described as "all user facing features". Together with the presence of a devel_all extra, this implies that it is meant to be used by the end user.
b) Having bundles of extras is very convenient for end users, given the large number of extras there are. So a way to provide them to the end user without running into conflicts would be highly desirable.
a) The documentation does imply that the [all] and [all_dbs] bundle extras are for production use, since the [all] bundle is described as "all user facing features". Together with the presence of a devel_all extra, this implies that it is meant to be used by the end user.
Yeah. Doc update will be necessary if we remove them.
b) Having bundles of extras is very convenient for end users, given the large number of extras there are. So a way to provide them to the end user without running into conflicts would be highly desirable.
Good point. I think it can be achieved with a small update to our setup Python code. I will definitely look into that.
@potiuk was there ever a solution for this? We are upgrading our MWAA instance from 2.0.2 to 2.2.2 and we are having conflicts between flake8 and importlib-metadata.
Using this constraints file
I am not sure what you install, what you have in the images, or how you are installing dependencies.
You likely have a completely different problem: you are simply trying to install conflicting versions of requirements manually.
The problem described in this issue occurs only when the "all" extra is used (which should never be used in production; it is a development-only setting). I am not even sure why you have problems with flake8, because it is a "devel"-only dependency and you CERTAINLY should not install it in production.
If you need some help, please open a GitHub Discussion or a Slack conversation and describe the problem in detail. Please also make sure to describe the dependencies imposed by MWAA; this is not something that we know of, and it might be that your conflicting dependencies come from conflicts there. MWAA issues should be handled through MWAA support.
All the different scenarios in which you can install and upgrade Airflow when you are not using a managed version are described here: https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#installation-and-upgrade-scenarios. If you do it differently, or MWAA imposes other limitations, then you might generate some conflicts, but that's beyond the "generic Airflow" domain.
This has been documented in the docs in https://github.com/apache/airflow/pull/23697. Closing.
Apache Airflow version
2.2.0 (latest released)
Operating System
all
Versions of Apache Airflow Providers
No response
Deployment
Other
Deployment details
Python 3.7
What happened
The versions of flake8 and importlib-metadata specified in the constraints file are incompatible for Python 3.7.
In the constraints file we have importlib-metadata==4.8.1 and flake8==4.0.1. flake8==4.0.1, however, requires importlib-metadata<4.3:
flake8 4.0.1 depends on importlib-metadata<4.3; python_version < "3.8"
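The mismatch can be spotted directly from the constraints file contents. The sketch below parses `name==version` lines and compares the importlib-metadata pin against the `<4.3` bound quoted above. The two-line sample is only an excerpt built from the pins mentioned in this report, and the helper names are illustrative.

```python
def parse_constraints(text):
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def as_tuple(version):
    """Turn '4.8.1' into (4, 8, 1); dotted integers only."""
    return tuple(int(part) for part in version.split("."))


# Excerpt only: the two pins quoted in this report, not the full file.
sample = """\
# constraints excerpt (assumed layout: one pin per line)
flake8==4.0.1
importlib-metadata==4.8.1
"""

pins = parse_constraints(sample)

# flake8 4.0.1 requires importlib-metadata<4.3 when python_version < "3.8".
conflict = not (as_tuple(pins["importlib-metadata"]) < (4, 3))
print(pins["flake8"], pins["importlib-metadata"], "conflict:", conflict)
```

The bound only applies on Python versions below 3.8, which is why the report is specific to the 3.7 constraints file.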
What you expected to happen
Installing dependencies using --constraint https://raw.githubusercontent.com/apache/airflow/constraints-2.2.0/constraints-3.7.txt should be possible without conflicts.
How to reproduce
python3.7
output:
Anything else
No response