tamsanh / kedro-great

The easiest way to integrate Kedro and Great Expectations
MIT License

PyArrow dependency conflict #12

Open richard-simoes-mck opened 3 years ago

richard-simoes-mck commented 3 years ago

I'm not sure which version of pyarrow should be used with the plugin, and I cannot resolve the ContextualVersionConflict error.

With pyarrow 0.17.1:
Command: kedro great
pkg_resources.ContextualVersionConflict: (pyarrow 0.17.1 (/Users/Richard_Simoes/.virtualenvs/py3-data/lib/python3.7/site-packages), Requirement.parse('pyarrow<2.0dev,>=1.0.0; extra == "bqstorage"'), {'google-cloud-bigquery'})

With pyarrow 1.0.1:
Command: kedro great
pkg_resources.ContextualVersionConflict: (pyarrow 1.0.1 (/Users/Richard_Simoes/.virtualenvs/py3-data/lib/python3.7/site-packages), Requirement.parse('pyarrow<1.0.0,>=0.12.0; extra == "pandas"'), {'kedro'})

Sitin commented 3 years ago

I confirm that I have the same conflict between kedro 0.16.6 dataset extensions and kedro-great.

From kedro build-reqs (i.e. pip-compile):

Could not find a version that matches pyarrow<1.0.0,<3.0dev,>=0.12.0,>=1.0.0 (from kedro[pandas,pandas.csvdataset,pandas.parquetdataset,yaml.yamldataset]==0.16.6->-r <my project root>/src/requirements.in (line 13))
Tried: 0.9.0, 0.10.0, 0.11.0, 0.11.0, 0.11.1, 0.11.1, 0.12.0, 0.12.0, 0.12.1, 0.12.1, 0.13.0, 0.13.0, 0.14.0, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.1, 0.16.0, 0.16.0, 0.17.0, 0.17.0, 0.17.1, 0.17.1, 1.0.0, 1.0.0, 1.0.1, 1.0.1, 2.0.0, 2.0.0, 2.0.0
There are incompatible versions in the resolved dependencies:
  pyarrow<1.0.0,>=0.12.0 (from kedro[pandas,pandas.csvdataset,pandas.parquetdataset,yaml.yamldataset]==0.16.6->-r <my project root>/src/requirements.in (line 13))
  pyarrow<3.0dev,>=1.0.0 (from google-cloud-bigquery[bqstorage,pandas]==2.6.1->pandas-gbq==0.14.1->kedro[pandas,pandas.csvdataset,pandas.parquetdataset,yaml.yamldataset]==0.16.6->-r <my project root>/src/requirements.in (line 13))
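
To make the conflict concrete: the two requirements above describe ranges that do not overlap, so no pyarrow release can satisfy both. The sketch below checks a few of the versions pip-compile tried against both ranges. It is only an illustration with a simplified version parser (the helper names are made up here, and `<3.0dev` is approximated as `<3.0`), not how pip or pip-compile actually resolve dependencies.

```python
def parse_version(version: str) -> tuple:
    """Turn a plain 'X.Y.Z' string into a tuple of ints (no pre-release handling)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Ranges copied from the pip-compile output above.
KEDRO_PANDAS = ("0.12.0", "1.0.0")  # pyarrow<1.0.0,>=0.12.0
BIGQUERY = ("1.0.0", "3.0")         # pyarrow<3.0dev,>=1.0.0 (upper bound approximated)

# A subset of the versions listed under "Tried".
candidates = ["0.12.0", "0.17.1", "1.0.0", "1.0.1", "2.0.0"]

compatible = [
    v for v in candidates
    if in_range(v, *KEDRO_PANDAS) and in_range(v, *BIGQUERY)
]
print(compatible)  # empty: one range ends exactly where the other begins
```

Since the first range excludes 1.0.0 and the second starts at 1.0.0, pinning pyarrow to any single version cannot fix this. One of the two requirements has to be relaxed or dropped.
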

Sitin commented 3 years ago

I think we can fix this by making google-cloud-bigquery optional, similar to what was requested for PySpark in [#7].
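
As a rough sketch of what "optional" could look like on the code side: the package is imported lazily, and an error is raised only when BigQuery support is actually used. This is not taken from the kedro-great source. The helper names `optional_import` and `require_bigquery` are hypothetical, and it assumes the plugin would reach BigQuery through the `google.cloud.bigquery` module.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(module_name: str) -> Optional[ModuleType]:
    """Import a module if it is installed, otherwise return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


# Hypothetical: resolved once at import time, None when the package is absent.
bigquery = optional_import("google.cloud.bigquery")


def require_bigquery() -> ModuleType:
    """Return the bigquery module, or fail with a clear message if it is missing."""
    if bigquery is None:
        raise RuntimeError(
            "BigQuery support requires the optional google-cloud-bigquery package."
        )
    return bigquery
```

With this pattern, users who do not need BigQuery would not have to install google-cloud-bigquery at all, so its pyarrow requirement would no longer enter the resolution.
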