OpenMined / PipelineDP

PipelineDP is a Python framework for applying differentially private aggregations to large datasets using batch processing systems such as Apache Spark, Apache Beam, and more.
https://pipelinedp.io/
Apache License 2.0

python-dp version conflict #521

Open jpgoldberg opened 4 months ago

jpgoldberg commented 4 months ago

Description

The installation instructions lead to errors.

I tested with several Python versions and each produced the same problem. For what follows I am using Python 3.10.6, since the project is listed as supporting 3.10 but nothing newer. (See System Information below for details.)

How to Reproduce

Environment and versions:

  1. run pip install pipeline-dp

Expected Behavior

Successful installation

Screenshots

ERROR: Cannot install pipeline-dp==0.1.0, pipeline-dp==0.1.1, pipeline-dp==0.2.0 and pipeline-dp==0.2.1 because these package versions have conflicting dependencies.

The conflict is caused by:
    pipeline-dp 0.2.1 depends on python-dp>=1.1.5rc4
    pipeline-dp 0.2.0 depends on python-dp<2.0.0 and >=1.1.1
    pipeline-dp 0.1.1 depends on python-dp<2.0.0 and >=1.1.1
    pipeline-dp 0.1.0 depends on python-dp<2.0.0 and >=1.1.1

System Information

Additional Context

More complete transcript

 % pip install pipeline-dp
Collecting pipeline-dp
  Using cached pipeline_dp-0.2.1-py2.py3-none-any.whl.metadata (4.9 kB)
Requirement already satisfied: numpy<2.0.0,>=1.20.1 in /Users/jeffrey/.pyenv/versions/3.10.6/lib/python3.10/site-packages (from pipeline-dp) (1.23.3)
INFO: pip is looking at multiple versions of pipeline-dp to determine which version is compatible with other requirements. This could take a while.
  Using cached pipeline_dp-0.2.0-py2.py3-none-any.whl.metadata (5.5 kB)
  Using cached pipeline_dp-0.1.1-py2.py3-none-any.whl.metadata (5.5 kB)
  Using cached pipeline_dp-0.1.0-py2.py3-none-any.whl.metadata (5.0 kB)
Collecting absl-py<2.0.0,>=1.0.0 (from pipeline-dp)
  Using cached absl_py-1.4.0-py3-none-any.whl.metadata (2.3 kB)
Collecting apache-beam<3.0.0,>=2.35.0 (from pipeline-dp)
  Using cached apache_beam-2.56.0.tar.gz (2.4 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting dp-accounting<0.0.3,>=0.0.2 (from pipeline-dp)
  Using cached dp_accounting-0.0.2-py3-none-any.whl.metadata (1.7 kB)
Collecting pyspark<4.0.0,>=3.2.0 (from pipeline-dp)
  Using cached pyspark-3.5.1.tar.gz (317.0 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
ERROR: Cannot install pipeline-dp==0.1.0, pipeline-dp==0.1.1, pipeline-dp==0.2.0 and pipeline-dp==0.2.1 because these package versions have conflicting dependencies.

The conflict is caused by:
    pipeline-dp 0.2.1 depends on python-dp>=1.1.5rc4
    pipeline-dp 0.2.0 depends on python-dp<2.0.0 and >=1.1.1
    pipeline-dp 0.1.1 depends on python-dp<2.0.0 and >=1.1.1
    pipeline-dp 0.1.0 depends on python-dp<2.0.0 and >=1.1.1

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
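
Each requirement listed in the conflict looks satisfiable on its own, so one possible explanation (an assumption, not something the transcript confirms) is that pip finds no python-dp distribution matching the local interpreter or platform. A minimal sketch for collecting the details that would help check this, using only the standard library; the function name `describe_environment` is made up for illustration:

```python
import platform
import sys


def describe_environment() -> dict:
    """Collect interpreter and platform details relevant to wheel selection."""
    return {
        "python_version": platform.python_version(),        # e.g. "3.10.6"
        "implementation": platform.python_implementation(),  # e.g. "CPython"
        "system": platform.system(),                          # OS name
        "machine": platform.machine(),                        # CPU architecture
        "executable": sys.executable,                         # interpreter in use
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Pasting this output under System Information would show which interpreter and architecture pip was resolving for.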

Note that I was able to install manually

dp-accounting forced a downgrade of absl-py, but the versions I ended up with after manually installing those in the above order were:

% pip list | egrep '(apache-beam)|(absl-py)|(dp-accounting)|(pyspark)'
absl-py                           1.4.0
apache-beam                       2.56.0
dp-accounting                     0.4.4
pyspark                           3.5.1
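
The same check can be done without `egrep`, which may help on systems where it is unavailable. This is a rough sketch using the standard library's `importlib.metadata`; the package names are taken from the grep pattern above, and the helper name `installed_versions` is made up for illustration:

```python
from importlib.metadata import PackageNotFoundError, version

# Package names copied from the grep pattern in the transcript.
PACKAGES = ["absl-py", "apache-beam", "dp-accounting", "pyspark"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    for name, found in installed_versions(PACKAGES).items():
        print(f"{name:<30} {found or 'not installed'}")
```

Adding `python-dp` and `pipeline-dp` to the list would also show whether either of them was installed at all.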

dvadym commented 4 months ago

Thanks for filing this issue! I'll try to find a Mac and fix it.