OpenMined / PyDP

The Python Differential Privacy Library. Built on top of: https://github.com/google/differential-privacy
Apache License 2.0

PyDP does not work in the Docker container #366

Open replomancer opened 3 years ago

replomancer commented 3 years ago

Description

Importing pydp fails inside containers running the image built from the repository's Dockerfile.

How to Reproduce

Build the image and run the container:

docker build -t pydp:test .
docker run -it pydp:test

Running make test inside the container fails with:

  ValueError

  Directory dist/python_dp-1.0.3-cp39-cp39-linux_x86_64.whl does not exist

You can also run python inside the container and try this import:

>>> import pydp
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/PyDP/.venv/lib/python3.9/site-packages/pydp/__init__.py", line 5, in <module>
    from pydp import algorithms
  File "/root/PyDP/.venv/lib/python3.9/site-packages/pydp/algorithms/__init__.py", line 2, in <module>
    from . import laplacian
  File "/root/PyDP/.venv/lib/python3.9/site-packages/pydp/algorithms/laplacian/__init__.py", line 2, in <module>
    from ._bounded_algorithms import BoundedMean
  File "/root/PyDP/.venv/lib/python3.9/site-packages/pydp/algorithms/laplacian/_bounded_algorithms.py", line 2, in <module>
    from .._algorithm import BoundedAlgorithm
  File "/root/PyDP/.venv/lib/python3.9/site-packages/pydp/algorithms/_algorithm.py", line 7, in <module>
    from .._pydp import _algorithms
ModuleNotFoundError: No module named 'pydp._pydp'
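
The module that cannot be found, pydp._pydp, is the compiled extension (the other traceback in this thread shows it on disk as _pydp.so). As a minimal sketch, the check below lists any compiled file whose name starts with _pydp inside the installed package directory, without importing pydp itself. The helper name find_extension_files is hypothetical and is not part of PyDP.

```python
import importlib.util
from pathlib import Path
from typing import List


def find_extension_files(package: str, stem: str = "_pydp") -> List[Path]:
    """List compiled extension files named `stem*` inside an installed package.

    find_spec() on a top-level name only locates the package; it does not
    execute its __init__.py, so this works even when `import pydp` fails.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or not a package at all
    found: List[Path] = []
    for location in spec.submodule_search_locations:
        directory = Path(location)
        # .so on Linux/macOS, .pyd on Windows
        found.extend(sorted(directory.glob(stem + "*.so")))
        found.extend(sorted(directory.glob(stem + "*.pyd")))
    return found


if __name__ == "__main__":
    files = find_extension_files("pydp")
    if files:
        for path in files:
            print("found:", path)
    else:
        print("no compiled _pydp extension found in the installed pydp package")
```

An empty result suggests the native build step did not run or its output was never copied into the package, which would be consistent with the missing wheel reported by make test.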

System Information

nadavaviv commented 3 years ago

I ran into an issue with Docker too. This is what I get when I try to import pydp inside the container:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pydp/__init__.py", line 5, in <module>
    from pydp import algorithms
  File "/usr/local/lib/python3.7/site-packages/pydp/algorithms/__init__.py", line 2, in <module>
    from . import laplacian
  File "/usr/local/lib/python3.7/site-packages/pydp/algorithms/laplacian/__init__.py", line 2, in <module>
    from ._bounded_algorithms import BoundedMean
  File "/usr/local/lib/python3.7/site-packages/pydp/algorithms/laplacian/_bounded_algorithms.py", line 2, in <module>
    from .._algorithm import BoundedAlgorithm
  File "/usr/local/lib/python3.7/site-packages/pydp/algorithms/_algorithm.py", line 7, in <module>
    from .._pydp import _algorithms
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/pydp/_pydp.so)

This is the Dockerfile:

FROM python:3.7.6-buster

RUN apt-get update -y && apt-get install -y python3-dev libstdc++6

WORKDIR /usr/src/app

COPY requirements.txt .
RUN pip3 install -r requirements.txt

COPY . .

EXPOSE 80

ENTRYPOINT [ "waitress-serve" ]
CMD [ "--call", "app:main" ]

Thoughts?
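
The ImportError above says the system libstdc++ does not export the GLIBCXX_3.4.26 symbol version that _pydp.so was linked against. GLIBCXX_3.4.26 first appeared with GCC 9, while Debian buster ships GCC 8, so installing libstdc++6 from the buster repositories is unlikely to help. A minimal sketch for listing the versions a given libstdc++ actually provides, roughly equivalent to running strings on the library and grepping for GLIBCXX. The function names are hypothetical; the library path is the one from the traceback.

```python
import re
from pathlib import Path
from typing import List, Tuple

Version = Tuple[int, ...]


def glibcxx_versions(data: bytes) -> List[Version]:
    """Return the sorted GLIBCXX_x.y[.z] version tags found in a binary blob."""
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data))
    return sorted(tuple(int(part) for part in tag.decode().split(".")) for tag in tags)


def provides(data: bytes, required: str) -> bool:
    """True if the blob contains the tag GLIBCXX_<required>."""
    wanted = tuple(int(part) for part in required.split("."))
    return wanted in glibcxx_versions(data)


if __name__ == "__main__":
    # Path taken from the ImportError message; adjust for other distributions.
    lib = Path("/usr/lib/x86_64-linux-gnu/libstdc++.so.6")
    if lib.exists():
        blob = lib.read_bytes()
        versions = glibcxx_versions(blob)
        if versions:
            print("newest GLIBCXX:", ".".join(str(n) for n in versions[-1]))
        print("has GLIBCXX_3.4.26:", provides(blob, "3.4.26"))
    else:
        print("libstdc++ not found at", lib)
```

If 3.4.26 is missing, the usual options are a base image with a newer libstdc++ or building PyDP inside the same image that will run it.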

idhamari commented 1 year ago

I get a similar error when trying to install on Linux (without Docker).

It would be nice to add instructions to the README showing how to install from source:

        ./prereqs_linux.sh
        ./build_PyDP.sh
        pip install .
        cp src/pydp/_pydp.so ~/.local/lib/python3.10/site-packages/pydp
        # Test
        python -c "import pydp as dp"
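
The cp step above hard-codes a user site-packages path for Python 3.10. As a sketch only, the same copy can be done without guessing the path by asking Python where the installed pydp package lives. copy_extension is a hypothetical helper, and src/pydp/_pydp.so is assumed to be the build output, as in the commands above.

```python
import importlib.util
import shutil
from pathlib import Path


def copy_extension(built_so: Path, package_dir: Path) -> Path:
    """Copy a built extension file into an installed package directory."""
    if not built_so.is_file():
        raise FileNotFoundError("built extension not found: {}".format(built_so))
    if not package_dir.is_dir():
        raise NotADirectoryError("package directory not found: {}".format(package_dir))
    target = package_dir / built_so.name
    shutil.copy2(built_so, target)  # copy2 also preserves file metadata
    return target


if __name__ == "__main__":
    built = Path("src/pydp/_pydp.so")  # build output, run from the repo root
    # find_spec locates the package without executing its __init__.py
    spec = importlib.util.find_spec("pydp")
    if spec is None or not spec.submodule_search_locations:
        print("pydp is not installed; run `pip install .` first")
    elif not built.is_file():
        print("no build output at", built)
    else:
        package_dir = Path(list(spec.submodule_search_locations)[0])
        print("copied to", copy_extension(built, package_dir))
```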

Thoughts?

For the Docker problem, I would try running each command from the Dockerfile manually inside the container to see which one fails. For example, this URL is incomplete:

        ARG BAZELISK_DOWNLOAD_URL=https://github.com/bazelbuild/bazelisk/releases/download/

and should be replaced with something like:

        ARG BAZELISK_DOWNLOAD_URL=https://github.com/bazelbuild/bazelisk/releases/download/v1.17.0/bazelisk-linux-amd64

In my case, I modified the command assigned to result in the _get_python_include function so that it reads the include path from sysconfig. The function is in ~/.cache/bazel/_bazel_user/3738acce07ef18f8aca7932bae86e827/external/pybind11_bazel/python_configure.bzl:

        def _get_python_include(repository_ctx, python_bin):
            """Gets the python include path."""

            result = _execute(
                repository_ctx,
                [
                    python_bin,
                    "-c",
                    "from __future__ import print_function; import sysconfig; print(sysconfig.get_paths()['include'])",
                ],
                error_msg = "Problem getting python include path.",
                error_details = ("Is the Python binary path set up right? " +
                                 "(See ./configure or " + _PYTHON_BIN_PATH + ".) " +
                                 "Is distutils installed?"),
            )
            return result.stdout.splitlines()[0]
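
The one-liner passed to python_bin above can be tried directly to see what Bazel would receive. The sketch below uses the same sysconfig.get_paths()['include'] lookup and additionally checks that Python.h exists there, since a missing header (for example when python3-dev is not installed) would still break the build. The function names are hypothetical.

```python
import sysconfig
from pathlib import Path


def python_include_dir() -> Path:
    """Return the C header directory reported by the running interpreter."""
    return Path(sysconfig.get_paths()["include"])


def has_python_header(include_dir: Path) -> bool:
    """True if Python.h is present, i.e. the development headers are installed."""
    return (include_dir / "Python.h").is_file()


if __name__ == "__main__":
    include_dir = python_include_dir()
    print("include path:", include_dir)
    print("Python.h present:", has_python_header(include_dir))
```

Using the standard-library sysconfig module avoids relying on distutils, which was removed from the standard library in Python 3.12.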

I also solved the issue by updating the WORKSPACE file to use the latest pybind11_bazel from the master branch:

        http_archive(
            name = "pybind11_bazel",
            #strip_prefix = "pybind11_bazel-26973c0ff320cb4b39e45bc3e4297b82bc3a6c09",
            #urls = ["https://github.com/pybind/pybind11_bazel/archive/26973c0ff320cb4b39e45bc3e4297b82bc3a6c09.zip"],
            strip_prefix = "pybind11_bazel-master",
            urls = ["https://github.com/pybind/pybind11_bazel/archive/refs/heads/master.zip"],
        )