Closed: edublancas closed this issue 2 years ago
I think I am getting a sense of this issue. But how do we know which packages are not on PyPI? Maybe try looking into `requirements.txt` and see if a line starts with `git`? Or can we just download all of the user's packages under their `lib/` dir?
So here is my understanding: the command that creates a Docker image is `soopervisor add`, and the resulting Dockerfile looks like this:

```dockerfile
FROM condaforge/mambaforge:4.10.1-0

COPY requirements.lock.txt project/requirements.lock.txt
RUN pip install --requirement project/requirements.lock.txt && rm -rf /root/.cache/pip/

COPY dist/* project/
WORKDIR /project/

RUN tar --strip-components=1 -zxvf *.tar.gz
RUN cp -r /project/ploomber/ /root/.ploomber/ || echo 'ploomber home does not exist'
```
So instead of just running `RUN pip install --requirement project/requirements.lock.txt && rm -rf /root/.cache/pip/`, we should add something like `RUN pip install --root ...`
Not pip install.
We should first check if a dependency `lib/` exists; if so, we should copy it into the image:

```
lib/
    package_a/
    package_b/
```

Then we should add the path of `lib/` to `PYTHONPATH` so it's available for import within Python.
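In Dockerfile terms, that could be something like this (a sketch only; the `/project/lib` destination path is my assumption, not the actual template):

```dockerfile
# Copy the user's vendored packages into the image
COPY lib/ /project/lib/
# Make them importable from any Python process in the container
ENV PYTHONPATH "${PYTHONPATH}:/project/lib"
```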
Yeah, so downloading the packages is up to the user (maybe they just copy-paste the files into `lib/`). The assumption is that the user knows how to get the package's source code into `lib/`; we don't have to figure out if they're on PyPI or not.

The solution is to ensure that whatever is in `lib/` is importable from Python; to do that, we need to ensure the folder is on `PYTHONPATH`.

Note that `lib/` and `requirements.txt` can be used at the same time; they are not mutually exclusive.
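A quick sketch of what "on `PYTHONPATH` means importable" looks like in practice. A temp directory stands in for `lib/`, and `package_a` (the hypothetical package name used in this thread) is created on the fly:

```python
import os
import sys
import tempfile

# Simulate a project's lib/ directory containing one vendored package.
lib = tempfile.mkdtemp()
pkg = os.path.join(lib, "package_a")
os.makedirs(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("VALUE = 42\n")

# Adding lib/ to sys.path is what `ENV PYTHONPATH ...` achieves for the container.
sys.path.insert(0, lib)

import package_a

print(package_a.VALUE)  # prints 42
```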
So to check if a dependency lib exists, I did something like this: `python -m site --user-site`, which returns `/Users/xilinwang/Library/Python/2.7/lib/python/site-packages`, to get all packages for Python. Is this the `lib/` you've been talking about? So the rest is to add something like this to the Dockerfile: `ENV PYTHONPATH "${PYTHONPATH}:/Users/xilinwang/Library/Python/2.7/lib/python/site-packages"`?
On the second part, yes; on the first, no. This is a user's custom lib that's not part of the Python lib path.
Sorry, I am not familiar with Python packaging. Where is a user's custom package usually located? Is it in their `$PYTHONPATH`? For example, if I download a package from git, where exactly will it be located?
No worries, this thread should give you the context you need on importing custom modules. In short, you have to have the module present, and you need to specify the path to it as part of the interpreter's configuration (`PYTHONPATH`).
@Wxl19980214 You can check if the library exists via Python with `Path('lib').exists()`, then use a parameter to add it to the Docker image in a similar manner to this:
```
{%- set name = 'environment.lock.yml' if conda else 'requirements.lock.txt' %}
COPY {{name}} project/{{name}}
```
Looks like Django lol. I will take a look. But should I change all of the Dockerfiles? We have a Dockerfile under each backend platform?
Yes, start with one, see that it works, then replicate it to the rest of the backends.
Users may want to install packages that are not available publicly on PyPI. A typical use case is an internal library stored in some git repository or a private registry.
For example, assuming the local environment is properly configured to authenticate with a private git repository, someone can easily install that package with a `requirements.txt`.

However, in some cases, `soopervisor export` might not run locally but in a CI/CD server; hence, installing from private repositories won't work unless the CI/CD server is properly authenticated to fetch the repository (`https://path.to/repo`).

This also applies to Ploomber Cloud users, since the Docker image is built remotely. In Ploomber Cloud's case, this solution is even more desirable, since users will prefer not to disclose their credentials.
The solution is for the Docker image to be able to pick up packages from a local directory. For example, a user might have the following layout:
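The layout block appears to have been lost in formatting here; based on the `lib/` structure discussed earlier in the thread, it is presumably something like:

```
lib/
    package_a/
    package_b/
```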
The Dockerfile should configure `PYTHONPATH` so that it includes the `lib/` directory; hence, when starting Python, `import package_a` and `import package_b` will work.

To get `package_a` under `lib/`, users can use `pip install`, but we should do some research to find out how to do it; these two look promising:

The final thing to take into account is that for packages that are not pure Python (e.g., the ones that use C extensions), if the local OS is different from the OS where the pipeline will be executed, we might run into issues; but I think we can ignore that for now.
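One candidate for that research (hedged; the exact behavior should be verified): pip can install a package and its dependencies into an arbitrary directory with `--target`, which could populate `lib/` directly:

```
# Install a private package (URL is a placeholder) into lib/
pip install --target lib/ "git+https://path.to/repo"
```

Note that `--target` installs the dependencies alongside the package, so `lib/` would be self-contained for the image build.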
cc @idomic @Wxl19980214