googledatalab / datalab

Interactive tools and developer experiences for Big Data on Google Cloud Platform.
Apache License 2.0

Different python installs in custom datalab docker #2148

Open rafa-guedes opened 5 years ago

rafa-guedes commented 5 years ago

I have created a custom Datalab Docker image to install Python libraries we use often. The Dockerfile is shown below. However, the libraries are not available in Jupyter after creating Datalab instances from this image, because Jupyter uses Python binaries from separate, dedicated environments:

!which python
/usr/local/envs/py2env/bin/python
!which python3
/usr/local/envs/py3env/bin/python3
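
The same check can be made from inside the kernel itself. Below is a minimal sketch, assuming it is run in a Python 3 notebook cell; the helper name `describe_kernel` and the package name `shapely` are hypothetical examples and are not taken from the requirements.txt used here.

```python
import importlib.util
import sys


def describe_kernel(packages):
    """Return the running interpreter's path and whether each package is importable."""
    available = {
        name: importlib.util.find_spec(name) is not None  # top-level names only
        for name in packages
    }
    return {"executable": sys.executable, "prefix": sys.prefix, "available": available}


# "shapely" is a hypothetical example; substitute names from requirements.txt.
report = describe_kernel(["json", "shapely"])
print(report["executable"])
for name, found in report["available"].items():
    print(name, "importable" if found else "missing")
```

If the printed interpreter path is under /usr/local/envs/ while the packages were installed with the system pip or pip3, the kernel would be expected to report them as missing.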

What is the correct way to install Python libraries in a custom Docker image so that they are visible to these environments? Alternatively, how can I specify which Python binary Datalab should use?

Thanks

FROM gcr.io/cloud-datalab/datalab:latest

COPY ./requirements.txt /tmp

RUN echo "--------------- Adding missing pubkeys ---------------" && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 04EE7237B7D453EC && \
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 648ACFD622F3D138

RUN echo "--------------- Installing system packages ---------------" && \
    apt -y update && \
    apt install -y cython \
                   cython3 \
                   libgeos-dev \
                   libproj-dev \
                   libsnappy-dev \
                   python3-dev \
                   python3-pip

RUN echo "--------------- Installing python2 packages ---------------" && \
    cd /tmp && \
    pip install --upgrade --no-cache-dir pip && \
    pip install --upgrade --no-cache-dir -r requirements.txt

RUN echo "--------------- Installing python3 packages ---------------" && \
    cd /tmp && \
    pip3 install --upgrade --no-cache-dir numpy && \
    pip3 install --upgrade --no-cache-dir -r requirements.txt