StatCan / aaw-kubeflow-containers

Containers built to be used with Kubeflow for Data Science

Create Good Virtual Environment, Possibly Enable by Default #304

Closed StanHatko closed 1 month ago

StanHatko commented 2 years ago

In the jupyterlab-cpu image there are relatively few packages present by default. Adding additional packages to the main system can cause conflicts with the kubeflow system (as I remember occurred before for PyTorch). Fortunately it's possible to create a virtual environment with many additional packages. It would probably be convenient for users to have such a virtual environment installed by default, rather than having to reinstall packages every time. Here is a conda environment with a bunch of packages (if you would suggest more I can add them):

conda create -n pycpu python==3.8.8 ipython==7.30.1 \
    gdal==3.3.3 geopandas==0.10.2 numpy==1.21.4 opencv==4.5.3 pandas==1.3.5 rasterio==1.2.10 scikit-learn==1.0.1 scipy==1.7.3 xgboost==1.5.0 \
    pytorch==1.4.0 torchaudio==0.4.0 torchvision==0.5.0 cpuonly==2.0 \
    -c pytorch -c conda-forge

These packages (other than the pytorch ones, whose GPU versions are already present in the jupyterlab-pytorch image) could also be added to the torch environment in the jupyterlab-pytorch image.

In addition, it would be good to have this virtual environment enabled by default, so users land in it automatically rather than having to activate it in every new session.

StanHatko commented 2 years ago

I can create a pull request, but first I want to check whether the AAW team wants the full changes here, part of them (like having this virtual environment present but not enabled by default), or none of them.

StanHatko commented 2 years ago

I'll create the pull request tomorrow so it can be tested.

StanHatko commented 2 years ago

Created the pull request https://github.com/StatCan/aaw-kubeflow-containers/pull/306, which adds the pycpu environment with the command described above (with slight adjustments, like --yes added). The full command is:

RUN conda create -n pycpu --yes \
      python==3.8.8 ipython==7.30.1 \
      gdal==3.3.3 geopandas==0.10.2 numpy==1.21.4 opencv==4.5.3 pandas==1.3.5 rasterio==1.2.10 scikit-learn==1.0.1 scipy==1.7.3 xgboost==1.5.0 \
      pytorch==1.4.0 torchaudio==0.4.0 torchvision==0.5.0 cpuonly==2.0 \
      -c pytorch -c conda-forge && \
    conda clean --all -f -y && \
    fix-permissions $CONDA_DIR && \
    fix-permissions /home/$NB_USER

It works for me on localhost, where interestingly the bash -l issue didn't occur (conda activate pycpu worked in the default bash for me, which displayed the AAW logo and everything).

For now I didn't make it the default (by editing .bashrc); I just created the conda env. I want to do things one step at a time: first let's make sure it works on AAW, and then we may choose to change the default settings.

brendangadd commented 2 years ago

@blairdrummond @bryanpaget ^

Jose-Matsuda commented 2 years ago

@StanHatko You can now use these changes if you'd like, since they have been merged to master (you can use, say, jupyterlab-cpu:v1). However, I decided to check how big the image is: when I built just jupyterlab-cpu, the image size came out at a formidable 13.1 GB.

I got bit in the behind by this this morning, because one of the GitHub runners in our CI was failing with this change (specifically for the SAS image), as it was just a bit too much for it to handle. https://github.com/StatCan/aaw-kubeflow-containers/pull/357#issuecomment-1152423782

This is something we may want to revert if another solution that doesn't greatly increase the image size can't be found (this sha will work as well: dadf7109).

StanHatko commented 2 years ago

@Jose-Matsuda It's unfortunate that the conda env is making the Docker image too large.

One thing that I think could be useful is adding a few pre-tested conda requirements.txt files, created with

conda list -e >req-pycpu.txt

The req-pycpu.txt file could be added to the image and used by users, if necessary, as follows:

conda create -n pycpu --file req-pycpu.txt

Souheil-Yazji commented 1 month ago

Closed as stale. Users should create their own venv regardless.