InseeFrLab / images-datascience

Collection of Docker images to build the data science catalog of the Onyxia project
MIT License

Docker image including R, Python and Julia #79

Closed fBedecarrats closed 1 year ago

fBedecarrats commented 1 year ago

This issue was initially posted on the helm-charts-interactive-services repo; I am moving it here following @avouacr's recommendation. I need a Docker image that includes R, Python and Julia, with VSCode. A solution to my need would be:

In principle, I would be happy to fork images-datascience and propose a PR. However, I don't clearly see how I could build my work-in-progress Docker images and launch them on the SSP Cloud to check that they work before submitting PRs. Is this possible, and is there documentation on it? Or should I rely on the maintainers to create the experimental Docker image and make it available in the SSP Cloud service catalog?

avouacr commented 1 year ago

Hi @fBedecarrats,

I added a multi-language image in #82 and the corresponding service on the SSP Cloud, which you are welcome to test! Because we need both to limit the size of our images and to comply with the rather complicated build workflow of this repo, we chose to build it on our r-minimal base image, adding a minimal installation of Python with reticulate and a minimal installation of Julia. I also added the R extension for VSCode. To tailor this image to your needs, I encourage you to build your own Docker image based on it. Something along the lines of:

FROM inseefrlab/onyxia-vscode-r-python-julia:r4.2.1

USER root

RUN /rocker_scripts/install_geospatial.sh && \
    # Install additional R packages
    install2.r -e -s your_list_of_r_packages && \
    # Install additional Python packages
    pip install your_list_python_packages

USER 1000

Then you can use this image as a custom image when configuring a VSCode service on the SSP Cloud. And since I know you are a devoted RStudio user, you could even install RStudio yourself in the image, starting instead FROM inseefrlab/onyxia-r-python-julia:r4.2.1 and taking inspiration from what we do in our image.
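A minimal sketch of the build-and-push step, assuming the Dockerfile above sits in the directory passed as the build context and that Docker Hub is the registry. The image name is hypothetical, and the run and build_and_push helpers are illustrative names, not part of this repo. Setting DRY_RUN=1 prints the commands instead of running them.

```shell
# Run a command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build the image from a build context and push it to the registry.
# $1: image name, e.g. your-dockerhub-user/my-r-python-julia:latest (hypothetical)
# $2: directory containing the Dockerfile
build_and_push() {
  local image="$1"
  local context="$2"
  run docker build -t "$image" "$context" &&
    run docker push "$image"
}

# Usage (after `docker login`):
#   build_and_push your-dockerhub-user/my-r-python-julia:latest .
```

The pushed image name is then presumably what goes into the custom image field when configuring the service.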

Regarding contributions to the project, we are currently working on improving our test pipeline and on giving users a way to automatically test their pull requests. I'll keep you updated :)

fBedecarrats commented 1 year ago

Thanks a lot @avouacr! Unfortunately, it seems that the service cannot launch. I have tried several times since this morning, without success. I initially thought that it was due to another pod I was running that was consuming too many resources, but now that this other pod has been deleted, the new service still refuses to launch. In the Onyxia interface, I click on "Catalogue de services" > "Services interactifs" > "Afficher tous" > "Vscode-r-python-julia" > "Lancer" > "Lancer". On "My services", a card for this service appears, but "exécution depuis" still displays "en cours" with a spinner that spins endlessly.

avouacr commented 1 year ago

There was an issue with the permissions given to the init script. It should be working now.

fBedecarrats commented 1 year ago

Yes, it works, thank you! I'll figure out how to build the image you recommend on Docker Hub and I'll report back.

fBedecarrats commented 1 year ago

Sorry to bother you, but I am failing to create a custom image that builds upon the R + Python + Julia image as you suggested. I am trying to do this on a local computer (WSL + Docker Desktop), but the build fails and I cannot figure out where the problem comes from. I have a couple of ideas, but I am not sure:

Here is the simplified Dockerfile I am trying to build:

FROM inseefrlab/onyxia-vscode-r-python-julia:r4.2.1

USER root

RUN /rocker_scripts/install_geospatial.sh

USER 1000

And here is the error I get:

PS C:\Users\fbede\Documents\Statistiques> docker build docker_pa_matching
[+] Building 50.5s (5/5) FINISHED
 => [internal] load build definition from Dockerfile                                                               0.0s
 => => transferring dockerfile: 339B                                                                               0.0s
 => [internal] load .dockerignore                                                                                  0.0s
 => => transferring context: 2B                                                                                    0.0s
 => [internal] load metadata for docker.io/inseefrlab/onyxia-vscode-r-python-julia:r4.2.1                          0.9s
 => CACHED [1/2] FROM docker.io/inseefrlab/onyxia-vscode-r-python-julia:r4.2.1@sha256:24b6b8a449811dc645886346ef3  0.0s
 => ERROR [2/2] RUN /rocker_scripts/install_geospatial.sh                                                         49.4s
------
 > [2/2] RUN /rocker_scripts/install_geospatial.sh:
#5 0.462 find: ‘/var/lib/apt/lists/*’: No such file or directory
#5 5.634 Err:1 http://archive.ubuntu.com/ubuntu focal InRelease
#5 5.634   403  connecting to archive.ubuntu.com:80: connecting to 91.189.91.39:80: dial tcp 91.189.91.39:80: i/o timeout [IP: 91.189.91.39 80]
#5 5.634 Err:2 http://security.ubuntu.com/ubuntu focal-security InRelease
#5 5.634   403  connecting to security.ubuntu.com:80: connecting to 91.189.91.39:80: dial tcp 91.189.91.39:80: i/o timeout [IP: 91.189.91.39 80]
#5 40.74 Err:3 http://archive.ubuntu.com/ubuntu focal-updates InRelease
#5 40.74   403  connecting to archive.ubuntu.com:80: connecting to 185.125.190.36:80: dial tcp 185.125.190.36:80: i/o timeout [IP: 185.125.190.36 80]
#5 45.55 Get:4 http://archive.ubuntu.com/ubuntu focal-backports InRelease [108 kB]
#5 48.96 Get:5 http://archive.ubuntu.com/ubuntu focal-backports/main amd64 Packages [55.2 kB]
#5 49.17 Get:6 http://archive.ubuntu.com/ubuntu focal-backports/universe amd64 Packages [28.6 kB]
#5 49.33 Reading package lists...
#5 49.35 E: The repository 'http://archive.ubuntu.com/ubuntu focal InRelease' is not signed.
#5 49.35 E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal/InRelease  403  connecting to archive.ubuntu.com:80: connecting to 91.189.91.39:80: dial tcp 91.189.91.39:80: i/o timeout [IP: 91.189.91.39 80]
#5 49.35 E: Failed to fetch http://security.ubuntu.com/ubuntu/dists/focal-security/InRelease  403  connecting to security.ubuntu.com:80: connecting to 91.189.91.39:80: dial tcp 91.189.91.39:80: i/o timeout [IP: 91.189.91.39 80]
#5 49.35 E: The repository 'http://security.ubuntu.com/ubuntu focal-security InRelease' is not signed.
#5 49.35 E: Failed to fetch http://archive.ubuntu.com/ubuntu/dists/focal-updates/InRelease  403  connecting to archive.ubuntu.com:80: connecting to 185.125.190.36:80: dial tcp 185.125.190.36:80: i/o timeout [IP: 185.125.190.36 80]
#5 49.35 E: The repository 'http://archive.ubuntu.com/ubuntu focal-updates InRelease' is not signed.
------
executor failed running [/bin/bash -c /rocker_scripts/install_geospatial.sh]: exit code: 100

avouacr commented 1 year ago

I would say neither. It looks more like a network issue to me: your machine can't reach the apt repositories, so the requests time out. Are you behind a proxy, maybe? An easy solution would be to build your image using a CI service. See: https://docs.github.com/en/actions/publishing-packages/publishing-docker-images
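If a proxy does turn out to be the cause, one option is to forward the host's proxy settings to the build. A minimal sketch, assuming the proxy is exposed through the usual HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables; proxy_build_args is a hypothetical helper name. Docker treats these variable names as predefined build arguments, so no ARG line is needed in the Dockerfile.

```shell
# Print one "--build-arg NAME=value" pair for each proxy variable
# that is set and non-empty in the current environment.
proxy_build_args() {
  for name in HTTP_PROXY HTTPS_PROXY NO_PROXY; do
    # printenv exits non-zero when the variable is unset; ignore that.
    value="$(printenv "$name" || true)"
    if [ -n "$value" ]; then
      printf '%s %s=%s ' --build-arg "$name" "$value"
    fi
  done
}

# Usage (left unquoted on purpose so the flags split into separate words;
# this assumes the proxy values contain no spaces):
#   docker build $(proxy_build_args) docker_pa_matching
```

When none of the variables is set, the helper prints nothing and the build command is unchanged.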

fBedecarrats commented 1 year ago

Wait: it seems that the problem is specific to my configuration and that running wsl sudo apt-get update && apt-get upgrade does the job. It seems to be building now; I'll re-open if it fails. Sorry for reopening this issue too hastily.

fBedecarrats commented 1 year ago

It ran for about 42 minutes and then failed again with connection problems. I think I'll try building from GitHub Actions directly instead of from my personal computer.

fBedecarrats commented 1 year ago

It finally worked! For the record, here is the Dockerfile and there is the Docker image. I should rename this monstrous Dockerfile Frankenstein, but at least I can reference it as a custom service and... it's alive! It's alive! (reference) Thanks a lot for your help, @avouacr!