openedx / wg-devops

Issue repository for the DevOps Working Group

When building image, use layer cache from pulled image #19

Closed kdmccormick closed 4 months ago

kdmccormick commented 2 years ago

Background

When I run:

tutor images pull openedx

it pulls openedx:VERSION from Docker Hub. Docker reuses any image layers that are already in the cache, and the pull is a no-op if I already have the latest image downloaded. Great.

When I run:

tutor dev dc build lms

it builds openedx-dev:VERSION, but it will not reuse the image layers pulled from openedx:VERSION, even though openedx shares several dozen layers with openedx-dev! It will start right from Step 1 at the top of the Dockerfile. That makes the build take way more time and bandwidth than it needs to.
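To check the claim that the two images share layers, one could compare their layer lists. The following is a minimal sketch, not part of Tutor: the helper names `shared_layer_prefix` and `image_layers` are hypothetical, and it assumes the `docker` CLI is installed and both images exist locally. It only shows which layers are identical on disk; whether the builder actually reuses them as cache is a separate question.

```python
import json
import subprocess


def shared_layer_prefix(layers_a, layers_b):
    """Return the leading layers that two images have in common, in order."""
    shared = []
    for a, b in zip(layers_a, layers_b):
        if a != b:
            break
        shared.append(a)
    return shared


def image_layers(image):
    """Read an image's layer digests via `docker image inspect`.

    Assumes a local Docker daemon and that `image` has already been pulled or built.
    """
    out = subprocess.run(
        ["docker", "image", "inspect", image],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    # `docker image inspect` prints a JSON array with one object per image.
    return json.loads(out)[0]["RootFS"]["Layers"]
```

Calling `shared_layer_prefix(image_layers(prod), image_layers(dev))`, where `prod` and `dev` are the tags of the openedx and openedx-dev images on your machine, would list the layers the two images have in common.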

Tasks

TBD

kdmccormick commented 2 years ago

@regisb Your recent fix reminded me of this issue. No rush, but I'm curious if you have thoughts on this.

regisb commented 2 years ago

Would adding the following lines to the docker-compose-dev.yml template help?

build:
  ...
  cache_from:
    - {{ DOCKER_IMAGE_OPENEDX }}

Can you please check whether this improves the situation? (See the docker-compose cache_from reference.)

kdmccormick commented 2 years ago

I need to put this down for today, but here's what I've found so far...

System

- Ubuntu 20.04 on AMD64
- docker==20.10.21
- docker-compose==1.25.0
- tutor==14.2.2, installed via git, all plugins disabled

Setup

There were a few steps I had to take before Compose would even consider using the openedx image as a cache:

Test


# Clear out most of your Docker cache for accurate results
tutor local start -d lms  # ensure that your openedx image is in use
docker system prune -af  # remove all unused images, cache layers, etc
tutor local stop

# Ensure openedx image is up-to-date
tutor images pull openedx

# Try building the openedx-dev image
tutor dev dc build lms

My result

Here's the first portion of the openedx-dev build:

(venv-tutor) ~/openedx/tutor 🍀 COMPOSE_DOCKER_CLI_BUILD=1 tutor dev dc build lms
docker-compose -f /home/kyle/openedx/tutor-root/env/local/docker-compose.yml -f /home/kyle/openedx/tutor-root/env/dev/docker-compose.yml -f /home/kyle/openedx/tutor-root/env/dev/docker-compose.tmp.yml --project-name tutor_dev build lms
WARNING: Native build is an experimental feature and could change at any time
Building lms
[+] Building 265.5s (39/69)                                                                                                                                                 
 => [internal] load build definition from Dockerfile                                                                                                                   0.0s
 => => transferring dockerfile: 10.68kB                                                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                                                      0.0s
 => => transferring context: 2B                                                                                                                                        0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                                                                                        0.6s
 => [auth] library/ubuntu:pull token for registry-1.docker.io                                                                                                          0.0s
 => importing cache manifest from docker.io/overhangio/openedx:14.2.2                                                                                                  0.0s
 => [internal] load build context                                                                                                                                      0.0s
 => => transferring context: 13.53kB                                                                                                                                   0.0s
 => [minimal 1/2] FROM docker.io/library/ubuntu:20.04@sha256:450e066588f42ebe1551f3b1a535034b6aa46cd936fe7f2c6b0d72997ec61dbd                                          0.0s
 => => resolve docker.io/library/ubuntu:20.04@sha256:450e066588f42ebe1551f3b1a535034b6aa46cd936fe7f2c6b0d72997ec61dbd                                                  0.0s
 => => sha256:450e066588f42ebe1551f3b1a535034b6aa46cd936fe7f2c6b0d72997ec61dbd 1.42kB / 1.42kB                                                                         0.0s
 => => sha256:b25ef49a40b7797937d0d23eca3b0a41701af6757afca23d504d50826f0b37ce 529B / 529B                                                                             0.0s
 => => sha256:680e5dfb52c74a1fbc99c2922c8e25b5736e6cd1a3d9430890d52a4f8f44087a 1.46kB / 1.46kB                                                                         0.0s
 => [minimal 2/2] RUN apt update &&     apt install -y build-essential curl git language-pack-en                                                                      40.8s
 => [production  1/28] RUN apt update &&     apt install -y gettext gfortran graphviz graphviz-dev libffi-dev libfreetype6-dev libgeos-dev libjpeg8-dev liblapack-de  26.6s
 => [locales 1/1] RUN cd /tmp     && curl -L -o openedx-i18n.tar.gz https://github.com/openedx/openedx-i18n/archive/open-release/nutmeg.2.tar.gz     && tar xzf /tmp/  3.4s
 => [dockerize 1/1] RUN dockerize_url="https://github.com/powerman/dockerize/releases/download/v0.16.0/dockerize-linux-$(uname -m | sed 's@aarch@arm@')"     && echo   1.6s
 => [code 1/7] RUN mkdir -p /openedx/edx-platform &&     git clone https://github.com/openedx/edx-platform.git --branch open-release/nutmeg.2 --depth 1 /openedx/edx  10.5s
 => [python 1/4] RUN apt update &&     apt install -y libssl-dev zlib1g-dev libbz2-dev     libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev libncursesw  46.9s
 => [code 2/7] WORKDIR /openedx/edx-platform                                                                                                                           0.0s 
 => [code 3/7] RUN git config --global user.email "tutor@overhang.io"   && git config --global user.name "Tutor"                                                       0.3s
 => [code 4/7] RUN curl -fsSL https://github.com/open-craft/edx-platform/commit/3d54f284f82b61e693ad652d8d6e46a226fcb36d.patch | git am                                0.7s
 => [code 5/7] RUN curl -fsSL https://github.com/overhangio/edx-platform/commit/3f0f9eed42.patch | git am                                                              0.5s
 => [code 6/7] RUN curl -fsSL https://github.com/overhangio/edx-platform/commit/e16f8c0986.patch | git am                                                              0.5s
 => [code 7/7] RUN curl -fsSL https://github.com/overhangio/edx-platform/commit/527b4993ae.patch | git am                                                              0.7s
 => [production  2/28] RUN if [ "1000" = 0 ]; then echo "app user may not be root" && false; fi                                                                        0.6s
 => [production  3/28] RUN useradd --home-dir /openedx --create-home --shell /bin/bash --uid 1000 app                                                                  0.6s
 => [production  4/28] COPY --from=dockerize /usr/local/bin/dockerize /usr/local/bin/dockerize                                                                         0.1s
 => [production  5/28] COPY --chown=app:app --from=code /openedx/edx-platform /openedx/edx-platform                                                                    1.5s
 => [production  6/28] COPY --chown=app:app --from=locales /openedx/locale /openedx/locale                                                                             0.2s
 => [python 2/4] RUN git clone https://github.com/pyenv/pyenv /opt/pyenv --branch v2.2.2 --depth 1                                                                     1.1s
 => [python 3/4] RUN /opt/pyenv/bin/pyenv install 3.8.12                                                                                                              82.5s 
 => [python 4/4] RUN /opt/pyenv/versions/3.8.12/bin/python -m venv /openedx/venv                                                                                       2.7s 
 => [nodejs-requirements 1/6] RUN pip install nodeenv==1.6.0                                                                                                           1.1s 
 => [python-requirements  1/10] RUN apt update && apt install -y software-properties-common libmysqlclient-dev libxmlsec1-dev libgeos-dev                             18.2s 
 => [production  7/28] COPY --chown=app:app --from=python /opt/pyenv /opt/pyenv                                                                                        1.0s 
 => [nodejs-requirements 2/6] RUN nodeenv /openedx/nodeenv --node=12.13.0 --prebuilt                                                                                   3.5s 
 => [nodejs-requirements 3/6] COPY --from=code /openedx/edx-platform/package.json /openedx/edx-platform/package.json                                                   0.0s 
 => [nodejs-requirements 4/6] COPY --from=code /openedx/edx-platform/package-lock.json /openedx/edx-platform/package-lock.json                                         0.0s 
 => [nodejs-requirements 5/6] WORKDIR /openedx/edx-platform                                                                                                            0.0s
 => [nodejs-requirements 6/6] RUN npm clean-install --verbose --registry=https://registry.npmjs.org/                                                                  16.2s
 => [python-requirements  2/10] COPY --from=code /openedx/edx-platform /openedx/edx-platform                                                                           1.3s
 => [python-requirements  3/10] WORKDIR /openedx/edx-platform                                                                                                          0.1s
 => [python-requirements  4/10] RUN pip install setuptools==62.1.0 pip==22.0.4 wheel==0.37.1                                                                           5.1s
# ....etc

Interestingly, I see the line:

 => importing cache manifest from docker.io/overhangio/openedx:14.2.2

Unfortunately, the build still took a long time, and involved a lot of scrolling text that the fancy new buildx CLI hides. So I don't think any of the cached image layers were used.

If I run the same command again, I get:

(venv-tutor) ~/openedx/tutor 🍀 COMPOSE_DOCKER_CLI_BUILD=1 tutor dev dc build lms
docker-compose -f /home/kyle/openedx/tutor-root/env/local/docker-compose.yml -f /home/kyle/openedx/tutor-root/env/dev/docker-compose.yml -f /home/kyle/openedx/tutor-root/env/dev/docker-compose.tmp.yml --project-name tutor_dev build lms
WARNING: Native build is an experimental feature and could change at any time
Building lms
[+] Building 1.7s (39/69)                                                                                                                                                   
 => [internal] load build definition from Dockerfile                                                                                                                   0.0s
 => => transferring dockerfile: 32B                                                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                                                      0.0s
 => => transferring context: 2B                                                                                                                                        0.0s
 => [internal] load metadata for docker.io/library/ubuntu:20.04                                                                                                        0.3s
 => [auth] library/ubuntu:pull token for registry-1.docker.io                                                                                                          0.0s
 => importing cache manifest from docker.io/overhangio/openedx:14.2.2                                                                                                  0.0s
 => [internal] load build context                                                                                                                                      0.0s
 => => transferring context: 750B                                                                                                                                      0.0s
 => [minimal 1/2] FROM docker.io/library/ubuntu:20.04@sha256:450e066588f42ebe1551f3b1a535034b6aa46cd936fe7f2c6b0d72997ec61dbd                                          0.0s
 => CACHED [minimal 2/2] RUN apt update &&     apt install -y build-essential curl git language-pack-en                                                                0.0s
 => CACHED [production  1/28] RUN apt update &&     apt install -y gettext gfortran graphviz graphviz-dev libffi-dev libfreetype6-dev libgeos-dev libjpeg8-dev liblap  0.0s
 => CACHED [production  2/28] RUN if [ "1000" = 0 ]; then echo "app user may not be root" && false; fi                                                                 0.0s
 => CACHED [production  3/28] RUN useradd --home-dir /openedx --create-home --shell /bin/bash --uid 1000 app                                                           0.0s
 => CACHED [dockerize 1/1] RUN dockerize_url="https://github.com/powerman/dockerize/releases/download/v0.16.0/dockerize-linux-$(uname -m | sed 's@aarch@arm@')"     &  0.0s
 => CACHED [production  4/28] COPY --from=dockerize /usr/local/bin/dockerize /usr/local/bin/dockerize                                                                  0.0s
 => CACHED [code 1/7] RUN mkdir -p /openedx/edx-platform &&     git clone https://github.com/openedx/edx-platform.git --branch open-release/nutmeg.2 --depth 1 /opene  0.0s
 => CACHED [code 2/7] WORKDIR /openedx/edx-platform                                                                                                                    0.0s
 => CACHED [code 3/7] RUN git config --global user.email "tutor@overhang.io"   && git config --global user.name "Tutor"                                                0.0s
 => CACHED [code 4/7] RUN curl -fsSL https://github.com/open-craft/edx-platform/commit/3d54f284f82b61e693ad652d8d6e46a226fcb36d.patch | git am                         0.0s
# ...etc

Notice that this second invocation shows CACHED for every layer, whereas the first invocation didn't. That implies that the first openedx-dev build still isn't using the cached openedx layers, whereas subsequent openedx-dev builds do use the cached openedx-dev layers 😕
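Rather than eyeballing the output, the comparison between the two runs could be made with a small script. This is only a sketch under stated assumptions: `count_steps` is a hypothetical helper, and it assumes the build output has been saved as plain text in the format pasted above. `FROM` steps are skipped because both runs report them without a CACHED marker.

```python
import re

# Matches step lines such as " => CACHED [code 1/7] RUN ..." or " => [code 1/7] RUN ...".
# "[internal] ..." and "[auth] ..." lines carry no "n/m" counter, so they never match.
STEP_RE = re.compile(r"^\s*=> (CACHED )?\[[\w-]+\s+\d+/\d+\] (\w+)")


def count_steps(log_text):
    """Return (cached, executed) counts for Dockerfile steps in a build log."""
    cached = executed = 0
    for line in log_text.splitlines():
        match = STEP_RE.match(line)
        if match is None or match.group(2) == "FROM":
            continue
        if match.group(1):
            cached += 1
        else:
            executed += 1
    return cached, executed
```

On the first log above this should report zero cached steps, while the second log should report zero executed steps among the lines shown.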

ARMBouhali commented 1 year ago

One idea I had is to explicitly define an image tree in Tutor, which would allow rebuilding a given image from a chosen stage without relying entirely on Docker.

I am interested in digging into this matter when I have some time, too.

regisb commented 8 months ago

Should we keep this issue open? After all, we are really not supposed to build images with tutor dev dc build ... anymore.

kdmccormick commented 4 months ago

Yes, I believe this is fixed.