containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

caching buildah images (base + created) using github actions/cache #2954

Open himanshumps opened 3 years ago

himanshumps commented 3 years ago

I want to cache the buildah images with GitHub actions/cache so that they are not pulled on every run. Can you please help me understand which layers should be cached? I started with ~/var/lib/containers and /var/lib/containers, but that did not help.

jobs:
  # This workflow contains a single job called "build"
  build:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@master
    - name: Set up JDK 1.8
      uses: actions/setup-java@v1
      with:
        java-version: 1.8
    - name: Install s2i tool
      run: mkdir /tmp/s2i/ && cd /tmp/s2i/ && curl -s https://api.github.com/repos/openshift/source-to-image/releases/latest | grep browser_download_url | grep linux-amd64 | cut -d '"' -f 4 | wget -qi - && tar xvf source-to-image*.gz && sudo mv s2i /usr/local/bin && rm -rf /tmp/s2i/
    - name: Login to quay.io using Buildah
      run: buildah login -u ${{ secrets.QUAY_USERNAME }} -p ${{ secrets.QUAY_PASSWORD }} quay.io
    - name: Cache local Maven repository & buildah containers
      uses: actions/cache@v2
      with:
        path: |
          ~/.m2/repository
        key: ${{ runner.os }}-maven-buildah-${{ hashFiles('**/pom.xml') }}
        restore-keys: |
          ${{ runner.os }}-maven-buildah-
    - name: Build with Maven
      run: pwd && mvn -B clean install -DskipTests -f ./vertx-starter/pom.xml
    - name: Create the Dockerfile
      run: printf "FROM openjdk:8-slim\nARG JAR_FILE=target/*.jar\nRUN mkdir -p /deployments\nCOPY \${JAR_FILE} /deployments/app.jar\nRUN chown -R 1001:0 /deployments && chmod -R 0755 /deployments\nEXPOSE 8080 8443\n USER 1001\nENTRYPOINT [\"java\",\"-jar\",\"/deployments/app.jar\"]" > Dockerfile && cat Dockerfile
    - name: Create image using buildah
      run: buildah bud --layers --log-level debug --build-arg JAR_FILE="vertx-starter/target/app.jar" -t quay.io/himanshumps/vertx_demo .

TomSweeneyRedHat commented 3 years ago

@himanshumps I'm not exactly clear on what you're trying to do, or on what exactly github actions/cache is caching between runs. So, just as a quick stab: can you add the --pull option to your buildah bud command? That should pull the image from the repository only if it or some of its layers are newer on the registry than the local ones.

@cevich, any thoughts?

cevich commented 3 years ago

> @cevich, any thoughts?

actions/cache is new to me, I've only ever dabbled with the archive action. Let me stare at your YAML for a few minutes and see if I can wrap my brain around it...

cevich commented 3 years ago

...okay, I understand what you're trying to do here. Yeah, I don't think caching the underlying container storage is a good idea; it is not, and never was, intended to be transported out of its original context. Regardless, I attempted something very similar in a CI setup about a year ago and had lots of trouble getting it to work reliably (even when setting --timestamp 0). The "proper" way to transport image layers is to encode/decode them using a standardized format (like docker-archive).

The setup I tried worked like this: before the build, after restoring the cache, I would buildah pull docker-archive:... (conditional on the file existing). After the build, I would buildah push $IMAGE docker-archive:... into a file, then cache the file. However, the complexity of this setup was ultimately its downfall (i.e. the scripts and caching mechanisms kept breaking for one reason or another). You're welcome to try this approach, though; perhaps a year of buildah development and a different CI system will help it succeed.
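
A minimal sketch of the archive round trip described above, assuming a POSIX shell. The ARCHIVE path, the IMAGE name, and the function names are hypothetical placeholders rather than values from this thread, and the CI cache step that stores and restores the file is not shown.

```shell
# Hypothetical locations; adjust to whatever path the CI cache step restores.
ARCHIVE="${ARCHIVE:-$HOME/.cache/buildah/base-image.tar}"
IMAGE="${IMAGE:-docker.io/library/openjdk:8-slim}"

# Before the build: load the image from the archive, but only if the
# cache step actually restored a file.
restore_image() {
  if [ -f "$ARCHIVE" ]; then
    buildah pull "docker-archive:$ARCHIVE"
  else
    echo "no cached archive at $ARCHIVE, the build will pull from the registry"
  fi
}

# After the build: export the image into a single file for the cache step.
save_image() {
  mkdir -p "$(dirname "$ARCHIVE")"
  # Start from a clean file; to my knowledge the docker-archive transport
  # does not update an archive that already exists.
  rm -f "$ARCHIVE"
  # The trailing reference records the image name inside the archive.
  buildah push "$IMAGE" "docker-archive:$ARCHIVE:$IMAGE"
}
```

In a workflow, restore_image would run right after the cache is restored and save_image right after the build, which is exactly the kind of extra conditional step the comment warns can become fragile.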

For my setup, what ended up being much simpler and more reliable was using the --volume option with buildah bud ... to mount a reusable store of cached bits from the host side into the container for use at build time. This strategy worked out much more reliably (since it's simple), as it didn't depend on lots of extra steps and conditional checks. In my case, I was dealing with a lengthy Fedora package install, so it was simple to volume-mount a cached /var/cache/dnf to speed up the overall build.
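
The volume-mount variant could be sketched roughly as follows. DNF_CACHE and build_with_dnf_cache are hypothetical names, and the host directory is assumed to be something the CI system preserves between runs.

```shell
# Hypothetical host-side directory that the CI system keeps between runs.
DNF_CACHE="${DNF_CACHE:-$HOME/.cache/dnf}"

# Build with the host cache mounted over /var/cache/dnf inside the build
# container, so package metadata fetched by one build is visible to the next.
# $1 = image tag, $2 = build context (defaults to the current directory).
build_with_dnf_cache() {
  mkdir -p "$DNF_CACHE"
  # On SELinux hosts a relabel option such as :z may be needed on the mount.
  buildah bud --volume "$DNF_CACHE:/var/cache/dnf" -t "$1" "${2:-.}"
}
```

Note that dnf discards downloaded packages after a successful transaction unless keepcache is enabled, so the image's dnf configuration may need that setting for the cache to hold more than metadata.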

Hope that helps.

himanshumps commented 3 years ago

That helps. I can give it a shot. The reason I am planning to do it is because the GitHub runner uses the VFS mount and buildah is a bit slow with VFS. The oci archive makes sense and can help reduce build time drastically with layers using GitHub runner.

cevich commented 3 years ago

Great. I'm curious if you're able to get it working.

webknjaz commented 3 years ago

FTR there's https://github.com/redhat-actions/buildah-build that I started using recently but it is unable to reuse the layers cache, supposedly because buildah doesn't support this...

Raboo commented 3 years ago

I'm not as interested in the yum/dnf/apt/apk cache. I'm more interested in seeing a solution to the original question, like what Docker has done with cache-from and cache-to: https://github.com/docker/build-push-action/blob/master/docs/advanced/cache.md#github-cache.

I.e. caching the layers. Let's say you have more than one job with FROM ubuntu:20.04. It makes perfect sense to cache the underlying layers in a CI pipeline. Just pretend you have 1000 different jobs on a daily basis with FROM ubuntu:20.04 or whatever FROM variants you use. I would much rather fetch the underlying layers from the GitHub Actions cache than from Docker Hub via the Internet.

Sure, it wouldn't hurt to combine the image layers with a package volume mount cache and put everything in actions/cache. But first and foremost, it would be nice not to download the exact same OS images every time you execute buildah in a GitHub Actions workflow.

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

Raboo commented 3 years ago

unstale..

webknjaz commented 3 years ago

unstale

rhatdan commented 3 years ago

What about using additional stores for this?

You could pull the images to /var/lib/shared and then share them amongst all of your builders?

rhatdan commented 3 years ago

https://www.redhat.com/sysadmin/image-stores-podman
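
The additional-store idea could look roughly like the sketch below. It assumes the shared directory is writable by whoever populates it (typically root) and that the builders use the same storage driver as the shared store; SHARED_STORE and both function names are hypothetical.

```shell
# Hypothetical shared location, matching the path suggested above.
SHARED_STORE="${SHARED_STORE:-/var/lib/shared}"

# Pull a base image into the shared store instead of the default one.
# $1 = fully qualified image name.
populate_shared_store() {
  buildah --root "$SHARED_STORE" pull "$1"
}

# Print the storage.conf fragment that exposes the shared store to builders
# as a read-only additional image store. Merge it by hand into the existing
# [storage.options] table rather than appending a second table.
storage_conf_fragment() {
  printf '[storage.options]\nadditionalimagestores = ["%s"]\n' "$SHARED_STORE"
}
```

With the fragment in place, builds should resolve base images from the shared store before going to the registry, while new layers still land in each builder's own writable store.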

cevich commented 3 years ago

WRT @Raboo's comment on Mar 15: Dan's suggestion sounds feasible. Though IIRC the cache directory needs to live at or below $GITHUB_WORKSPACE for actions/cache.

Raboo commented 3 years ago

Hmm, okay, so in short buildah uses /var/lib/shared, or perhaps even /var/lib/shared/overlay-images, to store its images and layers? And with podman you can control the image store location with the --root argument. That is good to know.

I see no mention that you are limited to $GITHUB_WORKSPACE for actions/cache; most examples are under $HOME, and the Docker team uses /tmp/.buildx-cache in their example. I think you should be able to set /var/lib/shared as a cache directory. It remains to be seen.

So I assume you can create a cache in GitHub Actions with this information, and in other pipelines as well.

One follow-up question, how would buildah handle parallel builds against a shared filesystem under /var/lib/shared?

webknjaz commented 3 years ago

unstale

flouthoc commented 3 years ago

@Raboo I think the default is "/var/lib/containers/storage" for UID 0 and "$HOME/.local/share/containers/storage" for other users. Have you tried using buildah with --root? I think it should let you replace the default path with a custom one, such as the location where the GitHub action stores its cache.
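
As a rough sketch of that suggestion: the first function reports the default storage location per user as stated above, and the second redirects storage to a directory that a cache step could pick up. STORAGE_ROOT and the function names are hypothetical, and whether actions/cache archives such a directory cleanly is not verified here.

```shell
# Hypothetical cacheable location for buildah's storage.
STORAGE_ROOT="${STORAGE_ROOT:-$HOME/.cache/buildah-storage}"

# Report the default storage root for the current user.
default_storage_root() {
  if [ "$(id -u)" -eq 0 ]; then
    echo "/var/lib/containers/storage"
  else
    echo "$HOME/.local/share/containers/storage"
  fi
}

# Build with storage redirected to STORAGE_ROOT; --root is a global option,
# so it goes before the subcommand.
# $1 = image tag, $2 = build context (defaults to the current directory).
build_with_root() {
  buildah --root "$STORAGE_ROOT" bud --layers -t "$1" "${2:-.}"
}
```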

Raboo commented 3 years ago

@flouthoc okay, thanks. I don't have the time or possibility to test this right now, as I'm not using buildah at the moment. But it clearly seems solvable in some sense.

rhatdan commented 3 years ago

I think we need to document how to use additional stores. @giuseppe @flouthoc, could you write up a blog post on using additional stores to satisfy this need?

giuseppe commented 3 years ago

Do we need to cover something that is not already in https://www.redhat.com/sysadmin/image-stores-podman, or do we need some GitHub-specific recipe?

webknjaz commented 2 years ago

> do we need to cover something that is not already in https://www.redhat.com/sysadmin/image-stores-podman or do we need some github specific recipe?

FWIW I don't think I saw the intermediate layer cache working when I tried out RH-maintained actions.

sanmai-NL commented 1 year ago

@himanshumps Can't you use the podman build --cache-from and --cache-to parameters?
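
A minimal sketch of that option, assuming a podman release recent enough to ship both flags. CACHE_REPO and build_with_remote_cache are hypothetical names; the cache location is a registry repository without a tag, kept separate from the image being built.

```shell
# Hypothetical registry repository used only for cached intermediate layers.
CACHE_REPO="${CACHE_REPO:-quay.io/example/vertx_demo-cache}"

# Build with layer caching backed by a registry: --cache-from looks up
# intermediate layers there, --cache-to pushes newly built ones back.
# $1 = image tag, $2 = build context (defaults to the current directory).
build_with_remote_cache() {
  podman build --layers \
    --cache-from "$CACHE_REPO" \
    --cache-to "$CACHE_REPO" \
    -t "$1" "${2:-.}"
}
```

This moves the cache to the registry rather than actions/cache, so the runner needs push access to the cache repository.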

libinglong commented 1 year ago

I use buildah in a container with /var/lib/containers as the cached directory, and it works for me.

webknjaz commented 1 year ago

Unstale

rhatdan commented 1 year ago

@flouthoc PTAL

penn5 commented 1 year ago

The --root parameter seems to solve this? Just needs more prominent documentation...

rhatdan commented 1 year ago

Care to open a PR?

penn5 commented 1 year ago

I'm afraid I'm busy debugging your selinux policies now :/ (probably an issue in my setup)

penn5 commented 1 year ago

It looks like the --root approach can create some very large and sprawling directories (5GB for a very simple build), so it is not appropriate for this use case, but I'll find another way.