opendatahub-io / notebooks

Notebook images for ODH
Apache License 2.0

ci: start podman.socket and pass it to trivy to avoid unnecessary pulls #605

Closed jiridanek closed 1 week ago

jiridanek commented 1 week ago

Followup on

This does not help appreciably with the scan runtime, but still it's an improvement in efficiency.

Description

When I try running

$ podman run -v /run/user/1000/podman/podman.sock:/var/run/podman/podman.sock --rm -it docker.io/aquasec/trivy image --podman-host /var/run/podman/podman.sock --image-src podman ghcr.io/jiridanek/notebooks/workbench-images:jupyter-datascience-anaconda-python-3.8-jd_speed_up_trivy_17c0717f5e8cfc1f464c2ccc5d9850844f4c1019

then trivy scans the local image that I previously downloaded with podman pull.

If I leave out the podman socket,

$ podman run --rm -it docker.io/aquasec/trivy image ghcr.io/jiridanek/notebooks/workbench-images:jupyter-datascience-anaconda-python-3.8-jd_speed_up_trivy_17c0717f5e8cfc1f464c2ccc5d9850844f4c1019

then trivy has to download the image anew.

In theory, providing the podman socket should let Trivy run faster because it skips the download. In my testing, though, the scan runtime did not change appreciably.
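The two invocations above differ only in the socket mount and the two Trivy flags that go with it. A minimal sketch of that difference follows; `trivy_scan_args` is a hypothetical helper made up for illustration, not something in this repository, and it only prints the arguments rather than running anything.

```shell
# Hypothetical helper: print, one per line, the podman arguments for a Trivy scan.
# $1 = image reference, $2 = host podman socket path (empty to scan without it).
trivy_scan_args() (
  image=$1
  sock=$2
  set -- run --rm
  if [ -n "$sock" ]; then
    # Mount the host socket and tell Trivy to read the image through it.
    set -- "$@" -v "$sock:/var/run/podman/podman.sock" \
      docker.io/aquasec/trivy image \
      --podman-host /var/run/podman/podman.sock --image-src podman "$image"
  else
    # No socket mounted: Trivy has to fetch the image from the registry itself.
    set -- "$@" docker.io/aquasec/trivy image "$image"
  fi
  printf '%s\n' "$@"
)
```

On a systemd host, `systemctl --user start podman.socket` typically creates the rootless socket at `$XDG_RUNTIME_DIR/podman/podman.sock`. Assuming none of the values contain whitespace, the helper could then be used as `podman $(trivy_scan_args "$IMAGE_NAME" "$XDG_RUNTIME_DIR/podman/podman.sock")`.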

How Has This Been Tested?

Here it is running: https://github.com/jiridanek/notebooks/actions/runs/9760779225/job/26941336383

Merge criteria:

caponetto commented 1 week ago

Not sure why I didn't see this error on my fork during my tests, but this PR probably resolves it.

jiridanek commented 1 week ago

I guess I know. The trivy command is

   podman run --rm \
      -v $REPORT_FOLDER:/report \
      docker.io/aquasec/trivy:$TRIVY_VERSION \
        image \
        --scanners vuln,secret \
        --exit-code 0 --timeout 30m \
        --severity CRITICAL,HIGH \
        --format template --template "@/report/$REPORT_TEMPLATE" -o /report/$REPORT_FILE \
        $IMAGE_NAME

So these three errors appear because none of the podman, docker, or containerd sockets is mounted inside the trivy container

    * docker error: unable to inspect the image (ghcr.io/opendatahub-io/notebooks/workbench-images:base-ubi8-python-3.8-main_d7b743849145bdcdaf1ae0c211cdc581f63d80c6): Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
    * containerd error: containerd socket not found: /run/containerd/containerd.sock
    * podman error: unable to initialize Podman client: no podman socket found: stat podman/podman.sock: no such file or directory

and this one... actually, this one is a bit of a puzzle. I agree that my PR should solve it, because trivy now uses the podman socket to get at the image that's already present in the podman image store. But I'm not sure why trivy cannot get at the image in the ghcr.io registry. It should have the packages: write permission and the workflow logs into ghcr.io registry

* remote error: GET https://ghcr.io/token?scope=repository%3Aopendatahub-io%2Fnotebooks%2Fworkbench-images%3Apull&service=ghcr.io: UNAUTHORIZED: authentication required

jiridanek commented 1 week ago

It should have the packages: write permission and the workflow logs into ghcr.io registry

Got it! Trivy runs inside its own container, so it does not see the ~/.docker/config.json on the host machine, which holds the login creds. We'd need to mount that file in, or do what I did here and mount in the podman socket.
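For comparison, the "mount that file in" alternative might look roughly like the sketch below. `trivy_auth_opts` and the `/docker-config` mount point are hypothetical names chosen for illustration. The sketch assumes that Trivy's registry client honors the `DOCKER_CONFIG` environment variable the way the Docker CLI does, which is worth verifying against the Trivy version in use.

```shell
# Hypothetical helper: print, one per line, the extra podman options that expose
# the host's registry credentials to the Trivy container.
# $1 = host directory containing config.json (defaults to ~/.docker).
trivy_auth_opts() (
  config_dir=${1:-"$HOME/.docker"}
  # Mount the directory read-only and point DOCKER_CONFIG at it, so the lookup
  # does not depend on the home directory of the user inside the container.
  printf '%s\n' -v "$config_dir:/docker-config:ro" -e DOCKER_CONFIG=/docker-config
)
```

These options would go between `podman run --rm` and the image name, for example `podman run --rm $(trivy_auth_opts) docker.io/aquasec/trivy:$TRIVY_VERSION image $IMAGE_NAME`, assuming the path contains no whitespace. Mounting the podman socket avoids handing the credentials to the container at all, which may be why it is the simpler fix here.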

caponetto commented 1 week ago

It makes sense! The weird thing is that I'd expect to see the same errors on my fork. Anyway, let's merge it to fix tomorrow's report 😄

jiridanek commented 1 week ago

Too late: one approver called in sick, and the other one is mostly an approver of last resort when putting together a release

https://github.com/opendatahub-io/notebooks/blob/d7b743849145bdcdaf1ae0c211cdc581f63d80c6/OWNERS#L1-L3

unless.... we just slap the approved label on this in the github webui, bypassing the prow process

openshift-ci[bot] commented 1 week ago

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by: caponetto, jstourac

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- **[OWNERS](https://github.com/opendatahub-io/notebooks/blob/main/OWNERS)**

Approvers can indicate their approval by writing `/approve` in a comment
Approvers can cancel approval by writing `/approve cancel` in a comment

jiridanek commented 1 week ago

The weird thing is that I'd expect to see the same errors on my fork.

It's not weird. The opendatahub-io ghcr.io repository is private and needs auth to access, because that's the org policy set up in GitHub. The default is for these registries to be public, so since you did not configure it on your fork, it defaulted to public.
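One way to check which case a given repository falls into is to repeat the anonymous token request from the error message. The sketch below only builds that URL; `ghcr_token_url` is a hypothetical helper, and the URL shape is copied from the log line above rather than from any registry documentation.

```shell
# Hypothetical helper: print the ghcr.io token URL for an anonymous pull of $1,
# a repository path such as "owner/name/image" (without the tag).
ghcr_token_url() (
  # Percent-encode the two reserved characters that occur in the scope value.
  scope=$(printf 'repository:%s:pull' "$1" | sed -e 's/:/%3A/g' -e 's#/#%2F#g')
  printf 'https://ghcr.io/token?scope=%s&service=ghcr.io\n' "$scope"
)
```

Fetching the result without credentials, e.g. `curl -s -o /dev/null -w '%{http_code}\n' "$(ghcr_token_url opendatahub-io/notebooks/workbench-images)"`, should be rejected for a private repository, as in the log above, while a public fork would presumably be granted a token.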

caponetto commented 1 week ago

Mystery solved. I didn't know the registry was private. Thanks for the info.