operator-framework / operator-registry

Operator Registry runs in a Kubernetes or OpenShift cluster to provide operator catalog data to Operator Lifecycle Manager.
Apache License 2.0

opm index prune fails on Mac opening file for writing #947

Open mhrivnak opened 2 years ago

mhrivnak commented 2 years ago

Versions

opm version:

Version: version.Version{OpmVersion:"b576eef43", GitCommit:"b576eef43e0d4c3745435a068b8bf42ec347eda3", BuildDate:"2022-04-11T10:00:51Z", GoOs:"darwin", GoArch:"amd64"}

podman version:

Client:       Podman Engine
Version:      4.0.3
API Version:  4.0.3
Go Version:   go1.18
Built:        Fri Apr  1 11:28:59 2022
OS/Arch:      darwin/arm64

Server:       Podman Engine
Version:      4.0.3
API Version:  4.0.3
Go Version:   go1.18
Built:        Fri Apr  1 14:22:39 2022
OS/Arch:      linux/arm64

Problem

I ran the following command to prune an index. It works on a Linux machine but fails on a Mac:

opm index prune --debug --from-index "registry.redhat.io/redhat/redhat-operator-index:v4.10" --packages 'advanced-cluster-management' --tag myindex:v4.10

It exits with code 125 and the following output:

WARN[0000] DEPRECATION NOTICE:
Sqlite-based catalogs and their related subcommands are deprecated. Support for
them will be removed in a future release. Please migrate your catalog workflows
to the new file-based catalog format. 
INFO[0000] pruning the index                             packages="[advanced-cluster-management]"
INFO[0000] Pulling previous image registry.redhat.io/redhat/redhat-operator-index:v4.10 to get metadata  packages="[advanced-cluster-management]"
INFO[0000] running /opt/homebrew/bin/podman pull registry.redhat.io/redhat/redhat-operator-index:v4.10  packages="[advanced-cluster-management]"
INFO[0002] running /opt/homebrew/bin/podman pull registry.redhat.io/redhat/redhat-operator-index:v4.10  packages="[advanced-cluster-management]"
INFO[0003] Getting label data from previous image        packages="[advanced-cluster-management]"
INFO[0003] running podman inspect                        packages="[advanced-cluster-management]"
DEBU[0003] [podman inspect registry.redhat.io/redhat/redhat-operator-index:v4.10]  packages="[advanced-cluster-management]"
INFO[0004] running podman create                         packages="[advanced-cluster-management]"
DEBU[0004] [podman create registry.redhat.io/redhat/redhat-operator-index:v4.10 ]  packages="[advanced-cluster-management]"
INFO[0004] running podman cp                             packages="[advanced-cluster-management]"
DEBU[0004] [podman cp 78bb5891bdcffae55004b2ebb0734b8319f314e94a6b72390b419ef3a5a361cc:/. ./index_tmp_377180214]  packages="[advanced-cluster-management]"
ERRO[0005] Error: 2 errors occurred:
    * error copying to host: copier: put: error creating "/Users/mhrivnak/index_tmp_377180214/root/.bash_logout": copier: put: error opening file "/Users/mhrivnak/index_tmp_377180214/root/.bash_logout" for writing: open /Users/mhrivnak/index_tmp_377180214/root/.bash_logout: permission denied
    * error copying from container: io: read/write on closed pipe  packages="[advanced-cluster-management]"
Error: error copying container directory Error: 2 errors occurred:
    * error copying to host: copier: put: error creating "/Users/mhrivnak/index_tmp_377180214/root/.bash_logout": copier: put: error opening file "/Users/mhrivnak/index_tmp_377180214/root/.bash_logout" for writing: open /Users/mhrivnak/index_tmp_377180214/root/.bash_logout: permission denied
    * error copying from container: io: read/write on closed pipe
: exit status 125
Usage:
  opm index prune [flags]

Flags:
  -i, --binary-image opm        container image for on-image opm command
  -c, --container-tool string   tool to interact with container images (save, build, etc.). One of: [docker, podman] (default "podman")
  -f, --from-index string       index to prune
      --generate                if enabled, just creates the dockerfile and saves it to local disk
  -h, --help                    help for prune
  -d, --out-dockerfile string   if generating the dockerfile, this flag is used to (optionally) specify a dockerfile name
  -p, --packages strings        comma separated list of packages to keep
      --permissive              allow registry load errors
  -t, --tag string              custom tag for container image being built

Global Flags:
      --skip-tls   skip TLS certificate verification for container image registries while pulling bundles or index

mhrivnak commented 2 years ago

In the files copied out of the container image by the podman cp command, I see the /root directory has these permissions:

$ ls -ld delme/root
dr-xr-x---. 3 mhrivnak mhrivnak 4096 Feb 24 06:08 delme/root

Note the lack of write permission. On Linux it doesn't seem to matter; podman can create the files within that directory anyway. On macOS, however, it fails to create a file within that directory, presumably because of this issue. I don't know why the behavior differs.
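The effect of the missing write bit can be seen without podman. The following is a minimal sketch, not taken from opm or podman: the function name `demo_readonly_dir` is made up for illustration. It creates a directory with the same mode as `delme/root` (`dr-xr-x---`) and tries to create a file inside it. An unprivileged user gets "permission denied" even when owning the directory, while root bypasses the check. Whether that difference is what separates the Linux and Mac runs above is not confirmed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: shows that creating a file in a directory without the
# write bit fails for a non-root owner and succeeds for root.
demo_readonly_dir() {
  local dir
  dir="$(mktemp -d)"
  mkdir "$dir/root"
  chmod 0550 "$dir/root"            # same mode as delme/root: dr-xr-x---
  if touch "$dir/root/.bash_logout" 2>/dev/null; then
    echo "created"                  # expected when running as root
  else
    echo "permission denied"        # expected for an unprivileged user
  fi
  chmod 0750 "$dir/root"            # restore the write bit so cleanup works
  rm -rf "$dir"
}
```

Calling `demo_readonly_dir` as a regular user should print "permission denied", which matches the error reported for `/root/.bash_logout`.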

mhrivnak commented 2 years ago

Here is a reproducer using only the podman commands that get run by opm:

#!/usr/bin/env bash

podman pull registry.redhat.io/redhat/redhat-operator-index:v4.10

CONTAINERID="$(podman create registry.redhat.io/redhat/redhat-operator-index:v4.10)"

echo "--> Created new container $CONTAINERID"

OUTDIR="$(mktemp -d ./bug.XXXXXX)"

echo "--> Created temp dir $OUTDIR"

podman cp --log-level=trace "$CONTAINERID:/." "$OUTDIR/"

mhrivnak commented 2 years ago

This bug could be avoided if the Unpack behavior copied out only the database file from the image, rather than the entire contents of the container image's filesystem.

That would also be more efficient. For example, with registry.redhat.io/redhat/redhat-operator-index:v4.10, unpack currently writes 688MB to disk, whereas index.db is only 75MB.

Would it break anything if we changed opm to copy out only the index file?
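As a rough illustration of the idea, using the same podman commands as the reproducer, the copy could target a single file instead of `/.`. This is only a sketch under assumptions: the function name `copy_index_db` and the `CONTAINER_TOOL` variable are hypothetical, and the default path `/database/index.db` is an assumption about where the image keeps its database, not something confirmed in this thread. It is not how opm implements Unpack.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: copy only the database file out of an index image.
# ASSUMPTION: the database sits at /database/index.db unless a third
# argument overrides it. CONTAINER_TOOL defaults to podman.
copy_index_db() {
  local image="$1" outdir="$2"
  local db_path="${3:-/database/index.db}"
  local tool="${CONTAINER_TOOL:-podman}"
  local cid rc

  # Create (but do not start) a container so its filesystem can be read.
  cid="$("$tool" create "$image")" || return 1

  # Copy the single file rather than the whole root filesystem.
  "$tool" cp "$cid:$db_path" "$outdir/"
  rc=$?

  # Remove the temporary container regardless of the copy result.
  "$tool" rm "$cid" >/dev/null
  return "$rc"
}
```

A call might look like `copy_index_db registry.redhat.io/redhat/redhat-operator-index:v4.10 "$OUTDIR"`. Because only one file is written into a directory the caller created, the read-only `/root` directory from the image never reaches the host.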

exdx commented 2 years ago

It shouldn't. I attempted to do exactly that in #800 but couldn't figure out why the tests were failing. It ended up deprioritized, but I or someone else could pick it back up.

exdx commented 2 years ago

Looks like this was fixed upstream in podman, which is great. I suppose we will still have this issue with docker-based opm prune commands though.

exdx commented 2 years ago

Another option is to separate the tool used for pulling and unpacking from the tool used for building in the opm prune command. This was done in opm index add to help alleviate some issues.

The proposed solution would be to add a flag such as --pull-tool, so that unpacking can be done via containerd while building is done with podman.

timflannagan commented 2 years ago

@joelanford @grokspawn Any idea on whether this is still an issue with recent registry releases?

MansM commented 2 years ago

> @joelanford @grokspawn Any idea on whether this is still an issue with recent registry releases?

Yes, it still happens with podman 4.2.1 and opm Version: version.Version{OpmVersion:"5cfc4d643", GitCommit:"5cfc4d643f5fead6b02aa40a2a661b0fa64a2958", BuildDate:"2022-08-25T05:31:05Z", GoOs:"darwin", GoArch:"amd64"} (4.10.34).