Open cc83851 opened 3 years ago
For me this fails with podman and docker as well.
$ opm index add --build-tool docker --bundles quay.io/erdii/empty-catalogsource-bundle:0.0.1 --tag quay.io/erdii/empty-catalogsource-index:0.0.1
INFO[0000] building the index bundles="[quay.io/erdii/empty-catalogsource-bundle:0.0.1]"
INFO[0001] Could not find optional dependencies file dir=bundle_tmp419257694 file=bundle_tmp419257694/metadata load=annotations
ERRO[0001] permissive mode disabled bundles="[quay.io/erdii/empty-catalogsource-bundle:0.0.1]" error="unable to read directory bundle_tmp419257694/manifests: readdirent bundle_tmp419257694/manifests: not a directory"
Error: unable to read directory bundle_tmp419257694/manifests: readdirent bundle_tmp419257694/manifests: not a directory
Usage:
opm index add [flags]
...
opm index add --build-tool podman --bundles quay.io/erdii/empty-catalogsource-bundle:0.0.1 --tag quay.io/erdii/empty-catalogsource-index:0.0.1
INFO[0000] building the index bundles="[quay.io/erdii/empty-catalogsource-bundle:0.0.1]"
INFO[0001] Could not find optional dependencies file dir=bundle_tmp881236749 file=bundle_tmp881236749/metadata load=annotations
ERRO[0001] permissive mode disabled bundles="[quay.io/erdii/empty-catalogsource-bundle:0.0.1]" error="unable to read directory bundle_tmp881236749/manifests: readdirent bundle_tmp881236749/manifests: not a directory"
Error: unable to read directory bundle_tmp881236749/manifests: readdirent bundle_tmp881236749/manifests: not a directory
Usage:
opm index add [flags]
...
The bundle image is available on quay though:
$ podman pull quay.io/erdii/empty-catalogsource-bundle:0.0.1Trying to pull quay.io/erdii/empty-catalogsource-bundle:0.0.1...
Getting image source signatures
Copying blob 726b914825f6 skipped: already exists
Copying blob 7117aec6093a [--------------------------------------] 0.0b / 0.0b
Copying config 30460cf837 done
Writing manifest to image destination
Storing signatures
30460cf83718b125f17e167758c1136a39bad346f3ca9555f936c8745f766ef7
Edit: forgot this:
$ opm version
Version: version.Version{OpmVersion:"v1.17.1", GitCommit:"42d8e42", BuildDate:"2021-05-18T20:21:11Z", GoOs:"linux", GoArch:"amd64"}
Edit2: THE PROBLEM WAS ON MY SIDE! Explained in #668 maybe this is the root cause here as well? (Even though that should fail on docker as well)
Issue summary
“opm index add” works as expected with docker, but always fails with podman as it is unable to locate the bundle metadata directory. Note that the same error occurs with “opm registry add”. Version in use is opm 1.15.3.
The issue appears to be that a file that should be there is not when the open call is made, however the clean-up routine successfully deletes the file that could not be found. The most likely reason for this disparity is the current working directory is different when trying to open the directory compared when the clean-up is triggered.
As the directory/file opens are all relative opens if the current working directory is incorrect then the directory won’t be found. It is therefore possible that the directory open and the clean-up run from different places in the file tree which would explain the reason that the directory can’t be opened but the clean-up can delete the directory.
An alternative could be related to a timing issue within the file system where the directory does not appear until after the first access is made but I think that this is less likely than the “open” simply being invoked from the wrong directory when running the open.
Unfortunately our golang code knowledge is not up to following the working directory in the “from” variable as it changes as we go through the various function calls. However, it’s interesting that the “from” address used to determine the base of the temp directory is set differently when an overwrite is performed than when a new index is created. This could be a possible source of the issue (around https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/registry/populator.go#L49 ).
Whatever the reason the file that is shown to be missing is definitely there as proven by the successful delete in the clean-up routine.
Command run:
$ opm index add --bundles docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1 --tag docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/catalog-index:v0.0.1 --container-tool podman --debug
INFO[0000] building the index bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
INFO[0000] running /usr/bin/podman pull docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1 bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
INFO[0002] running podman create bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
DEBU[0002] [podman create docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1 ] bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
INFO[0002] running podman cp bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
DEBU[0002] [podman cp 6040f4da6cd5907e119fc929c22bf5c69a049edd9be5b30c7e3ca5d870abec94:/. bundle_tmp091974567] bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
INFO[0002] running podman rm bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
DEBU[0002] [podman rm 6040f4da6cd5907e119fc929c22bf5c69a049edd9be5b30c7e3ca5d870abec94] bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
DEBU[0002] unable to populate database: unable to read directory bundle_tmp091974567/metadata: open bundle_tmp091974567/metadata: no such file or directory bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]"
ERRO[0002] permissive mode disabled bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]" error="unable to read directory bundle_tmp091974567/metadata: open bundle_tmp091974567/metadata: no such file or directory"
DEBU[0002] unable to add bundle to registry bundles="[docker-cto-dev-local.artifactrepository.citigroup.net/cto-gfts-iscs-174063/memcached-bundle:v0.0.1]" error="unable to read directory bundle_tmp091974567/metadata: open bundle_tmp091974567/metadata: no such file or directory"
Error: unable to read directory bundle_tmp091974567/metadata: open bundle_tmp091974567/metadata: no such file or directory
Usage:
opm index add [flags]
Code path:
https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/cmd/opm/index/add.go#L164
*** Main “index add” code entry point
func runIndexAddCmdFunc(cmd *cobra.Command, args []string) error {
}
Calls AddToIndex in
https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/lib/indexer/indexer.go#L95
// AddToIndex is an aggregate API used to generate a registry index image with additional bundles
func (i ImageIndexer) AddToIndex(request AddToIndexRequest) error {
Calls AddToRegistry in
https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/lib/registry/registry.go#L36
func (r RegistryUpdater) AddToRegistry(request AddToRegistryRequest) error {
…
AddToRegistry calls populate (in same go file). (https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/lib/registry/registry.go#L127
)
func populate(ctx context.Context, loader registry.Load, graphLoader registry.GraphLoader, querier registry.Query, reg image.Registry, refs []image.Reference, mode registry.Mode, overwrite bool) error {
…
}
However as requests.Overwrite is False it runs the unpackImage code and falls through and calls NewDirectoryPopulator from the end of populate.
https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/registry/populator.go#L35
func NewDirectoryPopulator(loader Load, graphLoader GraphLoader, querier Query, imageDirMap map[image.Reference]string, overwriteDirMap map[string]map[image.Reference]string, overwrite bool) *DirectoryPopulator {
}
This in turn returns a DirectoryPopulator structure back to populate and “populate” executes “Populate” (with a capital “P”) https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/registry/populator.go#L46
func (i *DirectoryPopulator) Populate(mode Mode) error {
}
Calls NewImageInput in
https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/registry/imageinput.go#L24
Error at line 33, highlighted below, only place in source that this string appears and is the diagnostic error.
func NewImageInput(to image.Reference, from string) (*ImageInput, error) {
…
Code path explains all error and debug statements found in the output other than docker command debug statements.
The code that triggers the podman pull and cp commands can be found in the “unpackImage” function https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/lib/registry/registry.go#L115
This triggers a pull and unpack from https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/image/execregistry/registry.go#L37 the unpack copies the docker container into the temporary directory nominated.
We now look at the actual system calls and return status value from strace monitoring of the command and all sub-processes created.
Below is an extract from the strace logs focusing on the following system calls:
· “openat” – Opens file relative to current working directory
· “unlinkat” – Unlinnk (delete) file relative to current working directory
· “getdents64” – Read directory contents
The highlighted entries show access to files in general and the metadata directory specifically. Access to the metadata directory fails immediately but the recursive delete which includes the metadata directory succeeds.
openat(AT_FDCWD, "bundle_tmp432811490/metadata", O_RDONLY|O_CLOEXEC) = -1 ENOENT
(No such file or directory)
unlinkat(AT_FDCWD, "bundle_tmp432811490", 0) = -1 EISDIR (Is a directory)
unlinkat(AT_FDCWD, "bundle_tmp432811490", AT_REMOVEDIR) = -1 ENOTEMPTY (Director
y not empty)
openat(AT_FDCWD, ".", O_RDONLY|O_CLOEXEC) = 8
unlinkat(8, "bundle_tmp432811490", 0) = -1 EISDIR (Is a directory)
openat(8, "bundle_tmp432811490", O_RDONLY|O_CLOEXEC) = 9
getdents64(9, [{d_ino=85343721, d_off=10, d_reclen=24, d_type=DT_DIR, d_name="."
}, {d_ino=25916239, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=4
2601538, d_off=512, d_reclen=32, d_type=DT_DIR, d_name="merged"}], 8192) = 80
getdents64(9, [], 8192) = 0
unlinkat(9, "merged", 0) = -1 EISDIR (Is a directory)
openat(9, "merged", O_RDONLY|O_CLOEXEC) = 10
getdents64(10, [{d_ino=42601538, d_off=10, d_reclen=24, d_type=DT_DIR, d_name=".
"}, {d_ino=85343721, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=
59364218, d_off=15, d_reclen=32, d_type=DT_DIR, d_name="manifests"}, {d_ino=6782
4882, d_off=18, d_reclen=32, d_type=DT_DIR, d_name="metadata"}, {d_ino=77097626,
d_off=512, d_reclen=32, d_type=DT_DIR, d_name="tests"}], 8192) = 144
getdents64(10, [], 8192) = 0
unlinkat(10, "manifests", 0) = -1 EISDIR (Is a directory)
openat(10, "manifests", O_RDONLY|O_CLOEXEC) = 11
getdents64(11, [{d_ino=59364218, d_off=10, d_reclen=24, d_type=DT_DIR, d_name=".
"}, {d_ino=42601538, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=
59364222, d_off=18, d_reclen=56, d_type=DT_REG, d_name="cache.example.com_memcac
heds.yaml"}, {d_ino=60093215, d_off=29, d_reclen=96, d_type=DT_REG, d_name="memc
ached-operator-controller-manager-metrics-service_v1_service.yaml"}, {d_ino=5937
0976, d_off=41, d_reclen=104, d_type=DT_REG, d_name="memcached-operator-metrics-
reader_rbac.authorization.k8s.io_v1_clusterrole.yaml"}, {d_ino=59370977, d_off=5
12, d_reclen=72, d_type=DT_REG, d_name="memcached-operator.clusterserviceversion
.yaml"}], 8192) = 376
getdents64(11, [], 8192) = 0
unlinkat(11, "cache.example.com_memcacheds.yaml", 0) = 0
unlinkat(11, "memcached-operator-controller-manager-metrics-service_v1_service.y
aml", 0) = 0
unlinkat(11, "memcached-operator-metrics-reader_rbac.authorization.k8s.io_v1_clu
sterrole.yaml", 0) = 0
unlinkat(11, "memcached-operator.clusterserviceversion.yaml", 0) = 0
unlinkat(10, "manifests", AT_REMOVEDIR) = 0
unlinkat(10, "metadata", 0) = -1 EISDIR (Is a directory)
openat(10, "metadata", O_RDONLY|O_CLOEXEC) = 11
getdents64(11, [{d_ino=67824882, d_off=10, d_reclen=24, d_type=DT_DIR, d_name=".
"}, {d_ino=42601538, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=
67824891, d_off=512, d_reclen=40, d_type=DT_REG, d_name="annotations.yaml"}], 81
92) = 88
getdents64(11, [], 8192) = 0
unlinkat(11, "annotations.yaml", 0) = 0
unlinkat(10, "metadata", AT_REMOVEDIR) = 0
unlinkat(10, "tests", 0) = -1 EISDIR (Is a directory)
openat(10, "tests", O_RDONLY|O_CLOEXEC) = 11
getdents64(11, [{d_ino=77097626, d_off=10, d_reclen=24, d_type=DT_DIR, d_name=".
"}, {d_ino=42601538, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=
85343725, d_off=512, d_reclen=32, d_type=DT_DIR, d_name="scorecard"}], 8192) = 8
0
getdents64(11, [], 8192) = 0
unlinkat(11, "scorecard", 0) = -1 EISDIR (Is a directory)
openat(11, "scorecard", O_RDONLY|O_CLOEXEC) = 12
getdents64(12, [{d_ino=85343725, d_off=10, d_reclen=24, d_type=DT_DIR, d_name=".
"}, {d_ino=77097626, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=
85343728, d_off=512, d_reclen=32, d_type=DT_REG, d_name="config.yaml"}], 8192) =
80
getdents64(12, [], 8192) = 0
unlinkat(12, "config.yaml", 0) = 0
unlinkat(11, "scorecard", AT_REMOVEDIR) = 0
unlinkat(10, "tests", AT_REMOVEDIR) = 0
unlinkat(9, "merged", AT_REMOVEDIR) = 0
unlinkat(8, "bundle_tmp432811490", AT_REMOVEDIR) = 0
unlinkat(AT_FDCWD, "index_build_tmp268019669", 0) = -1 EISDIR (Is a directory)
unlinkat(AT_FDCWD, "index_build_tmp268019669", AT_REMOVEDIR) = -1 ENOTEMPTY (Dir
ectory not empty)
openat(AT_FDCWD, ".", O_RDONLY|O_CLOEXEC) = 7
unlinkat(7, "index_build_tmp268019669", 0) = -1 EISDIR (Is a directory)
openat(7, "index_build_tmp268019669", O_RDONLY|O_CLOEXEC) = 8
getdents64(8, [{d_ino=59085474, d_off=10, d_reclen=24, d_type=DT_DIR, d_name="."
}, {d_ino=25916239, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=7
7097622, d_off=512, d_reclen=32, d_type=DT_DIR, d_name="database"}], 8192) = 80
getdents64(8, [], 8192) = 0
unlinkat(8, "database", 0) = -1 EISDIR (Is a directory)
openat(8, "database", O_RDONLY|O_CLOEXEC) = 9
getdents64(9, [{d_ino=77097622, d_off=10, d_reclen=24, d_type=DT_DIR, d_name="."
}, {d_ino=59085474, d_off=12, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=7
7097624, d_off=512, d_reclen=32, d_type=DT_REG, d_name="index.db"}], 8192) = 80
getdents64(9, [], 8192) = 0
unlinkat(9, "index.db", 0) = 0
unlinkat(8, "database", AT_REMOVEDIR) = 0
unlinkat(7, "index_build_tmp268019669", AT_REMOVEDIR) = 0
unlinkat(AT_FDCWD, "index.Dockerfile905443632", 0) = 0
Steps
· Tries to open directory “bundle_tmp432811490/metadata” but fails with “file not found”, see https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/registry/imageinput.go#L31
· Does a recursive delete on bundle_tmp432811490 (error causes cascade of returns back to populate function which has created a “defer cleanup()” for when the populate call returns. This is defined within unpackImage found at https://github.com/operator-framework/operator-registry/blob/92a4c92168862d509841cef6fa949a1b6dec1ec9/pkg/lib/registry/registry.go#L115 )
· Following recursive delete we delete the following files/directories under “bundle_tmp432811490”:
o merged
o manifests
§ cache.example.com_memcacheds.yaml
§ memcached-operator-controller-manager-metrics-service_v1_service.yaml
§ memcached-operator-metrics-reader_rbac.authorization.k8s.io_v1_clusterrole.yaml
§ memcached-operator.clusterserviceversion.yaml
o metadata
§ annotations.yaml