Hi @ThomasVitale, thanks for opening this issue! Sorry for getting back to it so late.
Can you please paste the logs here from the same command, but with `--trace` appended, so we can see the full trace log?
@ThomasVitale, any update on this?
FWIW, I switched the default mode back to the original `--mode tools-node`, since the new direct mode caused some issues, which may be related to yours.
Feel free to reopen if it's still present in the upcoming v5.4.0 release :+1:
@iwilltry42 I'm facing the same problem as mentioned above on macOS (Apple M1) with k3d v5.4.1
@iwilltry42 I'm seeing this issue on Ubuntu when upgrading from k3d 5.3.0 to 5.4.1.
I'm seeing output like:
```
INFO[0000] Importing image(s) into cluster '$CLUSTER_NAME'
INFO[0000] Starting new tools node...
INFO[0000] Starting Node 'k3d-$CLUSTER_NAME-tools'
INFO[0000] Saving 3 tarball(s) to shared image volume...
INFO[0000] Importing images into nodes...
INFO[0000] Importing images from tarball '/k3d/images/k3d-$CLUSTER_NAME-images-$IMAGE1.tar' into node 'k3d-$CLUSTER_NAME-server-0'...
INFO[0000] Importing images from tarball '/k3d/images/k3d-$CLUSTER_NAME-images-$IMAGE2.tar' into node 'k3d-$CLUSTER_NAME-server-0'...
INFO[0000] Importing images from tarball '/k3d/images/k3d-$CLUSTER_NAME-images-$IMAGE3.tar' into node 'k3d-$CLUSTER_NAME-server-0'...
ERRO[0001] failed to import images in node 'k3d-$CLUSTER_NAME-server-0': Exec process in node 'k3d-$CLUSTER_NAME-server-0' failed with exit code '1'
ERRO[0001] failed to import images in node 'k3d-$CLUSTER_NAME-server-0': Exec process in node 'k3d-$CLUSTER_NAME-server-0' failed with exit code '1'
ERRO[0001] failed to import images in node 'k3d-$CLUSTER_NAME-server-0': Exec process in node 'k3d-$CLUSTER_NAME-server-0' failed with exit code '1'
INFO[0001] Removing the tarball(s) from image volume...
INFO[0002] Removing k3d-tools node...
INFO[0003] Successfully imported image(s)
INFO[0003] Successfully imported 3 image(s) into 1 cluster(s)
```
One thing that jumps out: the code in `importWithToolsNode()` that logs these failures when copying images swallows the errors without returning them to the caller (https://github.com/k3d-io/k3d/blob/852df7786ab5a98b9ecd95e1b215d593cf9201d8/pkg/client/tools.go#L125-L127), which allows the "successfully imported images" result to get printed.
Compare that to this other k3d code to import images into clusters (rather than nodes), which fails if any individual image fails to import: https://github.com/k3d-io/k3d/blob/7b1b416c2298f1aa30950eaae1d2847140ee285a/cmd/image/imageImport.go#L75-L86
Or even this code in the same `importWithToolsNode()` method that returns errors if any image fails to save: https://github.com/k3d-io/k3d/blob/852df7786ab5a98b9ecd95e1b215d593cf9201d8/pkg/client/tools.go#L114-L116
Should `importWithToolsNode()` return an error whenever any image import encounters an error? That would at least mean that image import errors are reported as overall errors: it doesn't fix the import failures themselves, but it seems more appropriate than treating them as successes.
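As a side note, if the errors really are swallowed, the CLI presumably also exits 0 on failure; a quick way to check (a sketch — cluster and image names are placeholders, and I'm assuming the process exit code mirrors the returned error):

```sh
# If importWithToolsNode() swallows per-node import errors, this should
# print 0 even when ERRO lines appear in the output above:
k3d image import my-image:1.0 -c mycluster
echo "exit code: $?"
```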
I am also using an M1 and `k3d image import` fails. :( My k3d version is 5.4.1.
```
$ k3d image import foo-my:latest -c local
...
TRAC[0002] Exec process '[./k3d-tools save-image -d /k3d/images/k3d-local-images-20220504000554.tar foo-my:c82a74353357d2f11f2d0a0543cbdd9367fcd0dd9f78b03cf6fa70cf11bbc3e2]' still running in node 'k3d-local-tools'.. sleeping for 1 second...
TRAC[0003] Exec process '[./k3d-tools save-image -d /k3d/images/k3d-local-images-20220504000554.tar foo-my:c82a74353357d2f11f2d0a0543cbdd9367fcd0dd9f78b03cf6fa70cf11bbc3e2]' still running in node 'k3d-local-tools'.. sleeping for 1 second...
...
DEBU[0029] Exec process in node 'k3d-local-tools' exited with '0'
INFO[0029] Importing images from tarball '/k3d/images/k3d-local-images-20220504000554.tar' into node 'k3d-local-server-0'...
DEBU[0029] Executing command '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' in node 'k3d-local-server-0'
INFO[0029] Importing images from tarball '/k3d/images/k3d-local-images-20220504000554.tar' into node 'k3d-local-agent-0'...
...
TRAC[0029] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-server-0'.. sleeping for 1 second...
TRAC[0029] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-agent-0'.. sleeping for 1 second...
TRAC[0030] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-server-0'.. sleeping for 1 second...
TRAC[0030] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-agent-0'.. sleeping for 1 second...
TRAC[0031] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-server-0'.. sleeping for 1 second...
TRAC[0031] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-agent-0'.. sleeping for 1 second...
TRAC[0032] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-agent-0'.. sleeping for 1 second...
TRAC[0032] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-server-0'.. sleeping for 1 second...
TRAC[0033] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-agent-0'.. sleeping for 1 second...
TRAC[0033] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-server-0'.. sleeping for 1 second...
TRAC[0034] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-server-0'.. sleeping for 1 second...
TRAC[0034] Exec process '[ctr image import /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-agent-0'.. sleeping for 1 second...
...
ERRO[0035] failed to import images in node 'k3d-local-agent-0': Exec process in node 'k3d-local-agent-0' failed with exit code '1'
ERRO[0035] failed to import images in node 'k3d-local-server-0': Exec process in node 'k3d-local-server-0' failed with exit code '1'
INFO[0035] Removing the tarball(s) from image volume...
DEBU[0035] Executing command '[rm -f /k3d/images/k3d-local-images-20220504000554.tar]' in node 'k3d-local-tools'
TRAC[0035] Exec process '[rm -f /k3d/images/k3d-local-images-20220504000554.tar]' still running in node 'k3d-local-tools'.. sleeping for 1 second...
DEBU[0036] Exec process in node 'k3d-local-tools' exited with '0'
INFO[0036] Removing k3d-tools node...
DEBU[0036] Deleting node k3d-local-tools ...
TRAC[0036] [Docker] Deleted Container k3d-local-tools
INFO[0036] Successfully imported image(s)
INFO[0036] Successfully imported 1 image(s) into 1 cluster(s)
```
and then, when I deployed this image to the k3d cluster, I got the error `Error: failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:c82a74353357d2f11f2d0a0543cbdd9367fcd0dd9f78b03cf6fa70cf11bbc3e2: not found`
Also, my Docker image architecture is `"Architecture": "amd64"`, built by Paketo (buildpacks). It runs fine in Docker Desktop, but just doesn't work in the k3d cluster.
@ethanttbui, @CodingCanuck & @heesuk-ahn, please follow along in the new issue #1072
I have the same problem... Any update on this topic?
To add some info, I have this happening to me and executed the following (some stuff altered for privacy):
```
k3d image import --cluster local --trace my-registry.me.com/app/app:2.8.3 --keep-tarball
```
The output I got was:
```
unpacking my-registry.me.com/app/app:2.8.3 (sha256:9379e04f6e56bf94db2d35f429dbf98cdcf8150a719b98face34981cec3ec23b)...ctr: content digest sha256:889bf72e765011f62f49d586fa4e24d42a865b11a676e612b162c24e9448181b: not found
```
In my mind this at least points to the tarball not being produced correctly, so I decided to compare the output of `k3d-tools save-image` with something else.
```
./k3d-tools save-image -d /k3d/images/k3d-local-images-temp.tar my-registry.me.com/app/app:2.8.3
```
and unpacked the tar, which contained the following:
```
0d1e54d7b02115dcb3090315577baaccb869554c3ed505c64a455e581e13a57e
1288696addccc4013c5bcf61c1b6c38128a7214a0942976792918b51912d90f7
2b5a543f4bcd9d542c2b6d47951f3ef1fbca037cb2a2b5bd45694dcd6a673b5b
2d13c582c25d67b636cd6289ec75b79e746c73c08c6d96d2deff17a4c55ea492.json
32264cb072bcfb1a6ededa41733450c4a2e72791419da04339c4b2b29931d76c
6c243dbf5e2a5d2dee49003e9be5438030648635f82872b48be52e651a41ff07
895c913e1e487191387c85cd2facafc8b1a94404ea2a036fb562db42150575ec
manifest.json
repositories
```
Saving the image directly on my Mac and then copying it into the volume and manually importing it (`ctr image import`) failed with the same error. Then I tried saving the image on my Mac again, but this time using the hash reference (e.g. `docker save -o output.tar 2d13c582c25d`), and that was imported successfully.
When I inspected the hash-saved-image tar, ran sha1sums on all the contents, and compared with the one saved by k3d, I found only 2 differences: the hash-saved tar had `"RepoTags":null` (while the one from the k3d save had `"RepoTags":["my-registry.me.com/app/app:2.8.3"]`), and it had no `repositories` file.
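For anyone wanting to reproduce the comparison, something like this should surface the difference without fully unpacking the tars (a sketch — the tarball filenames are placeholders and `jq` being available is an assumption):

```sh
# Print RepoTags from each tarball's manifest:
tar -xOf docker-saved-by-hash.tar manifest.json | jq '.[0].RepoTags'   # -> null
tar -xOf k3d-local-images-temp.tar manifest.json | jq '.[0].RepoTags'  # -> ["my-registry.me.com/app/app:2.8.3"]
# The hash-saved tar also lacks the legacy 'repositories' file:
tar -tf docker-saved-by-hash.tar | grep -c '^repositories$'            # -> 0
```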
I then exec-ed into the tools node again, made those 2 modifications (deleted `repositories` and set `RepoTags` to null), and placed a tar of the outcome in `/k3d/images/test1.tar`. Finally, I ran `docker exec -it k3d-local-server-0 ctr image import /k3d/images/test1.tar` and that succeeded.
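Roughly, the manual experiment was (a sketch — exact paths, node names, and having `jq` available inside the tools node are assumptions):

```sh
# Inside the k3d-local-tools node (it mounts the shared /k3d/images volume):
mkdir /tmp/img && tar -xf /k3d/images/k3d-local-images-temp.tar -C /tmp/img
rm /tmp/img/repositories                             # modification 1: delete 'repositories'
jq '.[0].RepoTags = null' /tmp/img/manifest.json > /tmp/manifest.json \
  && mv /tmp/manifest.json /tmp/img/manifest.json    # modification 2: RepoTags -> null
tar -cf /k3d/images/test1.tar -C /tmp/img .

# Back on the host, import the modified tarball into the server node:
docker exec -it k3d-local-server-0 ctr image import /k3d/images/test1.tar
```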
Now, I don't have enough knowledge about the image format to understand why this causes the failure, but it at least seems to be the cause. Happy to provide more info or do more tests; this happens quite frequently on my end.
Edit: Although ctr imports the image, when I run `ctr i list` inside the server it prints an error message at the top:
```
ERRO[0000] failed resolving platform for image sha256:2d13c582c25d67b636cd6289ec75b79e746c73c08c6d96d2deff17a4c55ea492 error="content digest sha256:2d13c582c25d67b636cd6289ec75b79e746c73c08c6d96d2deff17a4c55ea492: not found"
REF
```
The image is amd64.
@iwilltry42 how do we reopen this? Still observing this on 5.4.6. Thanks
ran into this as well, thought i'd share some more findings.
very relevant fact here is that i'm running this on an M1 mac.
```
k3d version v5.4.4
k3s version v1.23.8-k3s1 (default)
```
```
/ # ctr version
Client:
  Version:  v1.5.13-k3s1
  Revision:
  Go version: go1.17.5

Server:
  Version:  v1.5.13-k3s1
  Revision:
  UUID: dc205011-e667-416a-9c8e-c7ba88eb82c8
```
fwiw i suspect this issue is related: https://github.com/containerd/containerd/issues/6441, particularly the in-depth explanation here (https://github.com/containerd/containerd/issues/6441#issuecomment-1098609359):

> So import is matching both linux/amd64 and linux/386, but since the image was pulled and exported for only the linux/amd64 platform, import cannot find the necessary content for linux/386 platform.
> Since there seems to be inconsistency between ctr import and ctr export as far as platforms goes, which would be correct? In this case, the image is exported for just linux/amd64 but import expects linux/amd64 and linux/386.
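to check which platform a local image (and therefore its exported tarball) actually carries versus what the nodes run, you can ask docker directly (a sketch — image and node names are placeholders):

```sh
# what platform was the image pulled/built for?
docker image inspect --format '{{.Os}}/{{.Architecture}}' gcr.io/example-repo/image
# what architecture do the k3d 'nodes' actually run?
docker exec k3d-local-server-0 uname -m
```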
fixes were identified for this and merged into containerd at the end of august (https://github.com/containerd/containerd/pull/6906) and the beginning of november (https://github.com/containerd/containerd/pull/7615).
i tried updating `k3d` to see if i could get a release of `ctr` that includes the fixes -
```
k3d version v5.4.6
k3s version v1.24.4-k3s1 (default)
```
```
/ # ctr version
Client:
  Version:  v1.6.6-k3s1
  Revision:
  Go version: go1.18.1

Server:
  Version:  v1.6.6-k3s1
  Revision:
  UUID: 5afae041-3b63-41d7-8dc3-c011fcc6390d
```
unfortunately i still got the same errors after updating and following the steps outlined below:
```
{&ContainerStateWaiting{Reason:CreateContainerError,Message:failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:767c499cb2f8d13b940afeb98edd0fe91505b0d6993a4820b0a1fc9a58a11cb2: not found,} nil nil}
```
it looks like containerd 1.6.6 was released in june, which would predate the above fixes so that makes sense then.
update: bumped the `rancher/k3s` image version up in my cluster config file to https://github.com/k3s-io/k3s/releases/tag/v1.24.7%2Bk3s1, which includes containerd v1.6.8-k3s1. i think the PRs including the fix for this would be covered by that, but it's possible i've misread that... either way, still no luck - `k3d image import` doesn't work for this version either, and when doing it manually the digests still seem to be subject to the manifest mismatch (`failed to create containerd container: error unpacking image: failed to resolve rootfs`) 😢.
i was getting the same error messages as explained above. `k3d image import` would fail and/or attempts to deploy would return an image pull error:
```
ERRO[0030] failed to import images in node 'k3d-local-agent-2': Exec process in node 'k3d-local-agent-2' failed with exit code '1'
INFO[0030] Removing k3d-tools node...
INFO[0030] Successfully imported image(s)
INFO[0030] Successfully imported 1 image(s) into 1 cluster(s)
```
given the aforementioned issue, i wanted to confirm whether the problem was in fact stemming from discrepancies in underlying architecture. first i ran an image import to ensure the tarball persists locally:
```
k3d image import --cluster local-cluster-name --trace gcr.io/private-regstry/app:2.8.3 --keep-tarball
```
i made a note of the logs that show the full `ctr` cmd being run to import the image in the trace logs, i.e.
```
TRAC[0028] Exec process '[ctr image import /k3d/images/local-cluster-name-xxx-images-20221202162620.tar]' still running in node 'local-cluster-name-agent-2'.. sleeping for 1 second...
```
i then ran `docker exec` to get a shell in the 'master node' container and `cd`'d into `/k3d/images`. here, i re-ran the `ctr image import` command but made sure to include the `--all-platforms` flag.
from the help text:
```
--all-platforms    imports content for all platforms, false by default
```
```
/k3d/images # ctr image import --all-platforms local-cluster-name-images-20221202162620.tar
unpacking gcr.io/xxxx/xxxx:(sha256:xxxxx)...done
```
unlike before, now it unpacks - little victories 😄...
i then ran `ctr images list | grep 'your-image-name'` to confirm it unpacked. that, unfortunately, is where the success ends going this route. though you can kick off a deployment and get past the prior error(s) related to the image not being pull-able, you will then hit an error similar to the following:
```
- dev:pod/app-6dc6fc866-lm2w6: container xxx in error: &ContainerStateWaiting{Reason:CreateContainerError,Message:failed to create containerd container: error unpacking image: failed to resolve rootfs: content digest sha256:767c499cb2f8d13b940afeb98edd0fe91505b0d6993a4820b0a1fc9a58a11cb2: not found,}
```
when running `ctr images list` to get the digest for the same problematic image, as you may have guessed, the digests are different:
```
gcr.io/xxxx/xxx application/vnd.docker.distribution.manifest.v2+json sha256:19cbc744b63eb4b1447401c775dc33a09f141f09b9a0c29632e008ead05c8e43 493.0 MiB linux/amd64
```
the above error (`failed to resolve rootfs: content digest`) is referenced in the following issues: https://github.com/containerd/containerd/issues/1498 and https://github.com/containerd/containerd/pull/1506, which concern a feature long-released in containerd for multi-arch unpacking. the problem here doesn't seem to be a lack of support for unpacking multi-arch builds, though - more so that the digests aren't matching.
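one way to see the mismatch concretely is to compare the digest the error asks for against what's actually in containerd's content store (a sketch — node name, image filter, and the `k8s.io` namespace are assumptions based on the commands above):

```sh
# digest containerd advertises for the image:
docker exec local-cluster-name-server-0 ctr --namespace=k8s.io images list | grep app
# does the digest from the error message exist in the content store?
docker exec local-cluster-name-server-0 ctr --namespace=k8s.io content ls | grep 767c499cb2f8
```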
i noticed `ctr images import` has a `--digests` flag:
```
--digests    whether to create digest images (default: false)
```
so that seemed like the next thing to try...
since `k3d image import` doesn't offer a way to include `all-platforms` or `digests`, i searched for whether there was a workaround for that. (credit to https://github.com/kubernetes-sigs/kind/issues/2402#issuecomment-1056734295 for pointing me in this direction)
before doing this it's necessary to delete the old attempts in the local registry if you tried manually unpacking it via `ctr` on the node, etc.
to import an image, run the following instead of `k3d image import`:
```
docker save gcr.io/example-repo/image | docker exec --privileged -i k3d-local-server-node-example ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -
```
this is almost the same approach as what i was doing above, except i'm skipping the `k3d image import` step, using `docker save`, and adding the `--digests` and `--snapshotter` flags.
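note that the image needs to reach every node that might schedule your pod; a loop over the cluster's node containers could look like this (a sketch - the `k3d-<cluster>-<role>-<n>` container naming and the cluster name 'local' are assumptions):

```sh
# stream the image into every server/agent node of cluster 'local':
for node in $(docker ps --format '{{.Names}}' | grep -E '^k3d-local-(server|agent)-'); do
  docker save gcr.io/example-repo/image \
    | docker exec --privileged -i "$node" \
        ctr --namespace=k8s.io images import --all-platforms --digests --snapshotter=overlayfs -
done
```

as far as i understand, the `--namespace=k8s.io` part matters because that's the containerd namespace the kubelet pulls from, so images imported into ctr's default namespace won't be visible to pods.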
in theory, this ought to make your image(s) available and i suspect the above could even function as a workaround for some, but that assumes the source image has multi-platform builds available to be saved in the first place.
unfortunately, this was not the case for me, as the image(s) in question don't currently have builds compatible with the underlying architecture of the `k3d` 'nodes'. if this is you as well, you can expect to get a back-to-square-one error of:
```
DEBU[0046] marking resource failed due to error code STATUSCHECK_IMAGE_PULL_ERR subtask=-1 task=Deploy
- dev:deployment/xxx: container xx is waiting to start: gcr.io/example-repo/x can't be pulled
```
the workaround to the workaround might be rebuilding the affected image(s) with the `--platform` flag, i.e. `docker build --platform <whatever the output of $(uname -m) is on a k3d node>`, but i've not had an opportunity to try this yet.
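for reference, that rebuild would look something like this (a sketch — the image name and build context are placeholders, and it assumes your base images publish a variant for the node's architecture):

```sh
# find the nodes' architecture, then rebuild the image to match it:
docker exec k3d-local-server-0 uname -m      # e.g. 'aarch64' on an M1 host
docker build --platform linux/arm64 -t gcr.io/example-repo/app:2.8.3 .
```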
What did you do
How was the cluster created?
```
k3d cluster create mycluster
```
What did you do afterwards?
```
k3d image import my-image:1.0 -c mycluster
k3d image import my-image:1.0 -c mycluster -m direct
```
What did you expect to happen
I expected the image to be loaded correctly, but it wasn't. Initially, I considered whether it was a problem with the containerd CLI when loading arm64 images (similar to https://github.com/kubernetes-sigs/kind/issues/2549), but it fails consistently also with images built specifically for amd64.
Screenshots or terminal output
The same error is thrown when running any of the previous commands. The final message says the image has been imported correctly, even though an error is thrown. Therefore, I tried running the image as a Pod, and it fails as follows.
Which OS & Architecture
Which version of `k3d`
Which version of docker