siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

v1.8.1 loaded ARM image instead of AMD for rook/ceph:v1.15.5 #9711

Open muhlba91 opened 2 days ago

muhlba91 commented 2 days ago

Bug Report

Description

I updated Rook Ceph from v1.15.4 to v1.15.5 and the operator is now stuck in a CrashLoopBackOff due to exec /usr/local/bin/rook: exec format error. The operator was installed via the Helm chart.

Now, the image used is docker.io/rook/ceph@sha256:b94b23ecaf32e656c2460e2420dbc328be2951c673e41b7d3f2e3fe1eddecb59, which is a multi-arch image (the same as with v1.15.4).

I suppose that Talos loaded the "wrong" architecture onto the system and tries to run an ARM image on my Intel CPU.

Looking at the CLI commands, I couldn't find a way to delete an image, hoping it would download the correct architecture.

Can I re-trigger a download of an image? All other multi-arch images are working fine.

(FYI: I have also created a bug with rook but this does not seem to be related to rook itself as I seem to be the only one affected: https://github.com/rook/rook/issues/14988)

Logs

N/A

Environment

smira commented 2 days ago

Talos Linux doesn't pull the images itself; that is done by the kubelet/CRI plugin.

I have seen some similar issues, though they seem to occur quite randomly. As a workaround, you can run crane manifest on the multi-arch image to find the digest for your architecture, then change the image ref to that specific sha.
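To illustrate the workaround, here is a minimal Python sketch that picks the per-architecture digest out of a multi-arch image index, such as the JSON that crane manifest prints for a multi-arch image. It assumes the standard OCI image index layout (a "manifests" list whose entries carry "digest" and "platform"). The function names and the sample digests are hypothetical placeholders, not values from the real rook/ceph image.

```python
import json

# Placeholder digests for illustration only; NOT the real rook/ceph digests.
AMD64_DIGEST = "sha256:" + "a" * 64
ARM64_DIGEST = "sha256:" + "b" * 64

# A cut-down image index in the shape defined by the OCI image index spec.
SAMPLE_INDEX = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.index.v1+json",
    "manifests": [
        {"digest": ARM64_DIGEST,
         "platform": {"os": "linux", "architecture": "arm64"}},
        {"digest": AMD64_DIGEST,
         "platform": {"os": "linux", "architecture": "amd64"}},
    ],
})


def digest_for_platform(index_json: str, os_name: str = "linux",
                        arch: str = "amd64") -> str:
    """Return the digest of the manifest matching os_name/arch."""
    index = json.loads(index_json)
    for entry in index.get("manifests", []):
        platform = entry.get("platform", {})
        if (platform.get("os") == os_name
                and platform.get("architecture") == arch):
            return entry["digest"]
    raise LookupError(f"no manifest for {os_name}/{arch} in index")


def pinned_ref(repository: str, digest: str) -> str:
    """Build an image reference pinned to a single-arch digest."""
    return f"{repository}@{digest}"


if __name__ == "__main__":
    # In practice, index_json would be the saved output of
    # `crane manifest <multi-arch image>`; here we use the sample above.
    digest = digest_for_platform(SAMPLE_INDEX, arch="amd64")
    print(pinned_ref("docker.io/rook/ceph", digest))
```

The resulting repository@sha256:... reference can then be set as the operator image so that the CRI pulls the amd64 manifest directly instead of resolving the multi-arch index.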