rancher / rke2

https://docs.rke2.io/
Apache License 2.0

Installing RKE2 on ARM64, getting containerd for AMD64 - exec/format error #4954

Closed pascal71 closed 9 months ago

pascal71 commented 9 months ago

Environmental Info: RKE2 Version:

rke2 version v1.28.1+rke2r1 (4cc154f0e632a399094bb9843175f66670242ad6) go version go1.20.7 X:boringcrypto

Node(s) CPU architecture, OS, and Version:

1 masternode, ARM64, Ubuntu 22.04

Cluster Configuration:

1 master (not yet up-n-running)

Describe the bug:

After systemctl enable rke2-server, the service doesn't come up:

Oct 25 14:03:53 rock5b-n09 rke2[82477]: containerd: fork/exec /var/lib/rancher/rke2/data/v1.28.1-rke2r1-bf0a82f35177/bin/containerd: exec format error

Steps To Reproduce:

set -x
HTTPS_PROXY=http://k8sc903lb01.spikweien08.nest:3128 curl -sfL https://get.rke2.io | sudo HTTPS_PROXY=http://k8sc903lb01.spikweien08.nest:3128 INSTALL_RKE2_VERSION="v1.28.2+rke2r1" sh -s -- --system-default-registry privreg-n01.spikweien08.nest

sudo mkdir -p /etc/rancher/rke2
sudo cp registries.yaml /etc/rancher/rke2

sudo systemctl enable rke2-server.service --now
mkdir -p ~/.kube
sudo cp /etc/rancher/rke2/rke2.yaml ~/.kube/config
sudo chown ${USER}:${USER} ~/.kube/config
echo "Agents can be joined with node-token:"
sudo cat /var/lib/rancher/rke2/server/node-token
echo

By the way, the same procedure works on x86-64 nodes.

Expected behavior:

RKE2 nicely installs on ARM64 architectures

Actual behavior:

RKE2 doesn't install because containerd fails to start (exec format error): the binary is not the intended ARM64 build of containerd.
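An "exec format error" usually means the kernel was handed a binary built for a different CPU architecture. One way to confirm this is to read the e_machine field of the ELF header. The following is a minimal sketch, not part of RKE2: the function name elf_arch is made up for illustration, and the architecture names it prints follow the amd64/arm64 convention used in image manifests rather than the output of uname -m.

```shell
#!/bin/sh
# Hypothetical helper: print the CPU architecture an ELF binary targets.
elf_arch() {
    file=$1
    # Bytes 0-3 must be the ELF magic number: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" 2>/dev/null | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte 5 (EI_DATA) gives the byte order: 1 = little endian, 2 = big endian.
    data=$(od -An -tu1 -j5 -N1 "$file" | tr -d ' \n')
    # Bytes 18-19 hold e_machine, a 16-bit architecture code.
    set -- $(od -An -tu1 -j18 -N2 "$file")
    if [ "$data" = "2" ]; then
        machine=$(( $1 * 256 + $2 ))
    else
        machine=$(( $2 * 256 + $1 ))
    fi
    case $machine in
        62)  echo amd64 ;;   # EM_X86_64
        183) echo arm64 ;;   # EM_AARCH64
        243) echo riscv ;;   # EM_RISCV (32- and 64-bit share this code)
        22)  echo s390 ;;    # EM_S390 (s390 and s390x share this code)
        *)   echo "unknown($machine)" ;;
    esac
}
```

On the affected node, running elf_arch against the containerd path from the log line would be expected to print amd64 if the wrong image was unpacked; the file utility reports the same information where it is installed.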

Additional context / logs:

pascal71 commented 9 months ago

Same issue on: 1.28.2

Oct 25 14:19:18 rock5b-n09 rke2[86109]: containerd: fork/exec /var/lib/rancher/rke2/data/v1.28.2-rke2r1-7a91505b83c3/bin/containerd: exec format error

brandond commented 9 months ago

--system-default-registry privreg-n01.spikweien08.nest

Oct 25 14:19:18 rock5b-n09 rke2[86109]: containerd: fork/exec /var/lib/rancher/rke2/data/v1.28.2-rke2r1-7a91505b83c3/bin/containerd: exec format error

It sounds like someone pushed the amd64 image to your private registry, instead of pushing the multiarch manifest list. Please ensure that you are using tools that are platform-aware when copying the image to your private registry.
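A rough way to tell the two cases apart is to look at the media type of the raw manifest: a multi-arch tag is a manifest list (or OCI image index) that enumerates platforms, while a single-arch push is a plain image manifest. The sketch below is only a text-matching approximation, and the function name manifest_platforms is made up for illustration; a real JSON parser such as jq would be more robust.

```shell
#!/bin/sh
# Hypothetical helper: read a raw manifest (JSON) on stdin and print one
# os/architecture pair per line, or a notice if it is not a manifest list.
manifest_platforms() {
    raw=$(cat)
    if printf '%s\n' "$raw" |
        grep -Eq 'manifest\.list\.v2\+json|image\.index\.v1\+json'; then
        # Strip whitespace so each platform object sits on one line,
        # then pull out the architecture and os values separately.
        printf '%s\n' "$raw" | tr -d ' \n\t' |
            grep -o '"platform":{[^}]*}' |
            while IFS= read -r p; do
                arch=$(printf '%s\n' "$p" | sed -n 's/.*"architecture":"\([^"]*\)".*/\1/p')
                os=$(printf '%s\n' "$p" | sed -n 's/.*"os":"\([^"]*\)".*/\1/p')
                echo "$os/$arch"
            done
    else
        echo "single-platform manifest (not a manifest list)"
    fi
}
```

It could be fed from the registry in question, for example: skopeo inspect --raw docker://privreg-n01.spikweien08.nest/rancher/rke2-runtime:v1.28.2-rke2r1 | manifest_platforms. If linux/arm64 is missing from the output, ARM64 nodes cannot get a matching image from that tag.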

You can compare the output of inspecting the upstream tag with that of the tag on your registry. What do you get from running the following?

skopeo inspect --raw docker://privreg-n01.spikweien08.nest/rancher/rke2-runtime:v1.28.2-rke2r1

This is what the upstream tag returns:

brandond@dev01:~$ skopeo inspect --raw docker://docker.io/rancher/rke2-runtime:v1.28.2-rke2r1
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
  "manifests": [
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 529,
      "digest": "sha256:5977c069171417b4f6ead44e0f6b3bae6b0d64a73a597a0f96ce5d254609839d",
      "platform": {
        "architecture": "amd64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 528,
      "digest": "sha256:47408a98c2a688a3dcf32802adbe1f7960f541e18f6aba336cabe464868dce92",
      "platform": {
        "architecture": "arm64",
        "os": "linux"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 949,
      "digest": "sha256:8a69d8ff0395dab57265c533b974ecef59ea704867932730b59a797d1be89b35",
      "platform": {
        "architecture": "amd64",
        "os": "windows"
      }
    },
    {
      "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
      "size": 528,
      "digest": "sha256:512b2b8949ad71943d627df24e409a0aa795c1b02431a439ab25356a554dcc1b",
      "platform": {
        "architecture": "s390x",
        "os": "linux"
      }
    }
  ]
}

pascal71 commented 9 months ago

I guess that someone is probably me, using Podman. Let me see whether Skopeo keeps the different manifests together instead of splitting them.

pascal71 commented 9 months ago

and of course thank you for your analysis

pascal71 commented 9 months ago

Correct; the problem was in the images I copied using Podman. Even with Skopeo it doesn't work out of the box: you need to pass the --all parameter to skopeo copy so that every image in the manifest list is copied, not just the one for the local platform.

E.g:

sudo skopeo copy --all docker://docker.io/${dst_image} docker://$PRIVATE_REGISTRY/${dst_image}
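For more than one image, the command above can be wrapped in a loop. This is a minimal sketch under a few assumptions: the function name mirror_images and the SKOPEO override variable are made up for illustration, and the image names are expected one per line on stdin without a registry prefix, as in the ${dst_image} example.

```shell
#!/bin/sh
# Hypothetical helper: mirror every image named on stdin (one per line)
# to $PRIVATE_REGISTRY, copying all platforms of each manifest list.
# Set SKOPEO=echo for a dry run that only prints the arguments.
mirror_images() {
    : "${PRIVATE_REGISTRY:?set PRIVATE_REGISTRY to the target registry host}"
    while IFS= read -r image; do
        # Skip blank lines and comment lines.
        case $image in
            ''|'#'*) continue ;;
        esac
        # </dev/null keeps the copy command from consuming the image list.
        ${SKOPEO:-skopeo} copy --all \
            "docker://docker.io/${image}" \
            "docker://${PRIVATE_REGISTRY}/${image}" </dev/null || return 1
    done
}
```

A call might look like PRIVATE_REGISTRY=privreg-n01.spikweien08.nest mirror_images < images.txt, where images.txt is a hypothetical file listing the images to copy. The loop stops at the first failed copy so that a partial mirror is noticed early.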

Thanks for the support; I will leave this here as documentation for anyone else running into this issue. Next step: RISCV64