kata-containers / kata-containers

Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workload isolation and security advantages of VMs. https://katacontainers.io/
Apache License 2.0

kata-deploy failed to install runtime class for kata-fc #6945

Open wenzhaojie opened 1 year ago

wenzhaojie commented 1 year ago

Description of problem

I followed the kata-deploy guide (https://github.com/kata-containers/kata-containers/blob/main/tools/packaging/kata-deploy/README.md) to install the Kata runtime on my local Kubernetes cluster, but with this guide the kata-fc runtime fails to install.

Expected result

Following the install guide, I can use the different out-of-the-box Kata runtimes, including kata-fc.

Actual result

After kata-deploy was ready, I tried the different runtime classes with sample workloads. However, the pod using the kata-fc runtime is stuck in the ContainerCreating status.

The pod description is as below:

Warning  FailedCreatePodSandBox  7m10s (x4597 over 20h)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: failed to mount "/run/kata-containers/shared/containers/f45c5040f4c61cd50c9c7eea32e42827017528fba0cdc28de27a6b27ca9cefb0/rootfs" to "/run/kata-containers/f45c5040f4c61cd50c9c7eea32e42827017528fba0cdc28de27a6b27ca9cefb0/rootfs", with error: ENOENT: No such file or directory: unknown

Further information

I found issue https://github.com/kata-containers/kata-containers/issues/6785, which mentions a similar failure when installing the Firecracker runtime. I also noticed that https://github.com/kata-containers/kata-containers/blob/main/docs/how-to/how-to-use-kata-containers-with-firecracker.md describes the steps to install the Firecracker runtime, but that guide is not mentioned in the kata-deploy guide, which confused me.

Is the documentation out of date? Thanks a lot!

fidencio commented 1 year ago

We need to document this better, but kata-fc will only work if you use the device-mapper snapshotter of containerd.

wenzhaojie commented 1 year ago

> We need to document this better, but kata-fc will only work if you use the device-mapper snapshotter of containerd.

Thanks for your help. However, the latest Kata release binaries include a newer version of Firecracker, while the documentation says Kata Containers only supports AWS Firecracker v0.23.4 (https://github.com/kata-containers/kata-containers/pull/1519).

fidencio commented 1 year ago

> But the kata latest release binaries include firecracker with newer version.

The version that's shipped is tested, so it should work as long as you have device-mapper properly configured on your side.
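
One way to check the "properly configured" part is to look at the plugin listing from `ctr plugins ls`. The following is a minimal sketch, assuming the listing keeps its TYPE, ID, PLATFORMS, STATUS column layout; `devmapper_plugin_ok` is a hypothetical helper name, not part of containerd or Kata.

```shell
#!/bin/bash
# Sketch: decide from `ctr plugins ls` output whether the devmapper
# snapshotter plugin loaded. devmapper_plugin_ok is a hypothetical helper;
# it reads the listing on stdin so it can be checked without containerd.
devmapper_plugin_ok() {
    # Columns are assumed to be TYPE, ID, PLATFORMS, STATUS;
    # the last field on each line is taken as the status.
    awk '$1 == "io.containerd.snapshotter.v1" && $2 == "devmapper" && $NF == "ok" { found = 1 }
         END { exit(found ? 0 : 1) }'
}
```

On a node it could be used as `sudo ctr plugins ls | devmapper_plugin_ok && echo "devmapper snapshotter loaded"`. A non-zero exit means the plugin is missing from the listing or reported a status other than `ok`.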

alex-sainer commented 3 months ago

i just ran into the same problem - i deployed kata-deploy as described and configured the devicemapper as described here

but the kata-fc runtime still does not work -.-

when i run a container on that node with

ctr images pull --snapshotter devmapper docker.io/library/ubuntu:latest
sudo ctr run --snapshotter devmapper --runtime io.containerd.run.kata-fc.v2 -t --rm docker.io/library/ubuntu:latest ubuntu-kata-fc-test uname

the container starts as expected and to me it looks like everything works as expected

but when i deploy the kubernetes-example it does not work - the error i get there is:

Warning  FailedCreatePodSandBox  4m39s  kubelet            Failed to create pod sandbox: rpc error: code = NotFound desc = failed to create containerd container: error unpacking image: failed to extract layer sha256:59b1469b8fbd05fd256959ad9d7d776b9937b848d75113a0d7c1af442528b6d0: failed to get reader from content store: content digest sha256:0692f38991d53a0c28679148f99de26a44d630fda984b41f63c5e19f839d15a6: not found

any ideas what i'm doing wrong?

every-breaking-wave commented 2 months ago

I ran into the same problem as @alex-sainer while using k3s. When I use the k3s ctr command to launch a container, it works fine, but if I use kubectl to launch it, I got [image]

every-breaking-wave commented 2 months ago

@alex-sainer hey, I just solved this problem by following https://github.com/kata-containers/kata-containers/issues/8764#issuecomment-1983085837. The reason was that I had only edited [plugins."io.containerd.snapshotter.v1.devmapper"] but forgot to set snapshotter = "devmapper" in [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata-fc]
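
The check described in this comment can be scripted. This is a rough sketch, assuming the config lives at /etc/containerd/config.toml (k3s and other distributions keep it elsewhere) and that the setting is written on its own line inside the kata-fc runtime table; `kata_fc_uses_devmapper` is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: report whether the kata-fc runtime section of a containerd config
# sets snapshotter = "devmapper". The function name and the default path are
# illustrative assumptions, not something the thread specifies.
kata_fc_uses_devmapper() {
    local config="${1:-/etc/containerd/config.toml}"
    awk '
        # On every table header, note whether it is the kata-fc runtime
        # table itself (not a sub-table such as ...kata-fc.options).
        /^[[:space:]]*\[/ {
            in_section = ($0 ~ /runtimes\.kata-fc\][[:space:]]*$/)
            next
        }
        # Inside that table, look for the per-runtime snapshotter setting.
        in_section && /^[[:space:]]*snapshotter[[:space:]]*=[[:space:]]*"devmapper"/ { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$config"
}
```

For example, `kata_fc_uses_devmapper /etc/containerd/config.toml || echo "kata-fc is not pinned to devmapper"`. The function only inspects the file; containerd still has to be restarted after the config is changed.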

alex-sainer commented 2 months ago

@every-breaking-wave i've already added the snapshotter = "devmapper" and revalidated it just now, it's still not working.

which command(s) did you use to set up the devmapper / create the thinpool-devices? maybe i did something wrong there...

i'm using the following script:

#!/bin/bash
set -ex

DATA_DIR=/var/lib/containerd/io.containerd.snapshotter.v1.devmapper
POOL_NAME=containerd-pool

sudo mkdir -p ${DATA_DIR}

# Create data file
sudo touch "${DATA_DIR}/data"
sudo truncate -s 1000G "${DATA_DIR}/data"

# Create metadata file
sudo touch "${DATA_DIR}/meta"
sudo truncate -s 40G "${DATA_DIR}/meta"

# Allocate loop devices
DATA_DEV=$(sudo losetup --find --show "${DATA_DIR}/data")
META_DEV=$(sudo losetup --find --show "${DATA_DIR}/meta")

# Define thin-pool parameters.
# See https://www.kernel.org/doc/Documentation/device-mapper/thin-provisioning.txt for details.
SECTOR_SIZE=512
DATA_SIZE="$(sudo blockdev --getsize64 -q ${DATA_DEV})"
LENGTH_IN_SECTORS=$(bc <<< "${DATA_SIZE}/${SECTOR_SIZE}")
DATA_BLOCK_SIZE=128
LOW_WATER_MARK=32768

# Create a thin-pool device
sudo dmsetup create "${POOL_NAME}" \
    --table "0 ${LENGTH_IN_SECTORS} thin-pool ${META_DEV} ${DATA_DEV} ${DATA_BLOCK_SIZE} ${LOW_WATER_MARK}"

cat << EOF
#
# Add this to your config.toml configuration file and restart containerd daemon
#
[plugins."io.containerd.snapshotter.v1.devmapper"]
pool_name = "${POOL_NAME}"
root_path = "${DATA_DIR}"
base_image_size = "40GB"
EOF
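
The table line passed to dmsetup in the script above can be worked through by hand. This is a small sketch of the same arithmetic; `thin_pool_table` is a hypothetical helper and the loop device names are examples only.

```shell
#!/bin/bash
# Sketch: build the dmsetup thin-pool table line from a data size in bytes.
# thin_pool_table is a hypothetical helper, not part of any tool.
thin_pool_table() {
    local data_bytes="$1" meta_dev="$2" data_dev="$3"
    local sector_size=512        # device-mapper tables count 512-byte sectors
    local data_block_size=128    # in sectors, i.e. 64 KiB per block
    local low_water_mark=32768   # in data blocks
    local length_in_sectors=$(( data_bytes / sector_size ))
    echo "0 ${length_in_sectors} thin-pool ${meta_dev} ${data_dev} ${data_block_size} ${low_water_mark}"
}

# 1000G as created by truncate is 1000 * 2^30 bytes.
thin_pool_table $(( 1000 * 1024 * 1024 * 1024 )) /dev/loop1 /dev/loop0
# prints: 0 2097152000 thin-pool /dev/loop1 /dev/loop0 128 32768
```

Shell arithmetic is used here instead of bc, which gives the same integer result for these values.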

every-breaking-wave commented 2 months ago

@alex-sainer I used the same one as you...

scottames commented 2 months ago

I've just run into the exact same thing as you two, @every-breaking-wave & @alex-sainer.

Reproduced the same on both amd64 and arm64 bare-metal instances. Running on 1.29 EKS w/ EKS optimized Amazon AMIs.

I'm able to run the container (ubuntu, just as @alex-sainer is above) using the ctr run --snapshotter devmapper --runtime io.containerd.run.kata-fc.v2 command on the host shell. I've attempted to run the exact same container/command as a Kubernetes job, which results in the above error.