containerd / accelerated-container-image

A production-ready remote container image format (overlaybd) and snapshotter based on block-device.
Apache License 2.0

obdconverted image fails to run for me #76

Closed shahzzzam closed 3 years ago

shahzzzam commented 3 years ago

Hi,

I have been following the documentation to convert an OCI image into an overlaybd-friendly image, based on https://github.com/alibaba/accelerated-container-image/blob/main/docs/EXAMPLES.md#convert-oci-image-into-overlaybd

But I get the following error when trying to run it. Note that instead of localhost:5000/redis:6.2.1_obd, I use myreg.azurecr.io/test/redis:6.2.1. It probably shouldn't make any difference?

ctr run --net-host --snapshotter=overlaybd --rm -t myreg.azurecr.io/test/redis:6.2.1 demo
ctr: failed to prepare extraction snapshot "extract-164412284-SC8e sha256:23e0fe431efc04eba59e21e54ec38109f73b5b5df355234afca317c0b32f7b0e": failed to attach and mount for snapshot 33: failed to mount /dev/sdh to /var/lib/overlaybd/snapshots/33/block/mountpoint: read-only file system: unknown

What should I check? The output

Environment:

root@agentpool1:/var/lib/waagent# ctr plugin ls | grep overlaybd
io.containerd.snapshotter.v1    overlaybd                -              ok

root@agentpool1:/var/lib/waagent# ctr snapshot --snapshotter overlaybd ls
KEY PARENT KIND

root@agentpool1:/var/lib/waagent# ctr images ls
REF                                         TYPE                                                      DIGEST                                                                  SIZE     PLATFORMS                                                                                               LABELS
myreg.azurecr.io/test/redis:6.2.1           application/vnd.docker.distribution.manifest.v2+json      sha256:d448b24bc45ae177ba279d04ea53ec09421dd5bee66b887d3106e0d380d6cc6b 65.0 MiB linux/amd64                                                                                             -
registry.hub.docker.com/library/redis:6.2.1 application/vnd.docker.distribution.manifest.list.v2+json sha256:08e282682a708eb7f51b473516be222fff0251cdee5ef8f99f4441a795c335b6 36.9 MiB linux/386,linux/amd64,linux/arm/v5,linux/arm/v7,linux/arm64/v8,linux/mips64le,linux/ppc64le,linux/s390x -
liulanzheng commented 3 years ago

I could not reproduce this failure. You may try pushing the converted image to a registry, pulling it with rpull, and then running a container to see whether it works. If it does not, please provide your containerd and OS versions.

shahzzzam commented 3 years ago

Btw, the suggested way to push would be ctr image push -u "<creds>" myreg.azurecr.io/test/redis:6.2.1?

I tried to push and rpull as you suggested, and it fails again with the following error:

ctr: failed to attach and mount for snapshot 40: failed to mount /dev/sdc to /var/lib/overlaybd/snapshots/40/block/mountpoint: read-only file system: unknown

When I tried to run this image using k8s, I got the following error:

error="failed to create containerd task: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: \"docker-entrypoint.sh\": executable file not found in $PATH: unknown

It seems to me that the image that is converted is somehow corrupted.

Note that the same machine was successfully able to run the registry.hub.docker.com/overlaybd/redis:6.2.1_obd. So for whatever reason, converting it fails for me. What do you suggest?

containerd: v1.4.1 ("https://mobyartifacts.azureedge.net/moby/moby-containerd/1.4.1+azure/bionic/linux_amd64/moby-containerd_1.4.1+azure-1_amd64.deb")
OS: Ubuntu 18.04.5 LTS (GNU/Linux 5.4.0-1055-azure x86_64)

shahzzzam commented 3 years ago

I also tried a stock containerd v1.4.1 and even used the localhost:5000 example from your page.

VERSION=1.4.1
wget https://github.com/containerd/containerd/releases/download/v${VERSION}/cri-containerd-cni-${VERSION}-linux-amd64.tar.gz

I suspect some low-level library might be incompatible with the Azure Linux kernel. Error:

ctr run --net-host --snapshotter=overlaybd --rm -t localhost:5000/redis:6.2.1_obd demo
ctr: failed to prepare extraction snapshot "extract-802112965-hKE9 sha256:efaff7faaadccc90305dc329108266c0a01483d205e58798ef0413bcaa6f674a": failed to attach and mount for snapshot 10: failed to mount /dev/sde to /var/lib/overlaybd/snapshots/10/block/mountpoint: read-only file system: unknown

Command History: image

liulanzheng commented 3 years ago

I will try Azure Linux; it may take a little longer.

shahzzzam commented 3 years ago

Did you get a chance to try? I used the following commands, in case you want to reproduce the issue:

You can provision the following VM in Azure: operating system Linux (Ubuntu 20.04), size Standard D2ds_v4 (2 vCPUs, 8 GiB memory).

Please note that you need to change the anonymous pull setting of your registry because of the following auth issue: https://github.com/alibaba/overlaybd/issues/58

You can do so using az:

az acr update --name myregistry --anonymous-pull-enabled false

sudo apt update
sudo apt install -y pkg-config libgflags-dev libcurl4-openssl-dev libssl-dev libaio-dev libnl-3-dev libnl-genl-3-dev libglib2.0-dev
sudo apt install -y make cmake g++ gcc
wget https://github.com/google/googletest/archive/refs/tags/release-1.10.0.tar.gz
tar -zxvf release-1.10.0.tar.gz
cd googletest-release-1.10.0/
cmake CMakeLists.txt
make
sudo make install
cd ..
git clone https://github.com/alibaba/overlaybd.git
cd overlaybd
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Debug -DBUILD_TESTING=1
make -j8
sudo make install
sudo systemctl enable /opt/overlaybd/overlaybd-tcmu.service
sudo systemctl start overlaybd-tcmu
cd ../..
curl -OL https://golang.org/dl/go1.17.2.linux-amd64.tar.gz
sudo tar -C /usr/local/ -xvf go1.17.2.linux-amd64.tar.gz
sudo echo 'export PATH=$PATH:/usr/local/go/bin' >> ~/.profile
source ~/.profile
moby_runc_package_url=https://packages.microsoft.com/ubuntu/20.04/prod/pool/main/m/moby-runc/moby-runc_1.0.2%2Bazure-1_amd64.deb
moby_runc_package_file="./moby-runc.deb"
curl -sSL $moby_runc_package_url -o $moby_runc_package_file
sudo dpkg --force-all -i $moby_runc_package_file
moby_containerd_package_url=https://packages.microsoft.com/ubuntu/20.04/prod/pool/main/m/moby-containerd/moby-containerd_1.5.7%2Bazure-1_amd64.deb
moby_containerd_package_file="./moby-containerd.deb"
curl -sSL $moby_containerd_package_url -o $moby_containerd_package_file
sudo dpkg --force-all -i $moby_containerd_package_file
cd accelerated-container-image
make
sudo mkdir /etc/overlaybd-snapshotter
sudo cat <<-EOF | sudo tee /etc/overlaybd-snapshotter/config.json
    {
    "root": "/var/lib/containerd/io.containerd.snapshotter.v1.overlaybd",
    "address": "/run/overlaybd-snapshotter/overlaybd.sock"
    }
EOF

sudo mkdir /etc/containerd
sudo cat <<-EOF | sudo tee --append /etc/containerd/config.toml
    [proxy_plugins.overlaybd]
    type = "snapshot"
    address = "/run/overlaybd-snapshotter/overlaybd.sock"
EOF

sudo bin/overlaybd-snapshotter

(from another terminal)

sudo ctr content fetch registry.hub.docker.com/library/redis:6.2.1
sudo bin/ctr obdconv registry.hub.docker.com/library/redis:6.2.1 nonpe.azurecr.io/overlaybd/redis:1
sudo ctr image push nonpe.azurecr.io/overlaybd/redis:1 -u "nonpe:<password>"
sudo ctr image rm nonpe.azurecr.io/overlaybd/redis:1
sudo bin/ctr rpull nonpe.azurecr.io/overlaybd/redis:1
sudo ctr run --net-host --snapshotter=overlaybd --rm -t  nonpe.azurecr.io/overlaybd/redis:1 demo
ctr: failed to attach and mount for snapshot 14: failed to mount /dev/sdc to /var/lib/containerd/io.containerd.snapshotter.v1.overlaybd/snapshots/14/block/mountpoint: read-only file system: unknown
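
For anyone hitting the same mount failure, one way to start narrowing it down is to pull the block device name out of the error and inspect that device. The sketch below is only an illustration: `device_from_mount_error` is a hypothetical helper, and it assumes the error keeps the "failed to mount <device> to <mountpoint>" wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: extract the block device from a ctr error of the
# form "failed to mount <device> to <mountpoint>: ...".
device_from_mount_error() {
    printf '%s\n' "$1" | sed -n 's/.*failed to mount \([^ ]*\) to .*/\1/p'
}

err='ctr: failed to attach and mount for snapshot 14: failed to mount /dev/sdc to /var/lib/containerd/io.containerd.snapshotter.v1.overlaybd/snapshots/14/block/mountpoint: read-only file system: unknown'
dev=$(device_from_mount_error "$err")
echo "$dev"

# Possible follow-up checks (need root and a device that is still attached):
#   lsblk "$dev"              # is the device still present?
#   blockdev --getro "$dev"   # prints 1 if the device is read-only
#   dmesg | tail -n 50        # recent kernel messages
```

The follow-up commands are left as comments because they only make sense on the affected machine while the device is attached.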
liulanzheng commented 3 years ago

@shahzzzam Thank you, I have reproduced the problem. I'm trying to fix it.

liulanzheng commented 3 years ago

@shahzzzam It is caused by data not being synced after umount is called during the conversion process, which results in data loss. The errors can be found in dmesg. I tried adding a sync after umount, and the conversion then worked without errors in dmesg. But this is not a general solution, because umount should complete all pending writes by itself. So it may be a bug in a specific environment. It can also be fixed by apt upgrade; I saw several packages being upgraded, but I'm not sure which one fixes it.
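
The explanation above can be sketched in shell, purely as an illustration. Both helpers are hypothetical: the grep patterns are guesses at what storage errors typically look like in dmesg, not messages quoted from this issue, and `umount_and_sync` only mirrors the workaround described above rather than the actual change made in the converter.

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel-log lines that look like block or
# filesystem errors. Reads from stdin; the patterns are illustrative guesses.
filter_storage_errors() {
    grep -Ei 'i/o error|ext4-fs error|buffer i/o' || true
}

# Hypothetical helper: unmount, then force a flush of pending writes.
# umount is expected to flush on its own; the extra sync is only the
# workaround described above.
umount_and_sync() {
    umount "$1" && sync
}

# Usage (needs root):
#   dmesg | filter_storage_errors
#   umount_and_sync /path/to/mountpoint
```

An empty result from the filter does not prove the conversion was clean; it only means none of the guessed patterns matched.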

liulanzheng commented 3 years ago

@shahzzzam I saw that my kernel was updated from 5.8.0-1042-azure to 5.8.0-1043-azure during apt upgrade. I think the bug is fixed in the new kernel.
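
A quick way to compare the running kernel against that version is sketched below. `kernel_at_least` is a hypothetical helper that relies on GNU `sort -V`. Treating 5.8.0-1043-azure as the first fixed build is an assumption based only on the observation above, and the comparison is only meaningful within the same kernel series (a 5.4 kernel will always compare as older).

```shell
#!/bin/sh
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort), available on Ubuntu.
kernel_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# 5.8.0-1043-azure is the version reported after the upgrade; whether it is
# the first build containing the fix is an assumption.
if kernel_at_least "$(uname -r)" "5.8.0-1043-azure"; then
    echo "kernel is at or past 5.8.0-1043-azure"
else
    echo "kernel is older than 5.8.0-1043-azure"
fi
```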