firecracker-microvm / firecracker-containerd

firecracker-containerd enables containerd to manage containers as Firecracker microVMs
Apache License 2.0

Failed to extract layer from example remote snapshotter #709

Open DavidBuzatu-Marian opened 2 years ago

DavidBuzatu-Marian commented 2 years ago

Hi! I'm David, and I am interested in using firecracker-containerd with remote snapshots via the stargz snapshotter.

I followed the setup for running containers with remote snapshots, but ran into an issue when creating the task for the stargz-based image. Further details are provided below, and I am most thankful for your time!

Setup details

System

$ uname -a
Linux node-1.dbuzatu-138770.faas-sched-pg0.cloudlab.umass.edu 5.4.0-100-generic #113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04 LTS (Focal Fossa)"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

Installation

I installed firecracker-containerd from the main branch by following the steps in the quickstart guide.

Starting a Docker-based container with firecracker-containerd using the following command works as expected:

sudo firecracker-ctr --address /run/firecracker-containerd/containerd.sock \
  run \
  --snapshotter devmapper \
  --runtime aws.firecracker \
  --rm --tty --net-host \
  docker.io/library/busybox:latest busybox-test

Issue (UPDATED)

After following the remote snapshotter setup guide, running the command:

./remote-snapshotter ghcr.io/firecracker-microvm/firecracker-containerd/amazonlinux:latest-esgz

results in the following output:

Creating VM
Setting docker credential metadata
Pulling the image
Creating a container
Creating a task
failed to create shim task: failed to create task: rpc error: code = Unknown desc = OCI runtime create failed: unable to retrieve OCI runtime error (open /container/remoteSnapshotterDemo/log.json: no such file or directory): runc did not terminate successfully: exit status 2: unknown

Firecracker-containerd logs gist:

time="2022-11-11T16:42:32.581939242-05:00" level=debug msg="create VM request: VMID:\"vm1\" "
time="2022-11-11T16:42:32.582002348-05:00" level=debug msg="using namespace: vm1"
time="2022-11-11T16:42:32.582389912-05:00" level=info msg="successfully started shim (git commit: 24f1fcf99ebf6edcb94edd71a2affbcdae6b08e7)." runtime=aws.firecracker task_id=remoteSnapshotterDemo vmID=vm1
time="2022-11-11T16:42:32.584471369-05:00" level=debug msg="creating task" bundle=/run/firecracker-containerd/io.containerd.runtime.v2.task/vm1/remoteSnapshotterDemo checkpoint= runtime=aws.firecracker stderr=/run/containerd/fifo/2512387503/remoteSnapshotterDemo-stderr stdin=/run/containerd/fifo/2512387503/remoteSnapshotterDemo-stdin stdout=/run/containerd/fifo/2512387503/remoteSnapshotterDemo-stdout task_id=remoteSnapshotterDemo terminal=false vmID=vm1
time="2022-11-11T16:42:32.584717937-05:00" level=debug msg="noop operation returning shim dir for JailPath" jailer=noop runtime=aws.firecracker vmID=vm1
time="2022-11-11T16:42:32.642205537-05:00" level=debug msg="[    5.161394] agent[796]: time=\"2022-11-11T21:42:32Z\" level=info msg=create ExecID= TaskID=remoteSnapshotterDemo" jailer=noop runtime=aws.firecracker vmID=vm1 vmm_stream=stdout
time="2022-11-11T16:42:32.686956668-05:00" level=debug ExecID= TaskID=remoteSnapshotterDemo attempt=1 direction=read error="temporary vsock dial failure: vsock ack message failure: failed to read \"OK <port>\" within 1s: EOF" runtime=aws.firecracker stream=stdout vmID=vm1
time="2022-11-11T16:42:32.687087921-05:00" level=debug ExecID= TaskID=remoteSnapshotterDemo attempt=1 direction=write error="temporary vsock dial failure: vsock ack message failure: failed to read \"OK <port>\" within 1s: EOF" runtime=aws.firecracker stream=stdin vmID=vm1
time="2022-11-11T16:42:32.687120563-05:00" level=debug ExecID= TaskID=remoteSnapshotterDemo attempt=1 direction=read error="temporary vsock dial failure: vsock ack message failure: failed to read \"OK <port>\" within 1s: EOF" runtime=aws.firecracker stream=stderr vmID=vm1
time="2022-11-11T16:42:32.694295404-05:00" level=debug msg="[    5.213658] systemd[1]: tmp-containerd\\x2dmount4037457222.mount: Succeeded." jailer=noop runtime=aws.firecracker vmID=vm1 vmm_stream=stdout
time="2022-11-11T16:42:32.786220731-05:00" level=debug msg="begin copying io" ExecID= TaskID=remoteSnapshotterDemo runtime=aws.firecracker stream=stdout vmID=vm1
time="2022-11-11T16:42:32.786710465-05:00" level=debug msg="begin copying io" ExecID= TaskID=remoteSnapshotterDemo runtime=aws.firecracker stream=stdin vmID=vm1
time="2022-11-11T16:42:32.786698280-05:00" level=debug msg="begin copying io" ExecID= TaskID=remoteSnapshotterDemo runtime=aws.firecracker stream=stderr vmID=vm1
time="2022-11-11T16:42:32.787360053-05:00" level=debug msg="[    5.306653] systemd[1]: container-remoteSnapshotterDemo-rootfs.mount: Succeeded." jailer=noop runtime=aws.firecracker vmID=vm1 vmm_stream=stdout
time="2022-11-11T16:42:32.793437365-05:00" level=error error="failed to create task: rpc error: code = Unknown desc = OCI runtime create failed: unable to retrieve OCI runtime error (open /container/remoteSnapshotterDemo/log.json: no such file or directory): runc did not terminate successfully: exit status 2" runtime=aws.firecracker task_id=remoteSnapshotterDemo vmID=vm1
time="2022-11-11T16:42:32.793522671-05:00" level=debug msg="copied 0" ExecID= TaskID=remoteSnapshotterDemo runtime=aws.firecracker stream=stdin vmID=vm1

Kern-- commented 2 years ago

From the firecracker-containerd output, it looks like you're having auth issues:

failed to fetch oauth token: unexpected status: 403 Forbidden

For GHCR, the docker username is your GitHub username and docker password is a personal access token. If you haven't created an access token, you can follow https://docs.github.com/en/packages/working-with-a-github-packages-registry/working-with-the-container-registry#authenticating-to-the-container-registry. If you have, can you verify that you're passing those correctly?
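One way to check that a token is accepted by ghcr.io, independently of the demo, is a plain `docker login` from the host. The following is only a sketch: `ghcr_login`, `GITHUB_USER`, and `CR_PAT` are names assumed here for illustration, and it presumes the Docker CLI is installed. It confirms that the registry accepts the username/token pair; it does not change how the remote snapshotter demo receives its credentials, which is defined by the setup guide.

```shell
# Hypothetical helper: log in to GHCR with a personal access token
# rather than the GitHub account password.
# GITHUB_USER and CR_PAT are assumed environment variables.
ghcr_login() {
  if [ -z "${GITHUB_USER:-}" ] || [ -z "${CR_PAT:-}" ]; then
    echo "set GITHUB_USER and CR_PAT first" >&2
    return 1
  fi
  # --password-stdin keeps the token off the command line.
  printf '%s' "$CR_PAT" | docker login ghcr.io -u "$GITHUB_USER" --password-stdin
}
```

Running `ghcr_login` after exporting both variables should report a successful login if the token is valid for the registry; a rejected login points at the token rather than at firecracker-containerd.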

DavidBuzatu-Marian commented 2 years ago

Hi Kern! Ah, I was using my actual account password. However, after setting my credentials as you described, I am now getting:

Creating VM
Setting docker credential metadata
Pulling the image
Creating a container
Creating a task
failed to create shim task: failed to create task: rpc error: code = Unknown desc = OCI runtime create failed: unable to retrieve OCI runtime error (open /container/remoteSnapshotterDemo/log.json: no such file or directory): runc did not terminate successfully: exit status 2: unknown

Is this a problem with the image itself, or could it be related to my setup?

Kern-- commented 2 years ago

Do you have the firecracker-containerd logs from that most recent failure?

DavidBuzatu-Marian commented 2 years ago

Sure. I have attached them as a file, firecracker_output.txt.

EDIT: Created a gist for the logs as that might be more useful: https://gist.github.com/DavidBuzatu-Marian/b9ae479a6d605ec42f4b03ea0800f998
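For anyone reading a log like this, a rough way to cut it down to the entries that carry an error is to filter on the `level=error` and `error="..."` fields that appear in the firecracker-containerd output. This is only a sketch: `show_errors` is a hypothetical helper name, and the file name is the attachment mentioned above.

```shell
# Hypothetical helper: print only the log lines that report an error.
show_errors() {
  # Matches error-level entries as well as debug entries that carry an
  # error="..." field, such as the vsock dial retries.
  grep -E 'level=error|error="' "$1"
}
```

For example, `show_errors firecracker_output.txt` would keep the vsock dial failures and the final "failed to create task" entry while dropping the routine "begin copying io" lines.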

DavidBuzatu-Marian commented 1 year ago

Hi @Kern--. Apologies for the disturbance, but I wanted to ask if there is any update you can give me on this issue. Any pointers as to what could be wrong would be useful. I took some time to look into the IOProxy code, as it seems to be responsible for the IO operations that appear to fail, though I am not sure where the file remoteSnapshotterDemo/log.json is expected to be, as that is the one that is not found.