Closed rsmitty closed 10 months ago
> It should be noted that with Talos, these system extensions run as a container and I'm mounting /dev, /var, and /run into this container.
@rsmitty How is the mount propagation configured? All mount events under /var/lib/containerd-stargz-grpc need to be shared with containerd's namespace, so something like rshared is needed for the bind mount (docker example).
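As a concrete illustration of the "docker example" being referred to, running the snapshotter in a plain Docker container would need the bind mount's propagation set explicitly. The image name below is a hypothetical placeholder, not the actual Talos setup:

```shell
# Illustrative sketch only: bind-mount the snapshotter state directory with
# rshared propagation so FUSE mounts created inside the container propagate
# back to the host mount namespace (where containerd runs).
docker run -d --privileged \
  --mount type=bind,source=/var/lib/containerd-stargz-grpc,target=/var/lib/containerd-stargz-grpc,bind-propagation=rshared \
  my-stargz-snapshotter-image   # hypothetical image name
```

Without `bind-propagation=rshared` (or `:rshared` in `-v` syntax), Docker defaults the bind to private propagation and the FUSE mounts stay invisible to the host.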
Hey @ktock, thanks for chiming in. Indeed, I believe the mount for /var should be correct. It's mounted with the following options:

```yaml
- source: /var
  destination: /var
  type: bind
  options:
    - rshared
    - rbind
    - rw
```
Are there any other paths than /dev, /var, /run that need to be mounted up from the host?
Here are the containerd logs surrounding the launch of one of the containers. They don't really help much beyond the same path-related error, and they don't give me the impression that containerd is failing to find the image layers or anything of that nature:
```
192.168.1.111: {"level":"info","msg":"PullImage \"ghcr.io/stargz-containers/alpine:3.15.3-esgz\" returns image reference \"sha256:d087dacb46e24b2791f34f832582114a7309b0c2613c56d83e4e96d6d04b88a7\"","time":"2023-10-11T14:47:59.252936653Z"}
192.168.1.111: {"level":"info","msg":"CreateContainer within sandbox \"c9374e087cf0b85f7560acf58a0f7863032e72404f990551f00073753b0afbab\" for container \u0026ContainerMetadata{Name:ubu-esgz,Attempt:67,}","time":"2023-10-11T14:47:59.254881392Z"}
192.168.1.111: {"level":"info","msg":"CreateContainer within sandbox \"c9374e087cf0b85f7560acf58a0f7863032e72404f990551f00073753b0afbab\" for \u0026ContainerMetadata{Name:ubu-esgz,Attempt:67,} returns container id \"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134\"","time":"2023-10-11T14:47:59.575791738Z"}
192.168.1.111: {"level":"info","msg":"StartContainer for \"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134\"","time":"2023-10-11T14:47:59.576566385Z"}
192.168.1.111: {"id":"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134","level":"info","msg":"shim disconnected","time":"2023-10-11T14:47:59.659983960Z"}
192.168.1.111: {"id":"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134","level":"warning","msg":"cleaning up after shim disconnected","namespace":"k8s.io","time":"2023-10-11T14:47:59.660261183Z"}
192.168.1.111: {"level":"info","msg":"cleaning up dead shim","time":"2023-10-11T14:47:59.660339024Z"}
192.168.1.111: {"level":"warning","msg":"cleanup warnings time=\"2023-10-11T14:47:59Z\" level=info msg=\"starting signal loop\" namespace=k8s.io pid=24244 runtime=io.containerd.runc.v2\ntime=\"2023-10-11T14:47:59Z\" level=warning msg=\"failed to read init pid file\" error=\"open /run/containerd/io.containerd.runtime.v2.task/k8s.io/dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134/init.pid: no such file or directory\" runtime=io.containerd.runc.v2\n","time":"2023-10-11T14:47:59.671494840Z"}
192.168.1.111: {"error":"read /proc/self/fd/131: file already closed","level":"error","msg":"copy shim log","time":"2023-10-11T14:47:59.671844583Z"}
192.168.1.111: {"error":"reading from a closed fifo","level":"error","msg":"Failed to pipe stdout of container \"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134\"","time":"2023-10-11T14:47:59.672125966Z"}
192.168.1.111: {"error":"reading from a closed fifo","level":"error","msg":"Failed to pipe stderr of container \"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134\"","time":"2023-10-11T14:47:59.672412169Z"}
192.168.1.111: {"error":"failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: \"/bin/sh\": stat /bin/sh: no such file or directory: unknown","level":"error","msg":"StartContainer for \"dcd65a7139709ee9a35e7b0fdc6f07e6b4e6491bb916e02b9f8de5e4f0f16134\" failed","time":"2023-10-11T14:47:59.672990064Z"}
192.168.1.111: {"level":"info","msg":"RemoveContainer for \"f24af168123a416305cb037bc04c1d56d142ba7c861827057fef931e035c40b9\"","time":"2023-10-11T14:48:00.567123401Z"}
192.168.1.111: {"level":"info","msg":"RemoveContainer for \"f24af168123a416305cb037bc04c1d56d142ba7c861827057fef931e035c40b9\" returns successfully","time":"2023-10-11T14:48:00.569010259Z"}
```
@rsmitty Thanks for the information.
> It's mounted with the following options:
What runtime is used with these options? containerd?
> Are there any other paths than /dev, /var, /run that need to be mounted up from the host?
They should be enough.
I have some questions about the mounts:
After pulling an estargz image from the registry to the node, does `mount | grep stargz` print some estargz FUSE mounts in both the snapshotter container and on the host? And are the contents under /var/lib/containerd-stargz-grpc/snapshotter/snapshots/*/fs/ visible from the host?
> What runtime is used with these options? containerd?
Yes, containerd.
> I have some questions about the mounts: After pulling an estargz image from the registry to the node, does `mount | grep stargz` print some estargz FUSE mounts in both the snapshotter container and on the host? And are the contents under /var/lib/containerd-stargz-grpc/snapshotter/snapshots/*/fs/ visible from the host?
That is a good question. On the host, I can see that there are contents under the fs directory for the snapshots (there appear to be lots of them). That said, I can't see any mounts when grepping for stargz on the host. Maybe I'm missing something there? Anything special I should have configured for FUSE? I'm building that from source as well, since Talos doesn't have a package manager.
@rsmitty Thanks for the information.
> I can't see any mounts when grepping for stargz on the host.
If FUSE mountpoints (e.g. `mount | grep stargz`) are visible from the snapshotter container but invisible from the host, then they don't seem to be propagated.
What does `cat /proc/self/mountinfo` in the snapshotter container and on the host show about the mount propagation relationship between the container's /var/ mountpoint and the host filesystem? It shows propagation information like `shared`, `master`, etc.
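To make that check concrete: the propagation information lives in the "optional fields" of each /proc/self/mountinfo line, between the sixth field (mount options) and the `-` separator. A small sketch, where the sample line is invented for illustration and modeled on a stargz FUSE mount:

```shell
# Pull the optional propagation fields (e.g. "shared:56" or "master:12")
# out of a mountinfo line: they sit between field 6 (mount options) and
# the "-" separator. The sample line below is illustrative.
line='158 136 0:97 / /var/lib/containerd-stargz-grpc/snapshotter/snapshots/52/fs rw,nodev,relatime shared:56 - fuse.rawBridge stargz rw,allow_other'
propagation=$(echo "$line" | awk '{out = ""; for (i = 7; $i != "-"; i++) out = out (out ? " " : "") $i; print out}')
echo "$propagation"   # -> shared:56
```

A mount entry with no optional fields at all is private: no mount events propagate in or out of it.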
Any updates?
Okay, got some more time to hack on this today. I was able to see that, indeed, it seems to be a problem with mount propagation.
From the host, I can't see any mounts that are related to fuse.
From the stargz container, however, I can see:
```
192.168.1.111: 158 136 0:97 / /var/lib/containerd-stargz-grpc/snapshotter/snapshots/52/fs rw,nodev,relatime shared:56 - fuse.rawBridge stargz rw,user_id=0,group_id=0,allow_other
```
That said, I'm not quite sure what I'm missing here. /var is mounted into the container with rshared, rbind, and rw as mentioned above. My understanding of the mount docs is that this should propagate as expected. Any ideas on where I might be falling over at this point?
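One way to compare the two views directly (a sketch; it requires root on the host, and `<snapshotter-pid>` is a placeholder to fill in with the snapshotter's PID) is to read the mount table from each mount namespace with nsenter:

```shell
# Host view: enter PID 1's mount namespace and look for stargz mounts.
nsenter -t 1 -m -- grep stargz /proc/self/mountinfo
# Container view: enter the snapshotter process's mount namespace.
nsenter -t <snapshotter-pid> -m -- grep stargz /proc/self/mountinfo
```

If the first command prints nothing while the second shows the fuse.rawBridge entries, the mounts are not propagating out of the container.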
@rsmitty Thanks for the information. Both the host and container mountpoints need to be marked as shared in order to propagate mount events to each other.
What's the actual propagation flag added to /var/ in the stargz-snapshotter container (`cat /proc/self/mountinfo | grep /var/` shows this)? And what is the actual propagation flag of your host / (or /var/), which can be inspected from /proc/self/mountinfo on the host? If the host is marked with a non-shared flag, mount events won't be propagated there.
Going to close this for now, as I've been able to prove that switching into the PID 1 mount namespace makes this work as expected. Something on the Talos Linux side with mount propagation seems to be the most likely culprit. Thx for the help @ktock
Hi there!
I'm working on a system extension for stargz-snapshotter to run inside of Talos Linux (an OS specifically for Kubernetes). I've managed to get the snapshotter compiled and running when the system boots, but I seem to be having a problem when it comes to actually launching pods.
Using the pre-built image, a pod launch looks like this:
The image seems to pull and then the pod fails with a `RunContainerError` like:
Running the pod with a defined `command:` pointing directly to `/bin/sh` fails in the same way. I haven't been able to find anything notable in the snapshotter logs or containerd logs, only errors like this in the snapshotter (which I think are just caused by the pod crashlooping):
It should be noted that with Talos, these system extensions run as a container, and I'm mounting /dev, /var, and /run into this container.
My config.toml for the snapshotter is currently empty. The containerd config looks like the following, which is a merge of the config required by the snapshotter docs and the Talos defaults:
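For readers following along, the part the snapshotter docs ask you to merge into containerd's config.toml looks roughly like this (a sketch based on the stargz-snapshotter README, not rsmitty's actual merged config):

```toml
# Register stargz-snapshotter as a proxy plugin over its gRPC socket.
[proxy_plugins]
  [proxy_plugins.stargz]
    type = "snapshot"
    address = "/run/containerd-stargz-grpc/containerd-stargz-grpc.sock"

# Point the CRI plugin at it and keep snapshot annotations enabled,
# which lazy pulling depends on.
[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "stargz"
  disable_snapshot_annotations = false
```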
It feels like I'm missing a simple mount or a configuration somewhere and I'm wondering if anyone may have seen this before and can help push me in the right direction. Thanks!