containerd / runwasi

Facilitates running Wasm / WASI workloads managed by containerd
Apache License 2.0

WASM Pod stuck terminating in Kubernetes #418

Closed: mochizuki875 closed this issue 6 months ago

mochizuki875 commented 10 months ago

What happened?

I attempted to deploy a WASM Pod on Kubernetes, and the WASM Pod is successfully created and running. However, when I delete the WASM Pod, it is not terminated and its status remains Terminating. I don't have a deep understanding of this area and don't know whether it is really an issue in runwasi.

What did you expect to happen?

The WASM Pod is terminated.

How can we reproduce it (as minimally and precisely as possible)?

Environment

The architecture is as follows:

Kubernetes -> Containerd -> runwasi(containerd-shim-wasmedge) -> WasmEdge

Containerd's runtime configuration is as follows:

`/etc/containerd/config.toml`

```toml
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.wasmedge]
    runtime_type = "io.containerd.wasmedge.v1"
...
```

The Kubernetes RuntimeClass for WasmEdge is as follows:

`wasmedge-runtimeclass.yaml`

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmedge
handler: wasmedge
```

The WASM Pod manifest is as follows:

`wasm-pod-manifest.yaml`

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: wasm-example
  name: wasm-example
spec:
  runtimeClassName: wasmedge
  containers:
  - image: michaelirwin244/wasm-example
    name: wasm-example
```

Reproduction steps

Create the WASM Pod; it is successfully created and running.

$ kubectl apply -f wasm-pod-manifest.yaml
pod/wasm-example created

$ kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
wasm-example   1/1     Running   0          42s

$ kubectl logs wasm-example
Server is now running

Delete the WASM Pod; it does not finish terminating.

$ kubectl delete pod wasm-example
pod "wasm-example" deleted
(it is stuck here)

$ kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
wasm-example   1/1     Terminating   0          22m

Containerd's log is as follows:

```bash
# journalctl -xu containerd -f
...
Dec 07 08:48:14 k8s-cluster2-wasm-node01 containerd[87842]: time="2023-12-07T08:48:14.419548422Z" level=info msg="stats: StatsRequest { id: "1ac8dd01004716ad40c51d133777841796a80692484ba8aa4f55605d4f390c23", special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }"
Dec 07 08:48:23 k8s-cluster2-wasm-node01 containerd[87842]: time="2023-12-07T08:48:23.967324299Z" level=info msg="StopContainer for \"1ac8dd01004716ad40c51d133777841796a80692484ba8aa4f55605d4f390c23\" with timeout 30 (s)"
Dec 07 08:48:25 k8s-cluster2-wasm-node01 containerd[87842]: time="2023-12-07T08:48:25.967807443Z" level=error msg="get state for 1ac8dd01004716ad40c51d133777841796a80692484ba8aa4f55605d4f390c23" error="context deadline exceeded: unknown"
Dec 07 08:48:25 k8s-cluster2-wasm-node01 containerd[87842]: time="2023-12-07T08:48:25.967870558Z" level=warning msg="unknown status" status=0
Dec 07 08:48:25 k8s-cluster2-wasm-node01 containerd[87842]: time="2023-12-07T08:48:25.967905520Z" level=info msg="Stop container \"1ac8dd01004716ad40c51d133777841796a80692484ba8aa4f55605d4f390c23\" with signal terminated"
```

Anything else we need to know?

For reference, a second WASM Pod (wasm-example-2) created afterwards is deleted successfully:

$ kubectl apply -f wasm-pod-manifest-2.yaml
pod/wasm-example-2 created

$ kubectl get pod
NAME             READY   STATUS    RESTARTS   AGE
wasm-example     1/1     Running   0          7m40s
wasm-example-2   1/1     Running   0          42s

$ kubectl delete pod wasm-example-2
pod "wasm-example-2" deleted
(Successfully terminated)

utam0k commented 10 months ago

Maybe it's stuck in the libcontainer part...?

endocrimes commented 9 months ago

I'm also seeing this with containerd-shim-spin on ubuntu-22.04:

containerd logs (looping between StopPodSandbox/StopContainer for two wasi pods):

```
containerd[828525]: time="2024-01-09T11:26:08.785253398Z" level=info msg="StopPodSandbox for \"a17b5db250a03dc3ee9e26ac62d023d3bd0b0c34eef13dd2bb49aa32138515b2\""
containerd[828525]: time="2024-01-09T11:26:37.343731477Z" level=info msg="StopContainer for \"6e4bb8993e2209ef64876dcd29be3ba6de0441478b37bf8b8d5cc67edb300238\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:28:09.615996212Z" level=info msg="StopContainer for \"45d741559a1c0fc3ab6407b834f2a2ddb4b5e90fd22073319d7129496a60876a\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:29:07.343440954Z" level=info msg="StopPodSandbox for \"c0597679a0504dd09178fd50e10c8092b9bffb543fe5f4822e6357a8da3752c6\""
containerd[828525]: time="2024-01-09T11:30:39.617341522Z" level=info msg="StopPodSandbox for \"a17b5db250a03dc3ee9e26ac62d023d3bd0b0c34eef13dd2bb49aa32138515b2\""
containerd[828525]: time="2024-01-09T11:31:08.163593954Z" level=info msg="StopContainer for \"6e4bb8993e2209ef64876dcd29be3ba6de0441478b37bf8b8d5cc67edb300238\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:32:40.447958480Z" level=info msg="StopContainer for \"45d741559a1c0fc3ab6407b834f2a2ddb4b5e90fd22073319d7129496a60876a\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:33:38.165245928Z" level=info msg="StopPodSandbox for \"c0597679a0504dd09178fd50e10c8092b9bffb543fe5f4822e6357a8da3752c6\""
containerd[828525]: time="2024-01-09T11:35:10.448555046Z" level=info msg="StopPodSandbox for \"a17b5db250a03dc3ee9e26ac62d023d3bd0b0c34eef13dd2bb49aa32138515b2\""
containerd[828525]: time="2024-01-09T11:35:38.998224648Z" level=info msg="StopContainer for \"6e4bb8993e2209ef64876dcd29be3ba6de0441478b37bf8b8d5cc67edb300238\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:37:11.249254293Z" level=info msg="StopContainer for \"45d741559a1c0fc3ab6407b834f2a2ddb4b5e90fd22073319d7129496a60876a\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:38:08.999180404Z" level=info msg="StopPodSandbox for \"c0597679a0504dd09178fd50e10c8092b9bffb543fe5f4822e6357a8da3752c6\""
containerd[828525]: time="2024-01-09T11:39:41.249684737Z" level=info msg="StopPodSandbox for \"a17b5db250a03dc3ee9e26ac62d023d3bd0b0c34eef13dd2bb49aa32138515b2\""
containerd[828525]: time="2024-01-09T11:40:09.737035165Z" level=info msg="StopContainer for \"6e4bb8993e2209ef64876dcd29be3ba6de0441478b37bf8b8d5cc67edb300238\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:41:42.008729782Z" level=info msg="StopContainer for \"45d741559a1c0fc3ab6407b834f2a2ddb4b5e90fd22073319d7129496a60876a\" with timeout 30 (s)"
containerd[828525]: time="2024-01-09T11:42:39.737514721Z" level=info msg="StopPodSandbox for \"c0597679a0504dd09178fd50e10c8092b9bffb543fe5f4822e6357a8da3752c6\""
containerd[828525]: time="2024-01-09T11:44:12.009757691Z" level=info msg="StopPodSandbox for \"a17b5db250a03dc3ee9e26ac62d023d3bd0b0c34eef13dd2bb49aa32138515b2\""
containerd[828525]: time="2024-01-09T11:44:40.461942787Z" level=info msg="StopContainer for \"6e4bb8993e2209ef64876dcd29be3ba6de0441478b37bf8b8d5cc67edb300238\" with timeout 30 (s)"
```

Given that the kubelet side of these logs is filled with timeouts, being stuck somewhere in libcontainer would not be too surprising. I'm not really a Rust person, but I'll try to get a runwasi dev environment set up later to at least get a little more insight into what could be failing here.

devigned commented 9 months ago

I think this might be a bug that has already been resolved in newer versions of the shims. This issue leads me to believe that a previous version of the shared shim implementation had a problem terminating, and that it is resolved when using a newer version of the shim (via the kwasm node installer): https://github.com/ZEISS/enterprise-wasm/issues/4#issuecomment-1882558627.

However, we should add integration tests that scale up and down and verify the pod is properly terminated.
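A rough sketch of what such a check could look like, driving kubectl from a Go test. This is not runwasi's actual test suite; the manifest path and pod name are assumptions borrowed from this issue.

```go
// wasm_terminate_test.go -- illustrative only; assumes a cluster with the wasm
// RuntimeClass configured and the wasm-pod-manifest.yaml from this issue on disk.
package wasm_test

import (
	"os/exec"
	"testing"
	"time"
)

func kubectl(t *testing.T, args ...string) (string, error) {
	t.Helper()
	out, err := exec.Command("kubectl", args...).CombinedOutput()
	return string(out), err
}

func TestWasmPodTerminates(t *testing.T) {
	if out, err := kubectl(t, "apply", "-f", "wasm-pod-manifest.yaml"); err != nil {
		t.Fatalf("apply failed: %v\n%s", err, out)
	}
	if out, err := kubectl(t, "wait", "--for=condition=Ready", "pod/wasm-example", "--timeout=120s"); err != nil {
		t.Fatalf("pod never became ready: %v\n%s", err, out)
	}
	if out, err := kubectl(t, "delete", "pod", "wasm-example", "--timeout=120s"); err != nil {
		t.Fatalf("delete did not complete: %v\n%s", err, out)
	}

	// Double-check the pod is really gone rather than stuck Terminating.
	deadline := time.Now().Add(30 * time.Second)
	for {
		if _, err := kubectl(t, "get", "pod", "wasm-example"); err != nil {
			return // kubectl exits non-zero once the pod is NotFound
		}
		if time.Now().After(deadline) {
			t.Fatal("pod still exists after deletion")
		}
		time.Sleep(2 * time.Second)
	}
}
```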

utam0k commented 9 months ago

@mochizuki875 May I ask you to give it a try with the latest version?

endocrimes commented 9 months ago

@devigned I saw that and was surprised (because I was using the latest kwasm-operator release, 0.2.3, which pulls in the latest node-installer image). So I think there might be something weird going on 😅

mochizuki875 commented 9 months ago

@utam0k Sure! However, I accidentally destroyed my previous environment... so please give me a little time :)

mochizuki875 commented 9 months ago

BTW is the latest version v0.4.0?

Mossaka commented 9 months ago

The latest version of the containerd-shim-wasm crate is v0.4.0, so yes.

I would recommend building the WasmEdge shim yourself.

cpuguy83 commented 8 months ago

Is there some case where we aren't sending a container exit event and/or not releasing the wait endpoint? IIRC, with CRI, containerd uses the Wait API call to determine when a container has exited.

cpuguy83 commented 8 months ago

To expand on that:

The CRI module in containerd listens on the shim's Wait endpoint for every container, and I expect CRI is waiting for that call to return. We probably need to look into the interactions with that endpoint; maybe there's a deadlock, or for whatever reason the endpoint isn't releasing.
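To make that concrete, here is a minimal sketch of a client blocking on a shim's Wait endpoint, using the same Go ttrpc task/v2 API as the debug client later in this thread. The socket path and container ID are placeholder arguments; if the shim never answers this call, the caller simply hangs, which would mirror the behavior described above.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"

	"github.com/containerd/containerd/api/runtime/task/v2"
	ttrpc "github.com/containerd/ttrpc"
)

func main() {
	// Usage: wait-client <shim-socket> <container-id> (hypothetical helper, not part of runwasi)
	socket, id := os.Args[1], os.Args[2]

	conn, err := net.Dial("unix", socket)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}

	client := task.NewTaskClient(ttrpc.NewClient(conn))

	// Blocks until the shim reports that the task has exited.
	// If the shim never responds, this call hangs.
	resp, err := client.Wait(context.Background(), &task.WaitRequest{ID: id})
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	fmt.Printf("exit status: %d at %v\n", resp.ExitStatus, resp.ExitedAt)
}
```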

rajatjindal commented 6 months ago

I am trying to debug this and found one data point which may be useful. In the output below, notice that the PID is 0 and the status is UNKNOWN.

(some lines removed from the output for readability)

time k3s ctr -n k8s.io --address /run/k3s/containerd/containerd.sock tasks list
TASK                                                                PID     STATUS
535c2f86d457f73cb2b6184074e9bb008555516f710d81fe7beb1d2572cf7d91    0       UNKNOWN
bebe3c8ccae5e2140b36e94c5b2f61d983ed55279359b777e49fd44663d8f826    0       UNKNOWN

74d8e1e672d0795639c21bffb8a40b886e80e2cb93980f90258f2d249c20c735    8170    RUNNING
a30c663ccc13c3a0ca4e68519140886c3d028a4ed0f1d3791b536f319fe21379    7879    RUNNING
.
.
.

real    0m20.114s
user    0m0.118s
sys 0m0.031s

Upon further checking, one of them is the pause container and one is the wasm workload that I am trying to run.

environment:

A k3s-based single-node cluster on Ubuntu 22.04.

rajatjindal commented 6 months ago

Hello folks, thank you for the suggestions and pointers during the runwasi community meeting. As discussed there, the following are the steps I used to reproduce the issue:

pre-requisite:

ubuntu 22.04

start k3s cluster

curl -sfL https://get.k3s.io | sh -
[INFO]  Finding release for channel stable
[INFO]  Using v1.28.7+k3s1 as release
[INFO]  Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.28.7+k3s1/sha256sum-amd64.txt
[INFO]  Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.28.7+k3s1/k3s
[INFO]  Verifying binary download
[INFO]  Installing k3s to /usr/local/bin/k3s
[INFO]  Skipping installation of SELinux RPM
[INFO]  Skipping /usr/local/bin/kubectl symlink to k3s, already exists
[INFO]  Creating /usr/local/bin/crictl symlink to k3s
[INFO]  Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr
[INFO]  Creating killall script /usr/local/bin/k3s-killall.sh
[INFO]  Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO]  env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO]  systemd: Creating service file /etc/systemd/system/k3s.service
[INFO]  systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO]  systemd: Starting k3s

download the latest version of containerd-shim-spin

mkdir -p /opt/kwasm/bin
wget https://github.com/spinkube/containerd-shim-spin/releases/download/v0.12.0-rc2/containerd-shim-spin-v2-linux-x86_64.tar.gz -O /tmp/containerd-shim-spin-v2-linux-x86_64.tar.gz
tar -xvf /tmp/containerd-shim-spin-v2-linux-x86_64.tar.gz -C /opt/kwasm/bin/

configure containerd to allow running spin wasm workloads

cp /var/lib/rancher/k3s/agent/etc/containerd/config.toml /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl

add the following two lines to /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.spin]
    runtime_type = "/opt/kwasm/bin/containerd-shim-spin-v2"

restart k3s

systemctl restart k3s

Run ctr task list, observe that it returns quickly

ctr -n k8s.io --address /run/k3s/containerd/containerd.sock task list
TASK                                                                PID       STATUS
fef79f242766fea997ba62f1d3627d2069461eb81867c4de4835fc46aa5f1e1b    290089    RUNNING
191e8c838f4631a6252cdf7cec000fa853aec21d26b26dc754602e592094f5f8    290079    RUNNING
37274ca83477c70b122e545720835e90200a9d8fae57eba62d479a6b5d169ae8    291371    RUNNING
98a6394f43205f4aaf2a8c4c8e0613402518c3f119ea14527db82b349eab67f2    290256    RUNNING
a5ad7646e9ed629602dee9ff753d67aaf24bf45f4ff73620f36b5ae5295c4c7d    291421    RUNNING
a8ed998e6a27c339f2c8ff03fd689c5dae7c8e0e27954e86a47bb47c7ceb287b    291494    RUNNING
27cdf74b6c01c75f6d5f873d974cf7d0dc395016bd7742c73f307dfdbaf41539    290103    RUNNING
346759312b325c7fbf92e3dc1c0e2988ae432d0b4f6487b7d06d9f968f243361    291085    RUNNING
7043071b044490e218401ee1a114373990e5e0271c015459451881d55e460829    290324    RUNNING
d1528dab8d3bc2866682fe1e7a7125c2500b9e59fc0c5eef54e4962f0711a925    291171    RUNNING
d8d80381abe3f2ba67bfc93925b5834e4fd51c2e821fcc6f093142bde637191f    290369    RUNNING

create a spin workload deployment

## apply spin runtime class

kubectl apply -f - <<EOF
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmtime-spin-v2
handler: spin
EOF

## start spin workload

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spin-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spin-test
  template:
    metadata:
      labels:
        app: spin-test
    spec:
      runtimeClassName: wasmtime-spin-v2
      containers:
      - name: spin-test
        image: ghcr.io/deislabs/containerd-wasm-shims/examples/spin-rust-hello:v0.10.0
        command: ["/"]
        ports:
        - containerPort: 80
EOF

## check pods are up and running
kubectl get pods

spin-test-7bdcb9d5b8-6fwvh   1/1     Running   0          3s

observe that ctr task list now takes a long time and returns the wasm workload and pause containers with PID 0 and status UNKNOWN

ctr -n k8s.io --address /run/k3s/containerd/containerd.sock task list
TASK                                                                PID       STATUS
cdff7d32cf3955f57802e88aa1c813fa087d4bffddcfbf7ae2e1e318614d848a    285693    RUNNING
ea074ad60fd299b82f4710d6e7e905bacc4a9dc9c5966ca5c1729e48ce03952b    285555    RUNNING
919bacf90510ee008950e1ba033a4686f42e4b0af9f7f1c39a7bb9369fdcf074    0         UNKNOWN
ad58b36df016468f1323a26349235f49238918e403b1e6ab7ad6e40ad0fd2387    0         UNKNOWN
1afc136e7b1b1b89fbac78c14cc0acb84330851d70d18efd387d2120e0f1edaa    285267    RUNNING
da9a313e0c3269a81eb5df7c7a5d790c1943464fa5e0d82973e4cd4163e5e691    284559    RUNNING
7fd9b709ae8d80fdee4c4e854a6d544e2e92d640999534f25a5c57256d41d7da    285609    RUNNING
55c68a2f8542551dd4a8dbfebbaa7e99e0fb47d85a8391f2a55fce1049f6b830    284442    RUNNING
59893be712262508c837f65b2abed0d42c6917ffa958d291aa1498e623545e99    284301    RUNNING
803609b93627efdc67507d49092b07e2a6e3f8ca096b849ed039aa9dab324f00    285353    RUNNING
08f011bd6713e9d500c5f5650340fbb972900cace62aa58adb6a458860667379    284287    RUNNING
9d2edc70b30b1a23425996bd4f1917be948a2faa414364a1bd7858a1e63638b0    284255    RUNNING
459aa745597f634cf9ad19387b62bcb3d6b3c47843bab1f971e2adbeb7a8575f    284515    RUNNING

try deleting the pod, observe it remains stuck

kubectl delete pod spin-test-7bdcb9d5b8-c8qqf
pod "spin-test-7bdcb9d5b8-c8qqf" deleted

attachment: containerd.log

I tried to run through these steps after cleaning up my dev machine. If any of the steps above does not work as expected, please let me know and I can try to check why that is.

thanks again

kate-goldenring commented 6 months ago

@rajatjindal pointed out that this may be related to this issue https://github.com/containerd/ttrpc/issues/72

rajatjindal commented 6 months ago

I wrote a minimal ttrpc client to debug this:

`ttrpc-client/main.go`

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"

	"github.com/containerd/containerd/api/runtime/task/v2"
	ttrpc "github.com/containerd/ttrpc"
)

func main() {
	socket := os.Args[1]
	id := os.Args[2]

	conn, err := net.Dial("unix", socket)
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}

	client := ttrpc.NewClient(conn)
	taskclient := task.NewTaskClient(client)

	resp, err := taskclient.State(context.TODO(), &task.StateRequest{
		ID: id,
	})
	// taskclient.Stats()
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}

	fmt.Println(resp.Bundle)
}
```

When I run the above client against a working container, it prints the bundle dir as expected:

go run main.go /run/containerd/s/ff1283bad53b7c324482797129a32896822a7693a8bd21c8b375b7e22914e543 27cdf74b6c01c75f6d5f873d974cf7d0dc395016bd7742c73f307dfdbaf41539
/run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/27cdf74b6c01c75f6d5f873d974cf7d0dc395016bd7742c73f307dfdbaf41539

but with the wasm workload shim it hangs:

go run main.go /run/containerd/23d3abc7d95c0d4f.sock 1597ad9a2ca6fabc19c859a482e64fed9a5fea722491011589469b0d29cda4d3
<hangs here>

Further, I tried to check whether the socket is listening and which process owns it.

with a working container (non wasm workload):

(notice the output has a line with LISTENING and that there is a /pause process)

## check which process is holding ttrpc socket file

netstat --all --program | grep ff1283bad53b
unix  3      [ ]         STREAM     CONNECTED     2907755  290017/containerd-s  /run/containerd/s/ff1283bad53b7c324482797129a32896822a7693a8bd21c8b375b7e22914e543
unix  2      [ ACC ]     STREAM     LISTENING     2766418  290017/containerd-s  /run/containerd/s/ff1283bad53b7c324482797129a32896822a7693a8bd21c8b375b7e22914e543
unix  3      [ ]         STREAM     CONNECTED     2906647  290017/containerd-s  /run/containerd/s/ff1283bad53b7c324482797129a32896822a7693a8bd21c8b375b7e22914e543

## check the process details
root      290017       1  0 Mar12 ?        00:00:03 /var/lib/rancher/k3s/data/a3b46c0299091b71bfcc617b1e1fec1845c13bdd848584ceb39d2e700e702a4b/bin/containerd-shim-runc-v2 -namespace k8s.io -id 27cdf74b6c01c75f6d5f873d974cf7d0dc395016bd7742c73f307dfdbaf41539 -address /run/k3s/containerd/containerd.sock
65535     290103  290017  0 Mar12 ?        00:00:00 /pause
ubuntu    290369  290017  0 Mar12 ?        00:01:47 /metrics-server --cert-dir=/tmp --secure-port=10250 --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname --kubelet-use-node-status-port --metric-resolution=15s --tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305

while for the wasm workload shim the output looks as follows:

(notice some CONNECTING entries in the output but no LISTENING, and also no /pause process)

netstat --all --program | grep 23d
unix  3      [ ]         STREAM     CONNECTED     3143954  312194/containerd-s  /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock
unix  3      [ ]         STREAM     CONNECTED     3143212  312194/containerd-s  /run/containerd/23d3abc7d95c0d4f.sock
unix  2      [ ]         STREAM     CONNECTING    0        -                    /run/containerd/23d3abc7d95c0d4f.sock

## checking the process details
root      312194       1  0 00:23 ?        00:00:17 /opt/kwasm/bin/containerd-shim-spin-v2 -namespace k8s.io -id 1597ad9a2ca6fabc19c859a482e64fed9a5fea722491011589469b0d29cda4d3 -address /run/k3s/containerd/containerd.sock
ubuntu    318893  318225  0 02:36 pts/6    00:00:00 grep --color=auto 312194

update: maybe not having a /pause process is expected because https://github.com/containerd/runwasi/blob/a8da71307e546f0cfe417c5982a286e03a57d120/crates/containerd-shim-wasm/src/sandbox/shim/local.rs#L151

rajatjindal commented 6 months ago

Going further in this debugging direction:

ss | grep bdbf0268251c7b33

u_str ESTAB  154    0                                                   /run/containerd/bdbf0268251c7b33.sock 9080036                          * 0
u_str ESTAB  1190   0                                                   /run/containerd/bdbf0268251c7b33.sock 9080058                          * 0

The numbers 154 and 1190 fall under the Recv-Q heading in the output above, i.e. there are bytes queued on the socket that are not being read.

rajatjindal commented 6 months ago

just an update on debugging this:

  1. My current working theory is that the ttrpc socket the shim starts stops listening for some reason.
  2. To debug it, I wrote a ttrpc-proxy so I can dynamically log all the requests being sent to this socket (it watches for new sockets started by the shim and starts a proxy for them automatically); a rough sketch of such a proxy is shown below.
  3. So far I am NOT able to consistently make this proxy work for all pod starts (due to a race between starting the proxy and the shim starting up; I am still working on making it work for every pod start), BUT
  4. whenever this proxy is started in time for a given pod, that pod is deleted successfully when requested.

These steps, in addition to the comments I added to the issue earlier, make me feel that I am still making progress in the right direction, and I will continue to debug along these lines for now.

If you have any insights as to why we might be seeing this ttrpc behavior (or if you think this is not the right direction to debug this issue), those would be very helpful.
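For reference, a minimal sketch of the kind of socket-level proxy described in point 2. This is not the actual tool used here and has no automatic discovery of new shim sockets; it just listens on a stand-in socket path and blindly forwards bytes to the real shim socket, logging how much traffic flows in each direction.

```go
// ttrpc-proxy sketch: listen on a new unix socket and forward everything to the
// real shim socket, logging how many bytes flow each way.
package main

import (
	"io"
	"log"
	"net"
	"os"
)

func main() {
	// Usage: ttrpc-proxy <proxy-socket> <shim-socket> (hypothetical paths)
	proxySock, shimSock := os.Args[1], os.Args[2]

	l, err := net.Listen("unix", proxySock)
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := l.Accept()
		if err != nil {
			log.Fatal(err)
		}
		go func(client net.Conn) {
			defer client.Close()
			shim, err := net.Dial("unix", shimSock)
			if err != nil {
				log.Printf("dial shim: %v", err)
				return
			}
			defer shim.Close()
			go func() {
				n, _ := io.Copy(shim, client) // requests: containerd -> shim
				log.Printf("client->shim closed after %d bytes", n)
			}()
			n, _ := io.Copy(client, shim) // responses: shim -> containerd
			log.Printf("shim->client closed after %d bytes", n)
		}(client)
	}
}
```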

Mossaka commented 6 months ago

Thanks for digging into this @rajatjindal, and I believe you are headed in the right direction. I could reproduce this issue with my local k3s cluster on Ubuntu. Will do more investigation on it.

Mossaka commented 6 months ago

I can verify that this happens on both the latest and older versions of k3s (e.g. v1.27.1+k3s1).

Mossaka commented 6 months ago

Interestingly, I could not reproduce this issue with the wasmtime shim.

Applying wasmtime workloads

k get po     
NAME                         READY   STATUS    RESTARTS   AGE
wasi-demo-6c5dcf6ddb-v75l8   2/2     Running   0          28s
wasi-demo-6c5dcf6ddb-qn8k7   2/2     Running   0          28s
wasi-demo-6c5dcf6ddb-mbnb5   2/2     Running   0          28s
sudo ctr -n k8s.io --address /run/k3s/containerd/containerd.sock task list
TASK                                                                PID        STATUS    
153872d5e15fa359d51cfd989fd0b4eda61f57865aa661c1c97d692fb33fa36f    1166646    RUNNING
211afdce974e6be87ea9ef44627805931c1a018868e6822133400474e0a615fc    1167640    RUNNING
2ef159fb84388f3640e1fa0267cbc46d0917ffb9111877b3c0d7e9422aea088d    1167449    RUNNING
3e07e808382fdb65b205f87ad6e29874fd58012809784f2e6991fca10800bdee    1166386    RUNNING
bc9880cf00931317d3001805a65f56f0cdca1b8ee7c4a92c04b903c0dbacf46f    1166290    RUNNING
6215eb541460ed8e5f42007970a1ca7aac63e6aef6576371cadd543e9792d562    1170221    RUNNING
6171486f830bf174ea29dc233ad2584bb4293296e263d131aebcb17e67cbf923    1167881    RUNNING
5ca536c77fc9dd77255312352687e99762e7603b569869b9683b7f70ee40c3d9    1170077    RUNNING
dcfb4f3005ebcf652770485d277687355ab3980c83c255b1a78442ee762bf13e    1170078    RUNNING
0ea09d624e471804bee930879f6f207b85ffd1fd6c8d5ca2eda47df3991aa016    1170220    RUNNING
e8c13301e036bd9185ddb9a2c507b8e4f60d318b36f3a7e373cfecf6a7b6223e    1170222    RUNNING
0dfbcf4bc8d76581aecede1a1fcb27c7e48ec445ca0aac486c0ba282553ffd81    1166301    RUNNING
18651052148a7e1e0ee358a974fd4a13e0cdf9bd2f2f2abbdcb377062a9716b6    1167563    RUNNING
90c5b1dc67a725d58f51c130878c1192f3a3d8ff6e3396dc1ada9462e43f67f8    1167693    RUNNING
e2eb4ed9372ea8fa933f8251dc5bc8d14eae87bee056d031c184673f61710b71    1166597    RUNNING
0fe7d7697e8d741d54b6166c4251e3a52b0e47d70f9c256fbb6e034ad2adcbf6    1169927    RUNNING
c0b943ee316c38a4cfd846eb2c4300d095ed48148c1fa724aabad43723b2f5b6    1170082    RUNNING
1701687094cf866f0de01a6b50b4d5bc53822265d315866e27a7c28e30afccbd    1166533    RUNNING
ee01e6a7c189473726781ae8703b17e94a3790eeac157441901b13265d6a7699    1169982    RUNNING
fd3f5fe4d9354f283db253ef5db20cedf17b8a0dd7b706c78bd9234e52135ee2    1169995    RUNNING

There were no tasks with PID 0 and status UNKNOWN.

Deleting the workloads

k get po                                                                  
NAME                         READY   STATUS        RESTARTS   AGE
wasi-demo-6c5dcf6ddb-mbnb5   2/2     Terminating   0          98s
wasi-demo-6c5dcf6ddb-v75l8   2/2     Terminating   0          98s
wasi-demo-6c5dcf6ddb-qn8k7   2/2     Terminating   0          98s

k get po
No resources found in default namespace.

But it seems like the shim processes are still running:

ps aux | grep wasmtime
root     1160661  0.0  0.1 2102440 41764 ?       Sl   00:37   0:00 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id cd07d8cb1969dc97a86e5581bfc924cfff3d2f56a52e3f4814d234fa07fb2ce5 -address /run/k3s/containerd/containerd.sock
root     1160689  0.0  0.1 2231484 44272 ?       Sl   00:37   0:00 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id 3265d71c0434886d907c7a77beeb1a453aac0fdcc8bb7eeb474c4fb289cadc83 -address /run/k3s/containerd/containerd.sock
root     1160706  0.0  0.1 2366660 44668 ?       Sl   00:37   0:00 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id 664ef80c141f5c14133acfb6b6c5ad1b1e1f6847dfabcdefec9b04bd1952e16d -address /run/k3s/containerd/containerd.sock
root     1160750  0.7  0.1 10218360 58836 ?      Ssl  00:37   0:03 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id cd07d8cb1969dc97a86e5581bfc924cfff3d2f56a52e3f4814d234fa07fb2ce5 -address /run/k3s/containerd/containerd.sock
root     1160797  0.7  0.1 10555800 61288 ?      Ssl  00:37   0:03 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id 664ef80c141f5c14133acfb6b6c5ad1b1e1f6847dfabcdefec9b04bd1952e16d -address /run/k3s/containerd/containerd.sock
root     1160800  0.7  0.1 10420944 63736 ?      Ssl  00:37   0:03 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id 3265d71c0434886d907c7a77beeb1a453aac0fdcc8bb7eeb474c4fb289cadc83 -address /run/k3s/containerd/containerd.sock
root     1175533  0.6  0.1 2639124 39868 ?       Sl   00:45   0:00 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id d26d201fa15247aa51fc09c82ca574d808147f28aa1d130325378e6707f92e26 -address /run/k3s/containerd/containerd.sock
root     1175562  0.6  0.1 2100448 39808 ?       Sl   00:45   0:00 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id f6c4a291a9cde6862937b6cce78c8c116b835421efa3ef32ac4fcf09413b1405 -address /run/k3s/containerd/containerd.sock
root     1175582  0.8  0.1 2309392 39532 ?       Sl   00:45   0:00 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id 2f0107d61c8ea7fa5c7adede24254dab5b6b4062c084ded4ce63d441c7862ea2 -address /run/k3s/containerd/containerd.sock
root     1175655 74.0  0.1 10353000 55932 ?      Ssl  00:45   0:03 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id f6c4a291a9cde6862937b6cce78c8c116b835421efa3ef32ac4fcf09413b1405 -address /run/k3s/containerd/containerd.sock
root     1175657 74.6  0.1 10353000 58068 ?      Ssl  00:45   0:03 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id d26d201fa15247aa51fc09c82ca574d808147f28aa1d130325378e6707f92e26 -address /run/k3s/containerd/containerd.sock
root     1175664 74.6  0.1 10353000 57628 ?      Ssl  00:45   0:03 /usr/local/bin/containerd-shim-wasmtime-v1 -namespace k8s.io -id 2f0107d61c8ea7fa5c7adede24254dab5b6b4062c084ded4ce63d441c7862ea2 -address /run/k3s/containerd/containerd.sock

I've attached containerd.log here in case it's useful. containerd.log

jprendes commented 6 months ago

Is the problem exclusive to the pause container? If so, we could try removing the special handling of the pause container and let youki handle it like any other native container.

jprendes commented 6 months ago

> @rajatjindal pointed out that this may be related to this issue https://github.com/containerd/ttrpc/issues/72

Is the problem in the Go codebase in containerd, or in the shim codebase with ttrpc-rust?

rajatjindal commented 6 months ago

Hi @jprendes,

I am still quite new to the containerd/shim workflow and codebase and am trying out different things, so I cannot say for sure whether the issue is on the Go side of containerd, in ttrpc-rust, or somewhere else.

I have been debugging this more, and it seems the ttrpc stats API gets stuck (or crashes) when getting cgroup data here: https://github.com/containerd/rust-extensions/blob/798ab74d2019880533c4f80399ba52378a7da1c2/crates/shim/src/cgroup.rs#L186

One more thing I noticed is the difference in the format of the cgroup path (see the two examples and the toy sketch below). I am not entirely sure whether this is expected, but it seems that for the path seen for the spin container, the code above will crash.

## regular container
-> cat /proc/1377322/cgroup

0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod527450a4_806f_4e28_846d_aa33ddfe8d19.slice/cri-containerd-5bc56bb5961c8df6e6be6c12900deb1269af5d6689e59c94d3c09e537432139d.scope

## spin container
-> cat /proc/1378687/cgroup

0::/kubepods-besteffort-podeed5d41d_dba7_48f8_a134_4cdd83154ca0.slice:cri-containerd:73e6bee7c0ebcb39f9a2adf3c58277017c8e079bf7772dce23320fc0495a648e
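To make the difference concrete, here is a toy Go sketch, purely illustrative and not the rust-extensions code, showing why the second format cannot be used directly as a filesystem path: the systemd-style "slice:prefix:name" triple has to be expanded into nested .slice directories before anything exists under /sys/fs/cgroup. The expansion rule here is an assumption based on how systemd nests slices.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// expandCgroupPath: if the string is a plain cgroupfs path (first format above),
// return it unchanged; if it is a "slice:prefix:name" triple (second format),
// expand it into the nested .slice directory layout systemd would use.
func expandCgroupPath(p string) string {
	parts := strings.Split(p, ":")
	if len(parts) != 3 {
		// Already a literal path, e.g. ".../cri-containerd-<id>.scope".
		return p
	}
	slice := strings.TrimPrefix(parts[0], "/")
	prefix, name := parts[1], parts[2]

	// A systemd slice "a-b-c.slice" lives at "a.slice/a-b.slice/a-b-c.slice".
	var dirs []string
	accum := ""
	for _, seg := range strings.Split(strings.TrimSuffix(slice, ".slice"), "-") {
		if accum == "" {
			accum = seg
		} else {
			accum += "-" + seg
		}
		dirs = append(dirs, accum+".slice")
	}
	return "/" + filepath.Join(append(dirs, prefix+"-"+name+".scope")...)
}

func main() {
	// Prints the nested .slice path ending in cri-containerd-<id>.scope.
	fmt.Println(expandCgroupPath("/kubepods-besteffort-podeed5d41d_dba7_48f8_a134_4cdd83154ca0.slice:cri-containerd:73e6bee7c0ebcb39f9a2adf3c58277017c8e079bf7772dce23320fc0495a648e"))
}
```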

Update:

I can confirm that once I comment out the code that calls cgroup_metrics here https://github.com/containerd/runwasi/blob/a8da71307e546f0cfe417c5982a286e03a57d120/crates/containerd-shim-wasm/src/sandbox/shim/local.rs#L323, I am able to delete pods successfully, and the shim process also exits as expected.

jsturtevant commented 6 months ago

closed with #254