divya-mohan0209 closed this issue 5 days ago
Thanks for opening this, @divya-mohan0209! This should help triage where the issues should be (my assumption is most issues would come from https://github.com/spinkube/containerd-shim-spin).
Keeping this open until we create issues for each of these.
Thanks!
Okie dokie, thank you @radu-matei! I shall keep this in mind next time I open issues :)
Hey, @divya-mohan0209 -- just tried all the applications you referenced on a cluster with the latest release of SpinKube and the latest release of the shim, and couldn't reproduce it with any of the applications.
The most likely cause here I think would be running an old version of the shim -- which might come pre-baked into Rancher Desktop.
Could you please run kubectl annotate node --all kwasm.sh/kwasm-node=true one more time to force KWasm to update?
It is listed as one of the steps, but how can I check whether the version has been updated? And what version is it expected to update to?
Also, the thing is, everything works when you first run the applications. But when you reset Rancher Desktop and retry the steps all over again, it doesn't work.
Yeah, I think it has something to do with the shim version used. @rajatjindal has a one-liner to verify the version.
In the meantime, tagging @tpmccallum who wrote the instructions for Rancher Desktop, if we need to update them.
Also, I retried it just now by rerunning the script. No luck.
I created the Rancher Desktop cluster and noticed the following (it is indeed an old version of the shim).
@divya-mohan0209, could you please verify this on your cluster as well?
kubectl debug -it node/lima-rancher-desktop --image ubuntu:latest -n default -- /host/usr/local/containerd-shims/containerd-shim-spin-v2 -v
containerd-shim-spin-v2:
Runtime: spin
Version: 0.11.1
Revision: 7058f601f3e92ee
Just supplementing the error logs here, as well.
time="2024-04-30T12:59:26.854405727Z" level=info msg="CreateContainer within sandbox \"2939d8afcf2be62b2962cecfa7a0572f02f0da852f701bf1f9cf9260919e80a0\" for container &ContainerMetadata{Name:my-first-app,Attempt:0,}"
time="2024-04-30T12:59:26.856353602Z" level=info msg="CreateContainer within sandbox \"7b25fe8b3ff2a75d89bd1654396426f91f599aa00d7cc626089b28ffc8226dd3\" for &ContainerMetadata{Name:my-first-app,Attempt:0,} returns container id \"ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820\""
time="2024-04-30T12:59:26.856981436Z" level=info msg="StartContainer for \"ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820\""
time="2024-04-30T12:59:26.866778727Z" level=info msg="found manifest with WASM OCI image format."
time="2024-04-30T12:59:26.871746394Z" level=info msg="CreateContainer within sandbox \"2939d8afcf2be62b2962cecfa7a0572f02f0da852f701bf1f9cf9260919e80a0\" for &ContainerMetadata{Name:my-first-app,Attempt:0,} returns container id \"899a0e4ca27f947e6b02c5d431b7cd4b5fcb9bfaa7369c0978fe4c8279c33b45\""
time="2024-04-30T12:59:26.872443811Z" level=info msg="StartContainer for \"899a0e4ca27f947e6b02c5d431b7cd4b5fcb9bfaa7369c0978fe4c8279c33b45\""
time="2024-04-30T12:59:26.879040061Z" level=info msg="found manifest with WASM OCI image format."
time="2024-04-30T12:59:26.989514602Z" level=info msg="cgroup manager V2 will be used"
time="2024-04-30T12:59:26.997927102Z" level=info msg="cgroup manager V2 will be used"
time="2024-04-30T12:59:27.042846269Z" level=info msg="close_range; preserve_fds=0"
time="2024-04-30T12:59:27.043204978Z" level=warn msg="intermediate process already reaped"
time="2024-04-30T12:59:27.044068936Z" level=info msg="close_range; preserve_fds=0"
time="2024-04-30T12:59:27.044213894Z" level=warn msg="intermediate process already reaped"
time="2024-04-30T12:59:27.045217644Z" level=info msg="starting instance: ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820"
time="2024-04-30T12:59:27.045370853Z" level=info msg="calling start function"
time="2024-04-30T12:59:27.045397978Z" level=info msg="setting up wasi"
time="2024-04-30T12:59:27.046566228Z" level=info msg="starting instance: 899a0e4ca27f947e6b02c5d431b7cd4b5fcb9bfaa7369c0978fe4c8279c33b45"
time="2024-04-30T12:59:27.046550103Z" level=info msg=" >>> configuring spin oci application 111"
time="2024-04-30T12:59:27.046705936Z" level=info msg="calling start function"
time="2024-04-30T12:59:27.046745853Z" level=info msg="setting up wasi"
time="2024-04-30T12:59:27.047655519Z" level=info msg="StartContainer for \"ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820\" returns successfully"
time="2024-04-30T12:59:27.046855769Z" level=info msg="writing artifact config to cache, near "/.cache/registry/manifests""
time="2024-04-30T12:59:27.052521603Z" level=info msg="StartContainer for \"899a0e4ca27f947e6b02c5d431b7cd4b5fcb9bfaa7369c0978fe4c8279c33b45\" returns successfully"
time="2024-04-30T12:59:27.057878186Z" level=info msg=" >>> configuring spin oci application 111"
time="2024-04-30T12:59:27.057913728Z" level=info msg="writing artifact config to cache, near "/.cache/registry/manifests""
time="2024-04-30T12:59:27.060346811Z" level=info msg="writing spin oci config to "/spin.json""
time="2024-04-30T12:59:27.064799728Z" level=info msg="writing spin oci config to "/spin.json""
time="2024-04-30T12:59:27.111433603Z" level=info msg="error running start function: failed to resolve content for component "my-first-app""
time="2024-04-30T12:59:27.112347144Z" level=info msg="error running start function: failed to resolve content for component "my-first-app""
time="2024-04-30T12:59:27.114542228Z" level=info msg="no child process"
time="2024-04-30T12:59:27.115303978Z" level=error msg="ttrpc: received message on inactive stream" stream=21
time="2024-04-30T12:59:27.115418061Z" level=info msg="deleting instance: ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820"
time="2024-04-30T12:59:27.115589936Z" level=info msg="cgroup manager V2 will be used"
time="2024-04-30T12:59:27.115984811Z" level=info msg="shim disconnected" id=ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820 namespace=k8s.io
time="2024-04-30T12:59:27.116002519Z" level=warning msg="cleaning up after shim disconnected" id=ca1f085e61081a93084dbe1c30e93a98ab7038e6c9ec5a1c119caabd363cd820 namespace=k8s.io
time="2024-04-30T12:59:27.116011894Z" level=info msg="cleaning up dead shim" namespace=k8s.io
It is!
kubectl debug -it node/lima-rancher-desktop --image ubuntu:latest -n default -- /host/usr/local/containerd-shims/containerd-shim-spin-v2 -v
Creating debugging pod node-debugger-lima-rancher-desktop-f6c9b with container debugger on node lima-rancher-desktop.
containerd-shim-spin-v2:
Runtime: spin
Version: 0.11.1
Revision: 7058f601f3e92ee
But I have run kubectl annotate node --all kwasm.sh/kwasm-node=true twice and it still doesn't update the shim.
@divya-mohan0209 could you try:
kubectl annotate node --all kwasm.sh/kwasm-node-
kubectl annotate node --all kwasm.sh/kwasm-node=true
and check the jobs in the kwasm namespace, then the shim version?
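For anyone hitting the same problem later, the suggested steps can be run as one sequence. This is a sketch assembled from the commands quoted in this thread; the node name lima-rancher-desktop is the one used here and will differ on other setups:

```shell
# Remove the kwasm annotation so the operator tears down the old provision job,
# then re-add it to force a fresh provisioning run.
kubectl annotate node --all kwasm.sh/kwasm-node-
kubectl annotate node --all kwasm.sh/kwasm-node=true

# Check the provisioning jobs in the kwasm namespace.
kubectl get pods -n kwasm -o wide

# Verify the shim version installed on the node (node name assumed from this thread).
kubectl debug -it node/lima-rancher-desktop --image ubuntu:latest -n default -- \
  /host/usr/local/containerd-shims/containerd-shim-spin-v2 -v
```

The last command should report the shim's Runtime, Version, and Revision, as shown in the outputs pasted above.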
Yep, not looking good still.
The jobs:
pod/lima-rancher-desktop-provision-kwasm-htwv8 0/1 Unknown 0 42s
pod/lima-rancher-desktop-provision-kwasm-cs2wf 0/1 Completed 0 31s
The shim version:
~ kubectl debug -it node/lima-rancher-desktop --image ubuntu:latest -n default -- /host/usr/local/containerd-shims/containerd-shim-spin-v2 -v
Creating debugging pod node-debugger-lima-rancher-desktop-rvrl2 with container debugger on node lima-rancher-desktop.
containerd-shim-spin-v2:
Runtime: spin
Version: 0.11.1
Revision: 7058f601f3e92ee
Also, checked the kwasm logs for ya
2024-04-30T14:00:06.644825673Z stderr F {"level":"info","node":"lima-rancher-desktop","time":"2024-04-30T14:00:06Z","message":"Label removed. Removing Job."}
2024-04-30T14:00:13.891517177Z stderr F {"level":"info","node":"lima-rancher-desktop","time":"2024-04-30T14:00:13Z","message":"Trying to Deploy on lima-rancher-desktop"}
2024-04-30T14:00:13.897735427Z stderr F {"level":"info","time":"2024-04-30T14:00:13Z","message":"Job lima-rancher-desktop-provision-kwasm is still Ongoing"}
2024-04-30T14:00:13.95474801Z stderr F {"level":"info","time":"2024-04-30T14:00:13Z","message":"Job lima-rancher-desktop-provision-kwasm is still Ongoing"}
2024-04-30T14:00:17.702458053Z stderr F {"level":"info","time":"2024-04-30T14:00:17Z","message":"Job lima-rancher-desktop-provision-kwasm is still Ongoing"}
2024-04-30T14:00:17.707615387Z stderr F {"level":"info","time":"2024-04-30T14:00:17Z","message":"Job lima-rancher-desktop-provision-kwasm is still Ongoing"}
2024-04-30T14:00:24.010167056Z stderr F {"level":"info","time":"2024-04-30T14:00:24Z","message":"Job lima-rancher-desktop-provision-kwasm is still Ongoing"}
2024-04-30T14:00:26.643764558Z stderr F {"level":"info","time":"2024-04-30T14:00:26Z","message":"Job lima-rancher-desktop-provision-kwasm is still Ongoing"}
2024-04-30T14:00:26.648906849Z stderr F {"level":"info","time":"2024-04-30T14:00:26Z","message":"Job lima-rancher-desktop-provision-kwasm is Completed. Happy WASMing"}
lima-rancher-desktop:/var/log/pods/kwasm_lima-rancher-desktop-provision-kwasm-cs2wf_80b08b4a-fdb9-4ee0-a5d0-e7da89625e38/kwasm-provision$ sudo tail -f 0.log
2024-04-30T14:00:24.459592432Z stdout F No change in containerd/config.toml
It seems the latest version as per kwasm-node-installer is indeed v0.11.1. I will open a PR to use the latest version in kwasm-node-installer.
That said, the instructions on https://www.spinkube.dev/docs/spin-operator/tutorials/integrating-with-rancher-desktop/ do refer to a different node-installer image, which uses the latest spin-shim version.
@divya-mohan0209, could you please confirm which command you used to install the kwasm-operator? Or is this the default version that comes with Rancher Desktop?
I used the one in the SpinKube docs that you've listed above.
Could you please share the output of:
kubectl get pods -n kwasm -o wide
I'll definitely do that once I login tomorrow and the app crashes. I had reset the entire thing for today's live code stream 🤣
Sorry for the delay in getting back! I had to re-do the steps :)
kubectl get pods -n kwasm -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
lima-rancher-desktop-provision-kwasm-mxxfv 0/1 Completed 0 2d11h <none> lima-rancher-desktop <none> <none>
lima-rancher-desktop-provision-kwasm-5j7bt 0/1 Unknown 0 2d11h <none> lima-rancher-desktop <none> <none>
kwasm-operator-6c76c5f94b-hdb2h 1/1 Running 4 (5m1s ago) 2d11h 10.42.0.33 lima-rancher-desktop <none> <none>
I've verified that this issue is caused by the old shim version and is fixed by using 0.14.1: https://github.com/rancher-sandbox/rancher-desktop/issues/6785#issuecomment-2103458271
My comment there also shows how you can upgrade the shim version in Rancher Desktop, which manages shims itself and shouldn't need kwasm at all, as long as you use the right RuntimeClass name in your SpinAppExecutor (spin instead of wasmtime-spin-v2). I've written about this on Slack at https://cloud-native.slack.com/archives/C06PC7JA1EE/p1714674606796679.
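For reference, the RuntimeClass swap described above would look roughly like this in a SpinAppExecutor manifest. This is a sketch based on the spin-operator v1alpha1 CRD shape; the metadata name is illustrative, and field names should be double-checked against the installed operator version:

```yaml
apiVersion: core.spinoperator.dev/v1alpha1
kind: SpinAppExecutor
metadata:
  # Illustrative name; use whatever your executor is called.
  name: containerd-shim-spin
spec:
  createDeployment: true
  deploymentConfig:
    # Rancher Desktop registers the shim under the "spin" RuntimeClass,
    # rather than the "wasmtime-spin-v2" name used in the SpinKube docs.
    runtimeClassName: spin
```

With this in place, Rancher Desktop's own shim management is used and the kwasm-based provisioning should not be needed.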
Note that the next release of Rancher Desktop (1.14) will have an option to install spinkube (and the spin CLI), so none of the manual setup should be necessary anymore once it is released.
Thank you @jandubois for the confirmation. I just checked our documentation and it looks like we ask the user to install the latest version of Rancher Desktop. We will keep this ticket open until Rancher Desktop 1.14 has been released. We really appreciate you chiming in here and helping us confirm the issue. :)
Context:
I tried out the SpinKube x Rancher Desktop integration detailed on this page. It works seamlessly for the hello-world application detailed there & on the Fermyon blog.
However, when I tried installing some of the other complex templates and containerizing them, such as
or even templates of my own
there is inconsistent behaviour, i.e. they sometimes work, but most of the time they don't.
This happens even though the Spin applications themselves work fine on my machine.
What is the error?
The pods enter the CrashLoopBackOff state and are terminated with the following message: Last state: Terminated with 137: Error.
Some additional notes
Last state: Terminated with 137: Error - since exit code 137 typically points to a memory issue, I tried increasing the memory assigned to the pods, but it didn't help.
I tried resetting the Kubernetes cluster and restoring Rancher Desktop to its factory settings (individually, of course). None of those approaches helped and in fact, had the opposite effect. If the templates were working before the factory reset or the cluster reset, they stopped working after. (Of course, I shouldn't have tried to fix what wasn't broken by resetting it 😆 But I did it anyway for the sake of reproducibility)
Lastly, I wasn't sure where the error was, so I filed it against this repo. I'll also open an issue against the Rancher Desktop issues GitHub repo.
Infrastructure details