spinkube / containerd-shim-spin

A ContainerD shim for running Spin Applications.
Apache License 2.0
57 stars 13 forks source link

Simple Python Spin apps are failing to start #84

Open kate-goldenring opened 5 months ago

kate-goldenring commented 5 months ago

The shim appears to not support python applications.

Repro

spin new pyapp -t http-py
pip install requirements.txt
spin build
spin registry push ttl.sh/spin-py-foo:2h
spin kube scaffold -f ttl.sh/spin-py-foo:2h | k apply -f -
# Pods fail to start

Debugging

sudo ctr pull ttl.sh/spin-py-foo:2h 
sudo RUST_LOG=trace ctr run --rm --net-host --runtime io.containerd.spin.v2 ttl.sh/spin-py-foo:2h py-foo bogus-arg
# process exits

Containerd logs

Fails at failed to build spin trigger

Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.246053138Z" level=info msg="starting instance: py-foo"
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.246316099Z" level=info msg="calling start function"
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.246360197Z" level=info msg="setting up wasi"
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.247377987Z" level=info msg=" >>> configuring spin oci application 2"
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.247412943Z" level=info msg="<<< writing wasm artifact with length 41517920 config to cache, near "/.cache/registry/manifests""
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.308110493Z" level=info msg="writing spin oci config to "/spin.json""
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.3874617Z" level=info msg="instantiate_pre;"
Apr 14 17:26:38 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:26:38.701446536Z" level=info msg="error running start function: failed to build spin trigger"

And disabling precompilation surfaces better logs:

Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:42:19.800827224Z" level=error msg="run_wasi ERROR >>>  failed: failed to build spin trigger
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: Caused by:
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:     0: Failed to instantiate component 'mypy'
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:     1: unable to parse binary
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:     2: magic header not detected: bad magic number - expected=[
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x0,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x61,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x73,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x6d,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:        ] actual=[
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x7f,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x45,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x4c,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:            0x46,
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]:        ] (at offset 0x0)"
Apr 14 17:42:19 kagold-ThinkPad-X1-Carbon-6th containerd[304436]: time="2024-04-15T00:42:19.802484297Z" level=info msg="error running start function: failed to build spin trigger"

@dicej are we handling componentization in Spin differently for Python components in a way that we would need to port over to the shim?

dicej commented 5 months ago

@dicej are we handling componentization in Spin differently for Python components in a way that we would need to port over to the shim?

Componentization (i.e. converting a module into a component) doesn't really enter into it in this case -- the binaries generated by componentize-py already are components, so they don't need to be converted. I just ran the following to verify the correct magic number is present:

spin new pyapp -a -t http-py
cd pyapp
python3 -m venv .venv
source ../.venv/bin/activate
pip install -r requirements.txt
spin build
hexdump app.wasm|head

I also validated the file using wasm-tools with no errors:

wasm-tools validate app.wasm

My guess is the shim is looking at the wrong file for some reason. There's certainly nothing Python-specific in Spin, and there's nothing unusal about the components that componentize-py is producing, as demonstrated by wasm-tools validate and spin up.

kate-goldenring commented 5 months ago

I think it may have to do with improperly streaming chunks of the file in these changes. This file is over the 15MB chunk size -- its 15 MB. I'd bet we could reproduce this with other languages with larger file sizes

radu-matei commented 5 months ago

I remember testing ^^ back when it was merged — do you think this is a regression at some point? Or something else.

(regardless of whether it's because of those changes, a test for that would be really good)

kate-goldenring commented 5 months ago

I tested it with a 200 MB js spin app at the time. I am digging more deeply into the logs now

mikkelhegn commented 5 months ago

This might be the case for containerd-shim-spin as well: https://github.com/fermyon/feedback/issues/51#issuecomment-2058639070

I'm able to run an app using the 2.0.0 version of the python SDK using the shim.

kate-goldenring commented 5 months ago

Same, using 2.0 template is a workaround for me too

radu-matei commented 5 months ago

I just tested a Python app with the v3 SDK against a build of the shim that imports Spin v2.4.2 and it works. So once we cut a new release of the shim, this should be fixed.