containerd / runwasi

Facilitates running Wasm / WASI workloads managed by containerd
Apache License 2.0

Problem when trying to use Runwasi + WasmEdge over Kind and Docker Image #647

Open dvperez-grad opened 2 months ago

dvperez-grad commented 2 months ago

Hi,

For several days I've been trying to change the Kubernetes runtime to runwasi's WasmEdge shim, without success.

I am trying to orchestrate the module llama-simple.wasm (https://github.com/second-state/llamaedge/releases/latest/download/llama-simple.wasm), which I know requires the --nn-preload flag to be passed beforehand to indicate which LLM model to load.
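For reference, the module runs locally under WasmEdge roughly like this (a sketch; the model filename and prompt are placeholders, and the exact llama-simple flags may differ between releases):

```shell
# Sketch: run llama-simple.wasm directly with the wasmedge CLI.
# --nn-preload maps the alias "default" to the GGML backend and a model file
# on disk; the wasi-nn plugin loads it when the module starts.
# The .gguf filename below is a placeholder -- substitute your own model.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-simple.wasm \
  --prompt "Once upon a time" --ctx-size 4096
```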

The problem is that when I configure containerd to run with the wasmedge runtime, I always get the same error: kubelet Back-off restarting failed container...
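For context, the usual way to target the shim from Kubernetes is a RuntimeClass whose handler matches the runtime name registered in containerd, plus runtimeClassName in the pod spec. A minimal sketch (the image reference is a placeholder):

```yaml
# RuntimeClass mapping to the containerd runtime registered as "wasmedge"
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: wasmedge
handler: wasmedge
---
# Pod that asks for the wasm runtime; the image reference is a placeholder
apiVersion: v1
kind: Pod
metadata:
  name: llama-simple-app
spec:
  runtimeClassName: wasmedge
  containers:
    - name: llama-simple
      image: registry.example.com/llama-simple:latest
```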

[screenshot of the error]

I have also tried the runtime from this other runwasi fork (https://github.com/second-state/runwasi), without success; in that case it gives a different error:

[screenshot of the error]

Has anyone tried to run a .wasm module that requires wasi-nn to work? If so, I'd appreciate guidance on what I'm doing wrong.

I tested a simple HTTP server in Rust and it works perfectly, so the problem only occurs with modules that require wasi-nn and the --nn-preload flag to work:

[screenshot of the working HTTP server]

If anyone can help me, I'd appreciate it!

Regards, David.

jprendes commented 2 months ago

Running the image using docker might give you better insight into what went wrong. You might also want to check the containerd logs.
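A sketch of such a docker invocation, assuming Docker is set up with the containerd image store and a wasm shim enabled (the image reference is a placeholder):

```shell
# Sketch: run the wasm image directly under Docker's Wasm integration.
# Requires Docker with the containerd image store and a wasmedge shim enabled.
docker run --rm \
  --runtime=io.containerd.wasmedge.v1 \
  --platform=wasi/wasm \
  registry.example.com/llama-simple:latest
```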

However, in this case that's expected. The versions of the shim that we release do not support plugins with the wasmedge runtime. To support plugins you would need to:

  - build the shim against glibc (we do release glibc builds)
  - link against wasmedge dynamically (we don't currently have releases like this)
  - make the wasi-nn plugin available in the host system (it must be compatible with the wasmedge version used in the shim, currently 0.13)

It is possible, but not something we currently support as it comes with some downsides, like making distribution more complex.

Another option would be linking the wasi-nn plugin statically, but that would require some non-trivial changes on the wasmedge codebase. @hydai could wasmedge provide a way of statically linking plugins with the runtime? e.g., a way of registering a plugin that is statically linked to the binary. It's not the first time this comes up, so it makes me think this might actually be useful.

Another consideration is that the wasi-nn workloads would run on the CPU rather than the GPU, although we are working on that (see this AI_dev presentation)

hydai commented 2 months ago

Hi @jprendes

We are looking for a solution to register a statically linked plugin. However, it is a challenging task at the current stage.

  1. Some WASI-NN backends are released frequently; for example, llama.cpp provides daily builds, adding features rapidly. Providing a statically linked binary might not be ideal, especially as it requires a daily version bump and release of the binary. Also, we provide various hardware versions for the same WASI-NN backends when we do not use WebGPU. Take the llama.cpp backend as an example; we provide at least four versions: pure CPU, CUDA 11 (Linux/Windows), CUDA 12 (Linux/Windows), and Metal (Mac). burn.rs may have the same issue if we want to avoid creating a unified single binary supporting CPU and WebGPU.
  2. Introducing support for this feature will necessitate a significant shift in how we handle the plugin, constituting a major change. It's important to note that implementing this feature will take time.
  3. How many WASI-NN backends would you like to support in the runwasi+wasi-nn matrix if we finally achieve the above feature? Should we aim to support all of them, and if so, how should we handle the multiple statically linked binaries (wasmedge runtime + plugins) for user configuration? Would it be the number of runwasi builds (multiple architectures) times the number of wasmedge builds (multiple WASI-NN backends)?

Hi @dvperez-grad

As it stands, the current runwasi is unable to utilize WasmEdge plugins. To address this, you will likely need to construct the shim using the steps provided by Jorge.

Unfortunately, WasmEdge doesn't provide documentation on leveraging runwasi+WasmEdge for LLM workloads at this moment. Let me loop @CaptainVincent in. He may know how to use runwasi+WasmEdge to run the llama workload. I believe the steps will be similar to what we do for the crun+WasmEdge integration, and it should work with k8s smoothly.
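The build Jorge refers to can be sketched roughly as follows (the crate and binary names match the runwasi repository at the time of writing, but check its README for the exact invocation and any feature flags):

```shell
# Sketch: build the wasmedge shim from source so it links dynamically
# against a host WasmEdge that already has the wasi_nn-ggml plugin installed.
git clone https://github.com/containerd/runwasi.git
cd runwasi
cargo build --release -p containerd-shim-wasmedge
# Put the shim binary where containerd can find it.
sudo install target/release/containerd-shim-wasmedge-v1 /usr/local/bin/
```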

jprendes commented 2 months ago

IIUC Wasmedge has two plugin registration APIs, one for C++ and one for C. Adapting the C API to register statically linked "plugins" looks straightforward. The Rust plugins use the C API, so it shouldn't be that hard; I'm happy to give it a go. I haven't gone too deep into the C++ API, so I can't comment there.

This functionality doesn't mean that wasmedge has to ship static (and versioned) wasi-nn plugins every day. But it would allow runwasi to include builtin plugins. We can build the statically linked plugin before each release of runwasi (at least on the docker side).

As to how it would work long term, I would like to see a webgpu WIT, and a wasmedge plugin that implements it; then llama.cpp / burn.rs can be compiled to wasm using the webgpu bindings. The choice of which version of those libraries to use would be deferred to the container developer instead of being hardcoded in the shim.

hydai commented 2 months ago

> IIUC Wasmedge has two plugin registration APIs, one for C++ and one for C. Adapting the C API to register statically linked "plugins" looks straightforward. The Rust plugins use the C API, so it shouldn't be that hard; I'm happy to give it a go. I haven't gone too deep into the C++ API, so I can't comment there.

We can focus on the C API, since both the C and Rust plugins use it.

> This functionality doesn't mean that wasmedge has to ship static (and versioned) wasi-nn plugins every day. But it would allow runwasi to include builtin plugins. We can build the statically linked plugin before each release of runwasi (at least on the docker side).

I see. It provides an alternative way for users to choose whether to use a dynamically linked or statically linked one. This makes sense.

dvperez-grad commented 2 months ago

Hi again,

Thanks, all of you, for the fast answers.

Hi @jprendes:

> Running the image using docker might give you better insight into what went wrong.

It's not possible to run the image in Docker because of the entrypoint and the image I use:

FROM scratch
COPY llama-simple.wasm /llama-simple.wasm
ENTRYPOINT [ "/llama-simple.wasm" ]

Anyway, I tried Ubuntu and Debian images to get a shell, and the problem isn't the .wasm module itself, because if you install WasmEdge natively with this command:

curl -sSf https://raw.githubusercontent.com/WasmEdge/WasmEdge/master/utils/install.sh | sh -s -- -p /usr/local --plugins wasi_nn-ggml

everything works perfectly inside the Docker image, because WasmEdge is installed with the wasi-nn plugin.

> You might also want to check the containerd logs.

All the logs of the Kubernetes pod are empty.

When I run "kubectl logs llama-simple-app", nothing appears on screen.

As for the containerd logs, inside the Kind wasm-control-plane node, running "journalctl -u containerd | grep wasmedge" gives me this:

Jul 18 06:29:01 wasm-control-plane containerd[120]: time="2024-07-18T06:29:01.842454320Z" level=info msg="Start cri plugin with config {PluginConfig:{ContainerdConfig:{Snapshotter:overlayfs DefaultRuntimeName:runc DefaultRuntime:{Type: Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:} UntrustedWorkloadRuntime:{Type: Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:} Runtimes:map[runc:{Type:io.containerd.runc.v2 Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[SystemdCgroup:true] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec:/etc/containerd/cri-base.json NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:podsandbox} test-handler:{Type:io.containerd.runc.v2 Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[SystemdCgroup:true] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec:/etc/containerd/cri-base.json NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:podsandbox} wasmedge:{Type:io.containerd.wasmedge.v1 Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:podsandbox}] NoPivot:false DisableSnapshotAnnotations:true DiscardUnpackedLayers:true IgnoreBlockIONotEnabledErrors:false IgnoreRdtNotEnabledErrors:false} CniConfig:{NetworkPluginBinDir:/opt/cni/bin 
NetworkPluginConfDir:/etc/cni/net.d NetworkPluginMaxConfNum:1 NetworkPluginSetupSerially:false NetworkPluginConfTemplate: IPPreference:} Registry:{ConfigPath: Mirrors:map[] Configs:map[] Auths:map[] Headers:map[]} ImageDecryption:{KeyModel:node} DisableTCPService:true StreamServerAddress:127.0.0.1 StreamServerPort:0 StreamIdleTimeout:4h0m0s EnableSelinux:false SelinuxCategoryRange:1024 SandboxImage:registry.k8s.io/pause:3.7 StatsCollectPeriod:10 SystemdCgroup:false EnableTLSStreaming:false X509KeyPairStreaming:{TLSCertFile: TLSKeyFile:} MaxContainerLogLineSize:16384 DisableCgroup:false DisableApparmor:false RestrictOOMScoreAdj:false MaxConcurrentDownloads:3 DisableProcMount:false UnsetSeccompProfile: TolerateMissingHugetlbController:true DisableHugetlbController:true DeviceOwnershipFromSecurityContext:false IgnoreImageDefinedVolumes:false NetNSMountsUnderStateDir:false EnableUnprivilegedPorts:false EnableUnprivilegedICMP:false EnableCDI:false CDISpecDirs:[/etc/cdi /var/run/cdi] ImagePullProgressTimeout:5m0s DrainExecSyncIOTimeout:0s ImagePullWithSyncFs:false IgnoreDeprecationWarnings:[]} ContainerdRootDir:/var/lib/containerd ContainerdEndpoint:/run/containerd/containerd.sock RootDir:/var/lib/containerd/io.containerd.grpc.v1.cri StateDir:/run/containerd/io.containerd.grpc.v1.cri}
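That log confirms containerd has a wasmedge runtime of type io.containerd.wasmedge.v1 registered, which corresponds to a config.toml fragment along these lines (the CRI section path can differ between containerd versions):

```toml
# /etc/containerd/config.toml (fragment)
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.wasmedge]
  runtime_type = "io.containerd.wasmedge.v1"
```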

For now I will try the following:

> - build the shim against glibc (we do release glibc builds)
> - link against wasmedge dynamically (we don't currently have releases like this)
> - the wasi-nn plugin must be available in the host system (and be compatible with the wasmedge version used in the shim, currently 0.13)

And see the links @hydai provided me.

If I make any progress, I will report it here.

I have also sent an email to @CaptainVincent asking for help with the forked version of runwasi that he provides.

Thanks for all the info. Regards, David.