abrown opened 1 month ago
Greetings from the WasmEdge team. We would prefer to replace the current `load` with `load-by-name`. The rationale is that copying the entire model into Wasm memory before passing it to the `load` function is redundant and restricts the model size to a maximum of 4 GB. Because of these constraints, we already use `load-by-name` as the default method for loading most LLM models.
I agree with replacing `load` with `load-by-name`. As we discussed at the most recent office hours, I believe it would be useful to demo an implementation of `load-by-name` that is a bit more opinionated about the string being passed to the function. Using a name for a model that is known by the backend is useful, but I believe there would be more value in passing a URI string to the backend, e.g. a `file://` or `https://` URI.
By using a URI string, I believe the backend could be smart enough to fetch / cache models rather than relying on a shared moniker between the guest and the host.
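To make the idea concrete, here is a minimal sketch of how a host might classify such a string. The URI schemes and the `classify` helper are illustrative assumptions, not part of any wasi-nn proposal:

```rust
// Sketch: how a host might interpret a URI-style string passed to load-by-name.
// All URI forms below are hypothetical.

#[derive(Debug, PartialEq)]
enum ModelSource {
    /// A plain name registered in host configuration (a shared moniker).
    Named(String),
    /// A local file the host can load directly.
    File(String),
    /// A remote model the host could fetch and cache.
    Remote(String),
}

fn classify(name: &str) -> ModelSource {
    if let Some(path) = name.strip_prefix("file://") {
        ModelSource::File(path.to_string())
    } else if name.starts_with("http://") || name.starts_with("https://") {
        ModelSource::Remote(name.to_string())
    } else {
        ModelSource::Named(name.to_string())
    }
}

fn main() {
    // A bare name falls back to today's shared-moniker behavior; URIs let
    // the host fetch or cache without guest/host coordination.
    assert_eq!(classify("mobilenet"), ModelSource::Named("mobilenet".to_string()));
    assert!(matches!(classify("https://example.com/m.bin"), ModelSource::Remote(_)));
    println!("ok");
}
```

A host that understands URIs could content-address its cache (e.g., by hashing the fetched bytes), so repeated `load` calls across instances never re-download the model.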
cc: @shschaefer, @radu-matei, @geekbeast
This change removes the `load` function, which constructed a `graph` from some opaque bytes, a `graph-encoding` enum, and an `execution-target`. This mechanism allowed WebAssembly guest code (i.e., code running within the WebAssembly sandbox) to control when a model is loaded, but in doing so it exposed details that users will likely not need. In FaaS use cases, e.g., user code simply does not have the time to retrieve and load a model for every HTTP request.

This PR proposes instead that users always load models outside the sandbox and then load them by a host-specified name. This is a proposal intended for discussion, not a foregone conclusion, so please provide feedback! If you have a use case that relies directly on loading models via buffers, that would undermine the assumption of this PR (that no one will use wasi-nn in this way).
But consider the downsides of the current approach: wasi-nn must keep track of an ever-growing list of graph encodings, and users must somehow "see through" wasi-nn to set up the model buffers. Switching to `load-by-name` (now called `load`) would resolve these issues, moving any model and framework details into the host configuration, where they already exist anyway.
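As a sketch of what "details in the host configuration" could look like, assume a hypothetical host that keeps a registry mapping names to model files. The names, paths, and `ModelEntry` fields here are illustrative only:

```rust
use std::collections::HashMap;

// Sketch: a hypothetical host-side model registry. Encoding and location
// details live entirely on this side of the sandbox; the guest only ever
// sees the name.

struct ModelEntry {
    encoding: &'static str, // e.g. "onnx" -- known only to the host
    path: &'static str,     // where the host finds the model bytes
}

fn registry() -> HashMap<&'static str, ModelEntry> {
    let mut m = HashMap::new();
    m.insert("mobilenet", ModelEntry { encoding: "onnx", path: "/models/mobilenet.onnx" });
    m
}

// When the guest calls `load("mobilenet")`, the host resolves the name here,
// without the guest handling encodings, targets, or raw model buffers.
fn resolve(name: &str) -> Option<String> {
    registry().get(name).map(|e| format!("{} ({})", e.path, e.encoding))
}

fn main() {
    println!("{:?}", resolve("mobilenet"));
}
```

Under this split, adding support for a new framework is purely a host-configuration change; the guest code and the wasi-nn interface stay untouched.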