WebAssembly / component-model

Repository for design and specification of the Component Model

Async and streamable data segments #138

Open ricochet opened 1 year ago

ricochet commented 1 year ago

How should components bundle static assets? We have seen several early incarnations of this for WebAssembly modules, e.g. emscripten file systems. Many languages have a concept of embedding data, e.g. go embed, so support for this type of use case will be needed. The question then is whether there is a way to make this a component-first interface.

With a component, we need a way to make individual component data (potentially of a nested component) available to running wasm code, or even surfaced via the asset references proposal in JS.

There are a few key properties from the JS asset references proposal that have strong parallels and are worth highlighting:

I'm going to quote @lukewagner here since no one says it better. Additional discussion context here:

What I'd like to see here is that we can leverage component model's linking support to build a language-independent tool that can virtualize a WASI filesystem in terms of data sections (and possibly other non-filesystem interfaces, e.g., a blob store interface).

If core modules have data imports and data sections are added to components, then contents of the data sections may be used to bundle static assets. This allows external sections to be implemented as a virtual filesystem for a component.
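As a rough illustration of that idea, the sketch below models a virtual filesystem whose files are simply bundled data segments looked up by path. Everything here is hypothetical: `DataSegmentFs`, `add_file`, and `read` are illustrative names, and the `Vec<Vec<u8>>` only stands in for a component's data sections. No real WASI or component-model API is implied.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a component's bundled data sections.
/// Each entry in `segments` plays the role of one data segment.
pub struct DataSegmentFs {
    segments: Vec<Vec<u8>>,
    /// Maps a file path to the index of the segment holding its bytes.
    index: HashMap<String, usize>,
}

impl DataSegmentFs {
    pub fn new() -> Self {
        Self { segments: Vec::new(), index: HashMap::new() }
    }

    /// Bundles `contents` as a new segment and returns its segment index.
    pub fn add_file(&mut self, path: &str, contents: &[u8]) -> usize {
        let idx = self.segments.len();
        self.segments.push(contents.to_vec());
        self.index.insert(path.to_string(), idx);
        idx
    }

    /// Resolves a path to the bytes of its backing segment, if any.
    pub fn read(&self, path: &str) -> Option<&[u8]> {
        let idx = *self.index.get(path)?;
        Some(self.segments[idx].as_slice())
    }
}
```

A build tool could populate such an index at bundle time, so that guest code keeps using ordinary file paths while the bytes live inside the component.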

Use case

Given a component that implements a simple CRUD interface for a very large ML model, as a host I want to be able to validate and compile a component before data segments have been downloaded.

Snip from wasi-nn interface:

// The graph initialization data.
//
// This consists of an array of buffers because implementing backends may encode their graph IR in
// parts (e.g., OpenVINO stores its IR and weights separately).
type graph-builder = list<u8>
type graph-builder-array = list<graph-builder>

load: func(builder: graph-builder-array, encoding: graph-encoding, target: execution-target) -> expected<graph, error>

Rust source for our component might look like:

let xml = fs::async_stream("fixture/model.xml").unwrap();
let weights = fs::async_stream("fixture/model.bin").unwrap();

let graph = unsafe {
  wasi_nn::load(
      &[&xml.into_bytes(), &weights.into_bytes()],
      wasi_nn::GRAPH_ENCODING_OPENVINO,
      wasi_nn::EXECUTION_TARGET_CPU,
  )
  .unwrap()
};

guybedford commented 1 year ago

Thanks for the great summary here, it's great to pick up on these important discussions again. The pieces seem to fit together quite nicely, and I really like the encapsulated approach in components alongside streaming.

Then I suppose one missing piece remaining there is how the lazy loading behaviour is specified in hosts, since in optimized loading one would likely want host-specific mechanisms for progressive delivery, even if the data is fully streamed. For example, just about everything in the JS module system pretty much specifies the network loading has already completed, unless the JS integration of components somehow defines its own lazy network layering internally.

So asset references may well still come in useful here, since they capture the concept of a resource in an opaque data import that has not yet been loaded, while integrating into the module system, working as a build hint and abstracting the URL baggage. Imported data segments as asset imports to create that binding to the data segment being lazily loaded then still seems like an interesting approach. Effectively just specifying that a data segment that is imported must be lazily fetched where possible in whatever host-defined way makes sense. In that case, this work seems to depend on https://github.com/WebAssembly/bulk-memory-operations/issues/15. There may well be other ways to define the lazy layering though, happy to discuss further.

ricochet commented 1 year ago

In the case of a VFS capability provider component, it would be the piece that handles the loading of assets for the host. If it can index assets by a named reference such as the file path, then it could load each data segment lazily at the time it is needed. Specifically, the provider (i.e. the implementer of the hypothetical fs::async_stream) would await the response when the component needs to import the asset.
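A minimal sketch of that lazy, path-indexed loading might look like the following. It is synchronous for simplicity, so the "await" is only implied. `LazyAssets` and its `fetch` callback are hypothetical names, with `fetch` standing in for whatever host-defined mechanism actually retrieves a segment.

```rust
use std::collections::HashMap;

/// Hypothetical lazy asset provider. `fetch` stands in for the
/// host-defined mechanism that retrieves a data segment by path.
pub struct LazyAssets<F: FnMut(&str) -> Option<Vec<u8>>> {
    fetch: F,
    cache: HashMap<String, Vec<u8>>,
}

impl<F: FnMut(&str) -> Option<Vec<u8>>> LazyAssets<F> {
    pub fn new(fetch: F) -> Self {
        Self { fetch, cache: HashMap::new() }
    }

    /// Returns the asset bytes, calling `fetch` only on first use.
    /// Later requests for the same path are served from the cache.
    pub fn get(&mut self, path: &str) -> Option<&[u8]> {
        if !self.cache.contains_key(path) {
            let bytes = (self.fetch)(path)?;
            self.cache.insert(path.to_string(), bytes);
        }
        self.cache.get(path).map(|v| v.as_slice())
    }
}
```

Nothing is fetched when the provider is constructed, which is the property that would let a host validate and compile a component before its assets have arrived.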

Only slightly related: I heard someone suggest using a tool like wizer to optimize a composed component to orchestrate loading of assets. At a high level, a tool that executes my component and builds a list of the assets that were needed, in runtime order, and then produces an optimized component with logic to kick off background loading before the assets are needed at instantiation time would be a nice feature.
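The recording half of such a tool could be as small as the sketch below: a profiling run notes each asset the first time it is touched, and the resulting order becomes the preload plan. `AccessRecorder` and `preload_plan` are hypothetical names, and this does not reflect how wizer itself works.

```rust
/// Hypothetical recorder used during a profiling run of a component.
/// It remembers the order in which assets were first accessed.
#[derive(Default)]
pub struct AccessRecorder {
    order: Vec<String>,
}

impl AccessRecorder {
    /// Notes an access; repeat accesses do not change the order.
    pub fn record(&mut self, path: &str) {
        if !self.order.iter().any(|p| p == path) {
            self.order.push(path.to_string());
        }
    }

    /// Assets in first-use order, i.e. the order in which an optimized
    /// component might start background loading.
    pub fn preload_plan(&self) -> &[String] {
        &self.order
    }
}
```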

lukewagner commented 1 year ago

Thanks for the really clear write up and agreed with the motivation and use cases!

I think passive data segments as-is in core wasm get us most, but not all, of the way there: the key missing piece is how to enable the host to asynchronously and lazily bring the assets to the running component on demand in the simplest and most portable way. The nice property about data segments that I'd like to preserve is that they are fully encapsulated by a component and add zero additional imports/exports; you essentially get the functionality from a core wasm implementation "for free". And already, hosts can (and some do) implement data segments by, at runtime, putting the data segments in host files and implementing memory.init $dataSegment in terms of file operations (e.g. mmap(datafd, MAP_PRIVATE, linear_memory)). But memory.init is a synchronous operation and thus the file operations don't allow other things like network I/O to happen concurrently.

So here's one sketch of an approach that preserves these qualities:

  1. Allow core:data sections directly in components (analogous to how core:type sections can be directly in components). Thus, components would gain a core:data sort and a core:data index space populated by these component-level data segments.

  2. Building on stream, add a canon data.read built-in that has a dataidx immediate (the static index of a data segment) and is importable by core wasm with core function type (func (param $start i32) (param $len i32) (result $streamidx i32)), where the returned $streamidx is an index into the table of futures/streams and can subsequently be used by all the other stream built-ins that allow async bulk streaming reads directly into linear memory.
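To make the shape of the two steps above concrete, here is a toy, synchronous model of them in plain Rust. It is only a sketch under stated assumptions: `Instance`, `data_read`, and `stream_read` are hypothetical names, `stream_read` stands in for the stream built-ins without matching their real signatures, and out-of-bounds requests return `None` where a real implementation would presumably trap.

```rust
/// One open stream over a slice of a data segment (hypothetical model).
struct Stream {
    dataidx: usize,
    pos: usize,
    end: usize,
}

/// Toy model of a component instance: data segments, a stream table,
/// and a linear memory.
pub struct Instance {
    data_segments: Vec<Vec<u8>>, // models the core:data index space
    streams: Vec<Stream>,        // models the table of futures/streams
    pub memory: Vec<u8>,         // models linear memory
}

impl Instance {
    pub fn new(data_segments: Vec<Vec<u8>>, memory_size: usize) -> Self {
        Self { data_segments, streams: Vec::new(), memory: vec![0; memory_size] }
    }

    /// Models `data.read`: `dataidx` plays the role of the static
    /// immediate, `start`/`len` are the i32 params, and the result is
    /// the stream index.
    pub fn data_read(&mut self, dataidx: usize, start: u32, len: u32) -> Option<u32> {
        let seg = self.data_segments.get(dataidx)?;
        let start = start as usize;
        let end = start.checked_add(len as usize)?;
        if end > seg.len() {
            return None;
        }
        self.streams.push(Stream { dataidx, pos: start, end });
        Some((self.streams.len() - 1) as u32)
    }

    /// Stand-in for a stream read: copies up to `max` bytes into linear
    /// memory at `ptr` and returns the count (0 means end of stream).
    pub fn stream_read(&mut self, streamidx: u32, ptr: usize, max: usize) -> Option<usize> {
        let s = self.streams.get_mut(streamidx as usize)?;
        let n = max.min(s.end - s.pos);
        let dst = self.memory.get_mut(ptr..ptr.checked_add(n)?)?;
        dst.copy_from_slice(&self.data_segments[s.dataidx][s.pos..s.pos + n]);
        s.pos += n;
        Some(n)
    }
}
```

The point of the model is that the guest never needs the whole segment at once: it asks for a byte range, gets a stream handle back, and pulls chunks into memory at its own pace, which leaves room for a host to fetch the underlying bytes lazily.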

With this approach: