WebAssembly / component-model

Repository for design and specification of the Component Model

Representing function references #117

Closed penzn closed 1 year ago

penzn commented 1 year ago

I brought this up at the WASI meeting on 2022-06-16. It is also partially related to memory sharing, though the simplest case might make do without memory access.

At its core, this isn't a very complicated problem. The gist is how to express interfaces that require callback functions. For example, consider a classic sort, which takes two arrays and a function index, with the function taking two elements and returning a signed integer indicating less/equal/greater. I am curious how something like this can be represented in the component model.

As an extreme simplification, consider a sort taking only scalars, or structs that can be represented as scalars. This makes the callback representation really simple, but doesn't eliminate sharing memory between the caller and the sort implementation. I would like to understand what the representation would be in this case and whether it is possible to do at the moment. My first question is how to represent this in the current state of the Component Model. And what if, instead of sort, it were something even simpler, say a function that only reads memory; would that make it easier?
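To make the scalar-only case concrete, here is a minimal C sketch. The names `cmp_i32_fn`, `sort_i32`, and `cmp_i32_ascending` are hypothetical and chosen only for illustration; nothing here is component-model syntax. It shows that the callback needs no memory access when elements are passed by value, while the sort routine itself still reads and writes the caller's array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: compares two scalars passed by value,
   so the callback itself never touches the caller's memory. */
typedef int (*cmp_i32_fn)(int32_t a, int32_t b);

/* Insertion sort over a caller-owned array. The sort routine still reads
   and writes `data`, so caller and sort implementation share memory. */
void sort_i32(int32_t *data, size_t n, cmp_i32_fn cmp) {
    assert(cmp != NULL);
    for (size_t i = 1; i < n; i++) {
        int32_t key = data[i];
        size_t j = i;
        /* Shift larger elements right until key's slot is found. */
        while (j > 0 && cmp(data[j - 1], key) > 0) {
            data[j] = data[j - 1];
            j--;
        }
        data[j] = key;
    }
}

/* Example comparator: negative, zero, or positive for less/equal/greater. */
int cmp_i32_ascending(int32_t a, int32_t b) {
    return (a > b) - (a < b);
}
```

A caller would invoke this as `sort_i32(values, count, cmp_i32_ascending)`; in compiled Wasm the third argument would typically be lowered to a function table index.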

However, in the more general case, the callback would be aware of the memory and would take pointer parameters. In the case of sorting, this is true for std::sort (with some caveats) and the C standard library's qsort. It is easy to imagine other uses.

It is important to note that this functionality is expressible within core Wasm syntax and available in existing toolchains. In fact, we even have the aforementioned qsort in wasi-libc.
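For reference, the pointer-taking form looks like this with the standard `qsort`. The `struct point` type and the `cmp_point_by_x` and `sort_points` helpers are made up for this sketch; only `qsort` itself is the real C library function. The comparator receives pointers into the array being sorted, so it has to see the same memory as both the caller and the sort implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type, used only for this illustration. */
struct point { int x; int y; };

/* Comparator in the form qsort expects: both arguments are pointers
   into the array being sorted, so the callback dereferences memory
   shared with the caller and with qsort. */
static int cmp_point_by_x(const void *lhs, const void *rhs) {
    const struct point *a = lhs;
    const struct point *b = rhs;
    return (a->x > b->x) - (a->x < b->x);
}

/* Sorts `pts` in place by ascending x using the C library qsort. */
void sort_points(struct point *pts, size_t n) {
    assert(pts != NULL);
    qsort(pts, n, sizeof *pts, cmp_point_by_x);
}
```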

Would the use of pointer parameters change the representation in the component model? Do we expect this to apply to libc as well?

lukewagner commented 1 year ago

The component model allows a component to import a core module that the component internally instantiates and links with its own core client code, allowing it to share memory and pointers. An example of this is worked out in examples/SharedEverythingDynamicLinking.md (see, e.g., the import of libc in the zipper example). This is probably what you'd want for importing a low-level shared-memory libc-style sort function.

In a cross-component (and thus shared-nothing) scenario, fully-general callbacks quickly lead to ref-count cycles that in general require cross-component (cross-language) cycle-/garbage-collection (see: browsers) which, as a general high-level design choice (#3), we've ruled out. future and stream as proposed address many of the use cases for callbacks and bring a clear acyclic ownership story. Another possible extension (not yet proposed, but unproblematic) which would be useful for precisely your kind of sort/map/reduce-with-callback example would be a "scoped" callback (i.e., a callback only available for the duration of a call in which it was passed, thereby avoiding the lifetime cycle).

Given the applicability and appropriateness of core module imports for many of these callback use cases (and the fact that, in a serverless setting, such callbacks aren't what you want anyways), I've been waiting for real use cases for such "scoped" callbacks to emerge on their own before proposing them. (Also, scoped callbacks only recently started to make sense in an async context with the definition of the structured concurrency invariants since otherwise there's no well-defined "scope" for an async call.)

penzn commented 1 year ago

> The component model allows a component to import a core module that the component internally instantiates and links with its own core client code, allowing it to share memory and pointers. An example of this is worked out in examples/SharedEverythingDynamicLinking.md (see, e.g., the import of libc in the zipper example). This is probably what you'd want for importing a low-level shared-memory libc-style sort function.

I am curious what the syntax for this would look like, for example for the libc function qsort. I can try to infer it from the zipper example, but I am not sure how to declare a function pointer parameter (should it be just an int?).

> In a cross-component (and thus shared-nothing) scenario, fully-general callbacks quickly lead to ref-count cycles that in general require cross-component (cross-language) cycle-/garbage-collection (see: browsers) which, as a general high-level design choice (#3), we've ruled out.

Just to make sure I understand this correctly: it sounds like direct callbacks would not be allowed in the component-to-component use case, and the solution would be to use streams and futures. Is that right? Also, it seems that core module imports work only in offline mode at the moment. Would that always be the case, and can some system modules be core modules?

lukewagner commented 1 year ago

> I am curious what the syntax for this would look like, for example for the libc function qsort. I can try to infer it from the zipper example, but I am not sure how to declare a function pointer parameter (should it be just an int?).

While these examples elide it for brevity, in addition to sharing linear memory, all these core modules would share the funcref table (as just another form of low-level shared mutable state) and thus the function pointer could be passed by its i32 index into the shared funcref table.
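As a rough model of what "passed by its i32 index" means, the following C sketch stands in for the shared funcref table with an ordinary array of function pointers. All names (`cmp_fn`, `shared_table`, `call_cmp_by_index`, and the two comparators) are hypothetical; this is plain C mimicking the idea, not Wasm or component-model syntax. The callee receives only an integer index and resolves it through the table, much as `call_indirect` does in core Wasm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*cmp_fn)(int32_t a, int32_t b);

static int cmp_less_first(int32_t a, int32_t b)    { return (a > b) - (a < b); }
static int cmp_greater_first(int32_t a, int32_t b) { return (b > a) - (b < a); }

/* Stand-in for the funcref table that every linked core module shares.
   A "function pointer" is then just an index into this table. */
static const cmp_fn shared_table[] = { cmp_less_first, cmp_greater_first };
#define SHARED_TABLE_LEN (sizeof shared_table / sizeof shared_table[0])

/* Rough analogue of call_indirect: bounds-check the index, then call
   through the table. In Wasm an out-of-bounds index traps; here the
   assert plays that role. */
int call_cmp_by_index(uint32_t index, int32_t a, int32_t b) {
    assert(index < SHARED_TABLE_LEN);
    return shared_table[index](a, b);
}
```

Because the table is shared mutable state in the same way linear memory is, an index produced by one module is meaningful to every other module linked against the same table.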

> Just to make sure I understand this correctly: it sounds like direct callbacks would not be allowed in the component-to-component use case, and the solution would be to use streams and futures.

Initially, futures and streams are what we're proposing. But what I was suggesting in the last two paragraphs of my last comment is that, if real use cases arose for them (perhaps post-MVP), a call-scoped form of callback could work component-to-component and avoid the cyclic-leak problems of fully general callbacks.

> Also, it seems that core module imports work only in offline mode at the moment. Would that always be the case, and can some system modules be core modules?

I'm not 100% sure this is what you're asking, but if you're asking whether the system can provide core modules as imports to a component running on the system, the answer is "yes" (thereby allowing the system to implement that imported core module natively, in the same way that a system can natively implement function imports today). This would take some implementation work to support, though, and isn't supported yet in, e.g., Wasmtime.

lukewagner commented 1 year ago

I think these questions have been answered, but do feel free to reopen if there are further questions.