WebAssembly / component-model

Repository for design and specification of the Component Model

[Question] one instance per component #225

Closed penzn closed 1 year ago

penzn commented 1 year ago

Is it fair to say that multiple components are going to live in multiple instances from the core wasm point of view? That seems to follow from the fact that components share nothing, but I might be missing something obvious.

If that is the case, I am curious what implications it has from a multi-threading point of view: what would happen when threads (regardless of how they are implemented) call into components? Would that work, and if so, how?

lukewagner commented 1 year ago

Yes, that's right: each component instance contains and encapsulates zero or more core instances and each core instance (created in a component-model context) has exactly one containing component instance. You could even go farther and say that each component instance contains and encapsulates one core wasm store.
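To make the containment rule concrete, here is a minimal sketch in plain Python. It is only a model of the relationship described above, not any real runtime API; the class and method names (`ComponentInstance`, `CoreInstance`, `instantiate_core`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CoreInstance:
    name: str
    # The single containing component instance (hypothetical field name).
    # Excluded from repr/compare to avoid recursing through the owner.
    owner: Optional["ComponentInstance"] = field(default=None, repr=False, compare=False)


@dataclass
class ComponentInstance:
    name: str
    # Zero or more encapsulated core instances.
    core_instances: List[CoreInstance] = field(default_factory=list)

    def instantiate_core(self, name: str) -> CoreInstance:
        """Create a core instance that belongs to exactly this component instance."""
        core = CoreInstance(name=name, owner=self)
        self.core_instances.append(core)
        return core


empty = ComponentInstance("empty")   # a component instance with no core instances
app = ComponentInstance("app")
main = app.instantiate_core("main")
libc = app.instantiate_core("libc")
```

In this model a core instance can only be created through its owner, so the "exactly one containing component instance" property holds by construction.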

With threading, the idea is to extend the approach of the thread-spawn proposal by allowing component-level functions to be marked as shared (e.g., in Wit it might look like foo: shared func(s: string) -> string) which lifting and lowering would preserve (so, e.g., lowering a shared component import would produce a shared core function and lifting a shared component export would take a shared core function). Thus, if your component doesn't export a shared function, your component can rely on only ever being called sequentially (preserving the default assumption we have today and making the much trickier task of implementing an externally-multi-threaded component opt-in at the public interface level).
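As a rough illustration of "lifting and lowering would preserve shared", the following Python sketch models sharedness as a boolean flag carried across the boundary. The types and the `lift`/`lower` helpers are hypothetical and exist only for this example. What happens when an unshared export is lifted from a shared core function is not stated above, so the sketch simply allows it as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentFunc:
    name: str
    shared: bool


@dataclass(frozen=True)
class CoreFunc:
    name: str
    shared: bool


def lower(imported: ComponentFunc) -> CoreFunc:
    """Lowering a component import yields a core function with the same sharedness."""
    return CoreFunc(imported.name, imported.shared)


def lift(core: CoreFunc, declared_shared: bool) -> ComponentFunc:
    """Lifting a shared component export requires a shared core function."""
    if declared_shared and not core.shared:
        raise TypeError(f"{core.name}: a shared export needs a shared core function")
    # Assumption: an unshared export may be lifted from either kind of core function.
    return ComponentFunc(core.name, declared_shared)


foo_import = ComponentFunc("foo", shared=True)
foo_core = lower(foo_import)  # stays shared after lowering
bar_export = lift(CoreFunc("bar", shared=False), declared_shared=False)
```

A component whose exports are all lifted with `declared_shared=False` corresponds to the default case described above: it can rely on being called sequentially.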

However, even before we add shared to component-level imports/exports, the core wasm inside a component could still do internal threading by spawning threads with the core thread.spawn instruction. Such threads could only call other shared core functions and imports (according to the proposed core wasm rules) and thus internally-threaded execution would need to serialize itself onto the entry thread (i.e., the thread that the component's unshared export was initially called on) to call an unshared component import.
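The "serialize onto the entry thread" pattern can be sketched with ordinary Python threads and queues. This is an analogy only: Python threads stand in for threads spawned by the core instruction, and `unshared_import`, `worker`, and `unshared_export` are hypothetical names. Workers never call the unshared import themselves; they send a request to the entry thread, which makes the call on their behalf.

```python
import queue
import threading

calling_threads = []  # ids of the threads that executed the unshared import


def unshared_import(x: int) -> int:
    """Hypothetical unshared component import; should only run on the entry thread."""
    calling_threads.append(threading.get_ident())
    return x * 2


def worker(x: int, requests: queue.Queue, results: list) -> None:
    """Internally spawned thread: forwards its call to the entry thread."""
    reply = queue.Queue(maxsize=1)
    requests.put((x, reply))      # ask the entry thread to make the call
    results.append(reply.get())   # block until the entry thread answers


def unshared_export(inputs):
    """Runs on the entry thread and services requests from its workers."""
    requests = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=worker, args=(x, requests, results)) for x in inputs
    ]
    for t in threads:
        t.start()
    for _ in inputs:                    # exactly one request per worker
        x, reply = requests.get()
        reply.put(unshared_import(x))   # the call happens on the entry thread
    for t in threads:
        t.join()
    return sorted(results)


entry_thread = threading.get_ident()
doubled = unshared_export([1, 2, 3])
```

The work itself still runs in parallel; only the calls to the unshared import are funneled through one thread.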

Hope that helps, happy to answer any more questions.

penzn commented 1 year ago

> Yes, that's right: each component instance contains and encapsulates zero or more core instances and each core instance (created in a component-model context) has exactly one containing component instance.

Thanks! That is what I thought.

Thank you for the further explanation as well. If I understand correctly, you are describing how to make a component accessible from multiple threads in parallel and how to build internally-threaded components, given the support, right? What I am curious about is this scheme, where threads in a component call other components:

  A (threaded)
  |
+-+-+-...-+
|   |     |
N   M ... X

These dependencies can be instances of the same thing or of different things (what's important is that they are different from the initiating component). Do they need to be thread-safe? Are there any extra considerations for communication between A's threads and the other components?

lukewagner commented 1 year ago

If I'm understanding your example right (let me know if not), A's entry thread (the thread used to call A's unshared exports) can call the unshared exports of N, M ... X, but all of A's internally-created threads (which are necessarily executing shared core functions) would only be able to call the shared exports of N, M ... X (as declared in those components' export types and as required by A's import types). And there would be one component instance for each of N, M ... X that is imported by the instance of A. Thus the instance DAG is 1:1 with your diagram and stays fixed, regardless of how many threads are created or how many cross-component calls happen.
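The calling rule can be written down as a small predicate. This is a sketch in Python with hypothetical export names (`N.run`, `M.run`, `X.run`) and made-up sharedness flags. It also assumes the entry thread may call shared exports as well as unshared ones, which the explanation above does not state explicitly.

```python
def may_call(on_entry_thread: bool, export_is_shared: bool) -> bool:
    """Entry thread: any export (assumption). Spawned threads: shared exports only."""
    return on_entry_thread or export_is_shared


# Hypothetical exports of the imported instances; True means "shared".
exports = {"N.run": False, "M.run": True, "X.run": False}

entry_callable = sorted(
    name for name, shared in exports.items() if may_call(True, shared)
)
spawned_callable = sorted(
    name for name, shared in exports.items() if may_call(False, shared)
)
```

With these flags, A's entry thread can reach all three exports, while A's internally created threads can reach only `M.run`.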

penzn commented 1 year ago

> A's entry thread (the thread used to call A's unshared exports) can call the unshared exports of N, M ... X, but all of A's internally-created threads (which are necessarily executing shared core functions) would only be able to call the shared exports of N, M ... X (as declared in those components' export types and as required by A's import types).

Awesome, thank you for explaining. If two threads call a shared export of the same component, would this lead to one instance or multiple instances of it (sorry, the graph doesn't distinguish that)?

(Edit) One concern I have with this, which is somewhat orthogonal to threading, is the cost of talking to multiple instances. Let's say a library of math primitives is a component: maybe passing a couple of values to and from it is fast, and maybe handing off a sufficiently large array to it is fast too, but there might be a performance swamp somewhere in the middle where the cost of communication becomes comparable to the cost of the operation.

lukewagner commented 1 year ago

> If two threads call a shared export of the same component, would this lead to one instance or multiple instances of it (sorry, the graph doesn't distinguish that)?

Just one instance (in the thread-spawn world with "shared", threads never create new instances, either at the core wasm level or at the component level).
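A small Python analogy for "threads never create new instances": several threads call the same export on one object, and the number of instances stays at one. The `Instance` class and its `shared_export` method are hypothetical. The lock reflects the assumption that a component offering a shared export has to protect its own state against concurrent callers.

```python
import threading


class Instance:
    created = 0  # how many instances have ever been constructed

    def __init__(self, name: str) -> None:
        Instance.created += 1
        self.name = name
        self.calls = 0
        self._lock = threading.Lock()

    def shared_export(self, x: int) -> int:
        """May be entered by several threads at once, so state is guarded."""
        with self._lock:
            self.calls += 1
        return x + 1


n = Instance("N")
threads = [threading.Thread(target=n.shared_export, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Four threads made four calls, but `Instance.created` is still 1: the calls all land on the single existing instance.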

> One concern I have with this, which is somewhat orthogonal to threading, is the cost of talking to multiple instances.

Maybe you have a more specific situation in mind, but speaking generally: the situations in which one would call across component instance boundaries roughly correspond to the situations where today you'd call across process boundaries or through a native extension interface (like JNI), which already have significant or much greater overhead. Note that same-language calls (e.g., across library, package, or DLL boundaries) are expected to compile to intra-component calls, using either static linking of the core module or shared-everything dynamic linking of multiple core modules (just like we do in wasm today).

penzn commented 1 year ago

Thank you! Static linking via piqued my curiosity a little further, hence #239