WebAssembly / shared-everything-threads

A draft proposal for spawning threads in WebAssembly

"Solving" TLS by doing nothing #70

Open tlively opened 3 months ago

tlively commented 3 months ago

I've recently had a couple chats with the V8 team about TLS, and they're rightfully concerned about the necessary complexity, although not to the point of ruling anything out yet. They also made the astute observations that 1) shared functions are the root of all the complexity, as others have already pointed out, and that 2) implementing TLS in the engine ends up looking very much like implementing our current instance-per-thread model in the engine instead of user space.

This also led to the observation that at least for the multithreaded WasmGC use case, we could easily get away without shared functions by using instance-per-thread for functions and vtables and using vtable indices instead of vtable references in object instances. (Dart is in fact already using this strategy today.) That would also give us TLS "for free" in the same way we get TLS for free today, where each thread can simply have its own non-shared globals.
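To make the vtable-index idea concrete, here is a minimal sketch in Python rather than Wasm. Every name in it (`Instance`, `SharedObject`, `call_virtual`, `run_demo`) is hypothetical and invented for illustration; none of it is taken from the proposal or from Dart's implementation. It assumes each thread builds its own instance with an identical vtable layout, so a shared object only needs to carry an integer index, and the instance's own fields stand in for non-shared globals.

```python
import threading


class SharedObject:
    """Shared across threads: holds plain data and an integer vtable index only."""

    def __init__(self, vtable_index, fields):
        self.vtable_index = vtable_index
        self.fields = fields


class Instance:
    """Hypothetical stand-in for one Wasm instance per thread."""

    def __init__(self, name):
        # Stands in for a non-shared global: this is the TLS we get "for free".
        self.name = name
        # Per-instance vtables. Assumption: every instance uses the same layout,
        # so index 0 means the same class in every thread.
        self.vtables = [
            {"describe": lambda obj: f"{self.name}: Point({obj.fields[0]}, {obj.fields[1]})"},
        ]

    def call_virtual(self, obj, method):
        # Resolve the object's index in *this* instance's table, so the shared
        # object never holds a reference to an unshared function.
        return self.vtables[obj.vtable_index][method](obj)


def run_demo():
    point = SharedObject(0, (1, 2))  # one object, visible to both threads
    results = {}

    def worker(name):
        instance = Instance(name)  # each thread instantiates its own copy
        results[name] = instance.call_virtual(point, "describe")

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("t0", "t1")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Both threads dispatch on the same object, but each call runs the function owned by the calling thread's instance and sees that instance's own state.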

Ignoring shared functions for now is an excellent way to scope down complexity for prototypes and start trying to prove incremental value in the short term, but it leaves a couple problems unaddressed.

First, without shared functions, there can be no shared continuations. This would be very sad, especially for Kotlin, where shared continuations are extremely common at the source level. It's probably not a showstopper for them, though, since they can keep lowering the continuations to normal control flow as they do today.

Second, discarding shared functions and thread-local function imports does nothing to address the problem of Web engines having to support shared-to-unshared references to implement thread-bound data. We might be able to provide some short-term, incremental value without thread-bound data, but my hypothesis remains that we will need it in the long run. If that turns out to be true, then the marginal complexity of supporting shared functions and TLS, including thread-local functions, will be significantly reduced, although the performance questions will remain.

Finally, the non-Web folks who are primarily interested in this proposal so they can have threads without the instance-per-thread model would obviously be unhappy if we continued to require instance-per-thread.

So all in all, I don't think discarding shared functions is going to be the best long-term design, but it will make prototyping much simpler in the near future. We should certainly continue discussing how to effectively support shared functions, TLS, and shared continuations because those remain the biggest open questions we have for the long term.

conrad-watt commented 3 months ago

Just to understand explicitly, would deferring shared functions mean that only shared GC structs and struct-typed shared globals and tables would be on the table for immediate prototyping? IIUC this is the part of the proposal that is the lowest priority for other stakeholders (WASI/component model). We can split the proposal, but I'd still expect to see a lot of interest in finding a way forward with shared functions.

We may be in an uncomfortable situation where incentives are misaligned, since (out of the features of this proposal) shared functions are a relative priority for non-Web stakeholders, but Web stakeholders bring the most technical constraints. Is there anyone on the Web side who is currently urgently asking for shared functions? I remember that Adobe mentioned at the recent in-person meeting that one of the key difficulties with using wasm-split was its incompatibility with threads, and it seems like we'd need to support shared functions to fix this.

sbc100 commented 3 months ago

Would not having shared functions also mean that shared tables (i.e. the indirect function table) would also not work? From the C/C++ POV if we don't get shared tables this doesn't really solve the problems we face.

conrad-watt commented 3 months ago

> Would not having shared functions also mean that shared tables (i.e. the indirect function table) would also not work? From the C/C++ POV if we don't get shared tables this doesn't really solve the problems we face.

IIUC without shared functions, shared tables would only be allowed to contain shared GC objects/structs (which themselves can't contain function references). We'd essentially be maintaining the status quo for non-GC languages (and bringing GC languages into the current non-GC status quo for threading).
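As a rough illustration of that restriction, here is a toy model of the rule in Python. It is a sketch under assumptions, not the proposal's validation algorithm: `RefType`, `is_valid`, and `table_accepts` are hypothetical names, only reference-typed fields are modeled, and function references are assumed to always be unshared because shared functions are deferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefType:
    kind: str           # "struct", "array", or "func"
    shared: bool
    fields: tuple = ()  # reference-typed fields only (numeric fields are ignored)


def is_valid(t):
    """Assumed rule: shared types may only reference other shared types."""
    if t.kind == "func":
        # With shared functions deferred, a function reference is never shared.
        return not t.shared
    if t.shared:
        return all(f.shared and is_valid(f) for f in t.fields)
    return all(is_valid(f) for f in t.fields)


def table_accepts(table_shared, elem):
    """A shared table only accepts valid shared element types."""
    return is_valid(elem) and (elem.shared or not table_shared)


funcref = RefType("func", shared=False)
shared_struct = RefType("struct", shared=True)
struct_with_func = RefType("struct", shared=True, fields=(funcref,))
```

Under this model `table_accepts(True, shared_struct)` holds, while `table_accepts(True, funcref)` and `is_valid(struct_with_func)` do not, which is why the indirect function table would have to stay unshared.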

tlively commented 3 months ago

> Just to understand explicitly, would deferring shared functions mean that only shared GC structs and struct-typed shared globals and tables would be on the table for immediate prototyping?

And arrays and other non-function shared reference types.

> We can split the proposal, but I'd still expect to see a lot of interest in finding a way forward with shared functions.

To be clear, I don't think it makes sense to split the proposal until some part is ready for useful phase advancement and is being held back by the rest of it. I don't think we're there yet and I don't anticipate being there for a while. I'm also still very interested in finding a way forward with shared functions.

> Is there anyone on the Web side who is currently urgently asking for shared functions?

I wouldn't say urgently, which is good since anyone who needs them urgently is plumb out of luck, but certainly there are folks who would benefit from them, as you mentioned. I think it's more that this proposal has lower-hanging fruit for prototypes to prove incremental value. As more of the infrastructure is put in place, I would expect the cost side of the cost-benefit analysis for shared functions to come down, too.

conrad-watt commented 3 months ago

One point we should also explicitly think about: if we go down the route of having separate shared-nonsuspendable and shared-suspendable functions, then by starting with just shared-nonsuspendable for now we could go ahead with the naive "TLS through function parameters" approach in the short term.
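For readers unfamiliar with the "TLS through function parameters" idea, a minimal sketch in Python follows. The names (`ThreadContext`, `allocate`, `build_buffer`, `run_threads`) are hypothetical and chosen only for illustration. The assumption is that every function needing thread-local state receives it as an explicit argument. My reading is that this is only safe when a call cannot suspend and resume on a different thread, since the context it was handed would then belong to the wrong thread.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ThreadContext:
    """Hypothetical per-thread state that would otherwise live in TLS."""
    thread_id: int
    allocations: int = 0
    log: list = field(default_factory=list)


def allocate(ctx, size):
    # Thread-local bookkeeping goes through the explicit context argument.
    ctx.allocations += 1
    ctx.log.append(f"thread {ctx.thread_id} allocated {size}")
    return size


def build_buffer(ctx, sizes):
    # The context is passed down through every call that needs it.
    return sum(allocate(ctx, s) for s in sizes)


def run_threads(n):
    contexts = [ThreadContext(i) for i in range(n)]
    totals = [0] * n

    def worker(i):
        # Each thread only ever touches its own context and its own slot.
        totals[i] = build_buffer(contexts[i], [8, 16])

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return contexts, totals
```

The code itself is shared by all threads and holds no per-thread state; the cost is an extra parameter on every function that needs the context, directly or transitively.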