Open dhil opened 3 months ago
+1 to @rossberg's previous suggestion to accomplish this using a resume variant.
I suppose a downside of the resume_barrier variation is that it requires placing the code on a continuation (i.e. another stack), whereas the block variation allows code to run on the main stack.
Thinking ahead to shared suspendable functions, an advantage of a block-scope barrier is that it would allow a shared suspendable function to temporarily hold non-shared references inside the scope of the barrier. It could also allow shared-non-suspendable functions that hold arbitrary non-shared references to be inlined into shared-suspendable callers. This is similar to how try_table allows callees in its scope to be inlined without issue, but an EH design that instead annotated calls with exception handlers would not permit inlining.
I've also been thinking a little more about this. If we require a "shared-only" handler/barrier only on the shared-nonsuspendable to shared-suspendable transition, this would allow shared-suspendable functions to freely call shared-nonsuspendable functions and inline the callee. If a shared-suspendable function wants to open a block scope for transiently manipulating unshared data, it can be allowed to just open a regular block with a shared-nonsuspendable function type, and unshared data can be manipulated in the body of this block - no handler update or shared-barrier needed! Note that opening a shared-suspendable block in a shared-nonsuspendable function would be disallowed, since in this scenario we're requiring a (call-level) shared-barrier on such transitions.
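To make the transition rule above concrete, here is a minimal Python sketch of it as a toy checker. The names `Kind`, `call_allowed`, and `block_allowed` are made up for illustration and do not correspond to anything in the proposal or in any engine; the sketch only encodes the cases described in the comment above.

```python
from enum import Enum


class Kind(Enum):
    """Hypothetical labels for the two function/block kinds discussed above."""
    SHARED_SUSPENDABLE = "shared-suspendable"
    SHARED_NONSUSPENDABLE = "shared-nonsuspendable"


def call_allowed(caller: Kind, callee: Kind, has_shared_barrier: bool = False) -> bool:
    """Return True if the call is permitted under the scheme sketched above."""
    if caller is Kind.SHARED_NONSUSPENDABLE and callee is Kind.SHARED_SUSPENDABLE:
        # The only transition that needs a (call-level) shared-barrier.
        return has_shared_barrier
    # Every other combination, including suspendable -> nonsuspendable, is free.
    return True


def block_allowed(enclosing: Kind, block: Kind) -> bool:
    """Return True if a block of the given kind may be opened in the enclosing function.

    A block has no call site to carry a barrier, so the guarded transition
    (nonsuspendable -> suspendable) is simply rejected.
    """
    return not (
        enclosing is Kind.SHARED_NONSUSPENDABLE
        and block is Kind.SHARED_SUSPENDABLE
    )
```

Under these assumptions, a shared-suspendable function can open a shared-nonsuspendable block for unshared data without any barrier, while the reverse direction is only reachable through a call that carries the shared-barrier.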
I think this is consistent with a world where we don't need a block-scoped barrier instruction. Just like barrier could be done purely at the call level, the shared-barrier is needed only at the call level on a shared-nonsuspendable to shared-suspendable transition. The other need for a shared-barrier (manipulating unshared data held in some TLS) can be satisfied by a regular shared-nonsuspendable block, which is probably even more lightweight since you don't need to go around flipping any bits at runtime.
I was under the impression that the problem being addressed by some barrier was one of mixing code: in particular, being able to safely mix code generated by a stack switching aware tool with code written before the existence of stack switching.
Yes, for a plain barrier that has nothing to do with multithreading, that's definitely the primary use case. We're thinking ahead to the multithreaded stack switching use case because in that setting, preventing frames that contain unshared references from being suspended as a shared continuation becomes important for correctness as well, and it would be nice if we could use a similar mechanism in both cases.
Opinion: a barrier instruction (whether it's a block instruction or a special resume handler) has an architectural deficiency from the perspective of this use case. The issue is that it requires code that uses stack switching (the using module) to be aware that it is accessing modules that are not aware of stack switching (the vulnerable module) but need protection. This 'works' but is not fully reliable. In particular, if the vulnerable module is later upgraded, the using module will also need to be upgraded.
I suppose a downside of the resume_barrier variation is that it requires placing the code on a continuation (i.e. another stack), whereas the block variation allows code to run on the main stack.
I was thinking about this more, and I realized that if a resume_barrier does not have any handlers besides the barrier itself, then there's no need for it to switch to a different stack since it can never be suspended to. Of course taking advantage of this requires the resumed continuation to be invokable without switching stacks, and I don't know how feasible that would be for implementations.
@tlively a fused create and resume instruction would probably help with this (or an analogous special-cased peephole optimisation in the engine).
If you fused cont.new and resume_barrier, you might as well just call the result call_barrier. And then if you wanted to take a page out of the EH playbook to prevent further proliferation of call variants, you might want to make the barrier a block instruction instead.
My understanding is that by avoiding a block barrier, we avoid needing to carry around/manage an extra mutable flag on each stack (including clearing the flag when the end of the barrier block is reached). I'd argue this applies doubly when we consider block-level shared-barrier, since IIUC this would need a second flag. FWIW I'd be fine with not fusing the instructions unless it's clear we need the optimisation and engine-level peephole work isn't sufficient/acceptable.
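For readers trying to picture the bookkeeping cost mentioned above, here is a toy Python model of a per-stack mutable flag for a block-scoped barrier. It is only a sketch under stated assumptions: `Stack`, `barrier_active`, and `SuspendBlocked` are hypothetical names, a blocked suspend is modelled as an exception (standing in for a trap), and the previous flag value is saved and restored so that nested barrier blocks behave sensibly. Real engines may well track this differently.

```python
from contextlib import contextmanager


class SuspendBlocked(Exception):
    """Hypothetical stand-in for the trap raised when a suspend hits a barrier."""


class Stack:
    """Toy model of one stack carrying the extra mutable barrier flag."""

    def __init__(self) -> None:
        self.barrier_active = False

    @contextmanager
    def barrier(self):
        # Entering the barrier block: remember the old value and set the flag.
        previous = self.barrier_active
        self.barrier_active = True
        try:
            yield
        finally:
            # Reaching the end of the block: restore the flag, even on error.
            self.barrier_active = previous

    def suspend(self) -> str:
        # Every suspend has to consult the flag before it may proceed.
        if self.barrier_active:
            raise SuspendBlocked("suspend crossed a barrier")
        return "suspended"
```

The point of the model is that both block entry and block exit touch per-stack state, which is the management overhead a call-level or handler-level barrier would avoid. A block-level shared-barrier would add a second flag of the same shape.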
Looking at shared-barrier specifically, I think in the scheme I outlined above there's not much benefit to having a block-level shared-barrier, because you need at least one "shared-aware" handler if you want to successfully suspend a shared continuation, and you may as well put the barrier directly on this handler rather than trivially putting it on a separate block.
I suspect that the most sensible decision is to leave barrier of any kind out of this proposal. That has the happy side effect of reducing design coupling between this proposal and shared-everything threads.
I've removed the barrier instruction from #84 as we agreed on Monday.
Should we close this issue?
The stack switching proposal notably does not feature the barrier instruction from the WasmFX/typed continuations proposal. The question is, do we want to include such an instruction? Do we have a use-case (in mind)? The WasmFX barrier instruction has previously been discussed in issue #44.