Open sunfishcode opened 5 years ago
You're proposing removing proc_exit
? I get that the "proc" prefix is problematic, but there should be some way of exiting early from a wasi instance in my opinion. Perhaps name it instance_abort
?
You're proposing removing
proc_exit
? I get that the "proc" prefix is problematic, but there should be some way of exiting early from a wasi instance in my opinion. Perhaps name itinstance_abort
?
proposing renaming to store_exit
Why "store"? What does that mean in this context?
Edit: followed the link on the op, I understand what "store" means here, but it's not a widely used word. Most people would use "instance" I think.
it will be more widely used if we use it :)
I think "instance" would be more clear than "store" unless there is some meaningful difference between them in this context. I'm very familiar with the sense of store used in the spec, but its still not the sense that would come to mind if I saw it in this context.
"Instance" isn't what I was picturing here, because multiple (module) instances can be linked together and it'd be messy if one disappeared while others are still live. Using "store" would avoid that problem by ensuring that everything linked together is terminated together.
I see what you mean, Dan. Still, the "store" keyword doesn't imply the right context to me, perhaps there's another word that'd be better? I'd like to hear other's thoughts.
Maybe the problem is that the store
is the runtime global state for the VM, which doesn't really include execution. So exiting the store seems like the wrong terminology -- I'd think of "closing" or "destroying" it instead, perhaps. Maybe something like vm
is better here? In my mind that encompasses the store and execution.
the wasm c api uses "store" as well: https://github.com/WebAssembly/wasm-c-api/blob/master/include/wasm.h
Maybe I'm an outlier, but I don't really understand what "store" represents in the context of WebAssembly. The spec says
The store represents all global state that can be manipulated by WebAssembly programs.
which is a bit vague. The thing is, you don't need to worry about "all" when you're trying to access the data you need, but when it's the thing you're about to delete, defining "all" becomes rather important.
For instance, if you call WebAssembly.instantiate()
on the same module several times, and call exit()
in your cpp code in one instance, do all instances exit? This might create some nasty surprises for CDNs trying to save resources by sharing modules between instances.
"Instance" isn't what I was picturing here, because multiple (module) instances can be linked together and it'd be messy if one disappeared while others are still live.
I think that reasoning is an artifact of thinking in terms of processes, where there are chunks of running code that are strongly decoupled from the outside world, have strong boundaries with each other and can be assumed to hold invariants within themselves. When these boundaries exist, then dropping everything inside them when an error occurs is a natural design path.
Given the direction WASI+WASM is heading, I don't think these assumptions hold anymore, and exit
should be designed with that in mind.
It is called "store" in the WASM spec because the term is commonly found in the literature when describing the semantics of a language. http://matt.might.net/articles/cesk-machines/ is a pretty accessible example but there are tons of papers out there that use it.
There are ways that WebAssembly.instantiate()
or its moral equivalent could create instances that import resources (functions, tables, memories etc) from other instances, in which case they share a common store, and all of those linked instances would have to terminate when any one of them terminates. (This is not super obvious, I forgot about this case when I made my previous comment).
The Module itself is a separate concept from the store, and there would be no interference between instances of the same module that do not share a store.
I think @binji's comment points to underlying problem here -- thinking in terms of VM objects first. An alternative is to define proc_exit
in terms of unwinding the calling thread back to its entrypoint first. With threads, perhaps the VM should also remember which threads are part of the group, so that they an be asynchronously unwound back to their entrypoints when exit
is called too. In a Reactor-style program, perhaps the VM could unregister the program's event handlers at that point too.
Unwinding a program's threads back to their entrypoints and unregistering them from callbacks may imply that instances or perhaps even the whole store could be terminated under some circumstances, such as when the store only contains instances loaded for that program. (Some embeddings could still simply call the host exit
, for example.)
There are still details to work out, but thinking in terms of unwinding threads and unregistering callbacks first feels like a better starting point.
Yes, this is what I was getting at.
What would unwinding a thread involve? Would it be equivalent to throwing an exception inside that thread, where you call destructors and finally
blocks for each stack frame? Or just wasm-level bookkeeping, eg closing file descriptors and garbage-collecting references?
As an update here, proc_raise
and the process-oriented clocks have now been removed. That still leaves exit
, though as I mentioned above, once wasm has unwinding support, that may make sense as a way to define what "exiting" means.
@sunfishcode In that case, would throwing a specific exit_code
struct, which would contain the exit code as well as more context perhaps, to indicate a successful or unsuccessful exit be just as useful as including an exit
function?
@lachlansneff Yes, with unwinding in the language, WASI wouldn't necessarily need to define an exit
function. It might instead just define an exception type that programs would import and throw, so that the exception doesn't get accidentally caught by unsuspecting catch blocks.
Hello @sunfishcode So, is the conclusion here that the semantics of proc_exit
is not to destroy/close the store as suggested by @binji, but to do unwinding as wasmtime is currently doing with Trap?
The semantics of proc_exit
being just an unwinding also means that WASI reactors and WASI commands with exported functions other than _start
may continue to run even after proc_exit
, since the host environment may continue to call their exported functions even after observing the unwinding by proc_exit
. Is this understanding correct?
I'm currently working on WASI implementation of our wasm runtime and want to clarify the current semantics of proc_exit
.
I believe the instance should never be considered usable after a call to proc_exit. This is why Trap seems like a reasonable implementation.
With WASI commands its very obvious because the instance gets thrown away on each call.. and there is no state maintained between calls.
Remember that most calls to proc_exit
come from C exit()
which means that by the time it gets called static destructors have been called.
Thanks for your quick response, Sam!
I believe the instance should never be considered usable after a call to proc_exit.
Thanks for the clarification. So, proc_exit
is not only doing unwinding, but the instance should be considered unusable after that. Yes, it gets clearer to me now. Then, my second paragraph was false. and maybe the semantics of proc_exit
is also close to destroying the store, although the nuance is different.
Thanks!
WASI has no concept of a "process", and the current WASI Core has had most parts that depend on this concept removed already, but a few bits remain:
__WASI_CLOCK_PROCESS_CPUTIME_ID
,proc_exit
, andproc_raise
.For
proc_exit
and__WASI_CLOCK_PROCESS_CPUTIME_ID
, one option is to change them from "process" to "store". Implementations which don't wish to support__WASI_CLOCK_PROCESS_CPUTIME_ID
may return an error code from__wasi_clock_time_get
,__WASI_EINVAL
, instead.proc_raise
probably makes sense to remove regardless, since there is no other support for signal handling.For reference, in wasi-sysroot,
__WASI_CLOCK_PROCESS_CPUTIME_ID
is used in C functionsclock
(C standard),times
(POSIX), andgetrusage
(POSIX).