WebAssembly / WASI

WebAssembly System Interface

posix_spawn/CreateProcess API #414

Open programmerjake opened 3 years ago

programmerjake commented 3 years ago

For things that require creating separate processes (e.g. cargo running a bunch of rustc processes to compile code), having a posix_spawn API is critical. Note that this doesn't necessarily map to host-OS processes; multiple processes can be emulated using separate threads or WebWorkers instead.
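As a rough illustration of the emulation idea above, here is a minimal sketch in which each "process" is a host thread with its own copies of argv and the environment. The names `spawn`, `Child`, and `fake_rustc` are hypothetical and exist only for this example; none of them is a WASI or posix_spawn API.

```python
import threading


class Child:
    """Handle to an emulated child process backed by a host thread."""

    def __init__(self, program, argv, env):
        self.exit_code = None
        # Copy argv and env so the child cannot mutate the parent's view.
        self._thread = threading.Thread(
            target=self._run, args=(program, list(argv), dict(env))
        )
        self._thread.start()

    def _run(self, program, argv, env):
        self.exit_code = program(argv, env)

    def wait(self):
        """Block until the child finishes and return its exit code."""
        self._thread.join()
        return self.exit_code


def spawn(program, argv, env):
    """Hypothetical spawn call: start `program` and return a handle to it."""
    return Child(program, argv, env)


def fake_rustc(argv, env):
    # Stand-in for a compiler: succeed only if an input file was given.
    return 0 if len(argv) > 1 else 1


# A build driver starting several "compiler processes" and waiting for all.
children = [
    spawn(fake_rustc, ["rustc", f"crate{i}.rs"], {"PATH": "/bin"})
    for i in range(3)
]
exit_codes = [child.wait() for child in children]
```

The caller only sees a spawn call and a wait call, so the same interface could in principle be backed by host processes, threads, or WebWorkers.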

lygstate commented 3 years ago

We can create a wasi-process for this

  1. Environments should be taken care of. On Windows, environment keys are case-insensitive; on Linux, environment keys are case-sensitive.
  2. Command-line argv
  3. Pipes (stdin, stdout, stderr)
  4. Cross-process pipes (other than stdin/stdout): used by make/cargo for managing the running process count
  5. Shared memory across processes
  6. Cross-process semaphores/mutexes

programmerjake commented 3 years ago

also: cross-process pipe (not on stdout/stdin): used by make/cargo for managing running process count
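To make the process-count use case concrete, the sketch below limits concurrency with tokens in a pipe, in the style of the GNU make jobserver. It is an assumption-laden illustration: threads stand in for child processes, and `make_jobserver`, `run_job`, and `work` are hypothetical names, not part of make, cargo, or WASI.

```python
import os
import threading
import time


def make_jobserver(slots):
    """Create a pipe preloaded with one token byte per allowed job."""
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"+" * slots)
    return read_fd, write_fd


def run_job(read_fd, write_fd, work):
    token = os.read(read_fd, 1)  # blocks until a slot is free
    try:
        work()
    finally:
        os.write(write_fd, token)  # hand the slot back


# Track how many jobs run at once so the limit can be observed.
lock = threading.Lock()
stats = {"running": 0, "peak": 0}


def work():
    with lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.01)  # pretend to compile something
    with lock:
        stats["running"] -= 1


read_fd, write_fd = make_jobserver(slots=2)
workers = [
    threading.Thread(target=run_job, args=(read_fd, write_fd, work))
    for _ in range(6)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

Because the limit lives in the pipe rather than in any one participant, every process that inherits the two descriptors shares the same budget, which is why a spawn API would need a way to pass pipes other than stdin/stdout/stderr.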

Artoria2e5 commented 3 years ago

This is likely also relevant for emscripten-core/emscripten#6432 and might make the wapm clang less convoluted. The specific story is that my twitter friend wants to have some Online Judge thing with a client-side pre-check for obviously wrong answers.

supertxtnet commented 1 year ago

This will also make it possible to run command shells, whether interactive or not.

For a capability-based design I suppose that the path of the executable would need to be mounted into the parent process by the interpreter to be able to spawn it. Also, I assume that the binaries would need to be wasm. The child process should probably inherit the filesystem view of the parent.

pchickey commented 1 year ago

Our current rough thinking on this issue is that the Component Model standard will one day be expanded to support dynamic (runtime) component instantiation. This isn't on any concrete roadmap yet, though, and this year we are focusing on shipping the MVP version of the Component Model, as well as WASI Preview 2 based on top of it.

So, please don't expect this to be available in the near future. We know it is important, and it will be available one day.

supertxtnet commented 1 year ago

If the parent process wants to offer services to its child processes, one possibility is for it to act as a file server, with the child processes working with its files using the usual file semantics.

A simple example is that the parent process has certain static resources compiled in, such as images, fonts, or any other kind of file. The child processes can read these as needed. More elaborate examples include being able to provide services to the children that interact with the parent process state (e.g. the Linux /proc filesystem, based on Plan 9 concepts).

This is how it works. The parent process creates an fd, and mounts that into the current filesystem view (namespace). The child processes inherit the filesystem view from the parent including the new mounted files. From there the usual file system syscalls are used to do the IPC, without the need for a network stack or complicated interactions via a pipe.
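The flow just described can be sketched as a toy model. Everything here is hypothetical and only illustrates the idea: `Namespace` is an in-memory mount table, `fork` stands in for a child inheriting the parent's view, and `ParentState.serve` plays the parent's file server. Real Plan 9 mounts go through 9p, which this does not attempt to model.

```python
class Namespace:
    """A per-process mount table mapping path prefixes to file servers."""

    def __init__(self, mounts=None):
        self._mounts = dict(mounts or {})

    def mount(self, prefix, server):
        self._mounts[prefix] = server

    def fork(self):
        # The child starts with a copy of the parent's view; later mounts
        # on either side do not affect the other.
        return Namespace(self._mounts)

    def read(self, path):
        # Longest matching prefix wins.
        for prefix in sorted(self._mounts, key=len, reverse=True):
            if path.startswith(prefix + "/"):
                return self._mounts[prefix](path[len(prefix) + 1:])
        raise FileNotFoundError(path)


class ParentState:
    """Parent process that serves compiled-in assets and live state."""

    def __init__(self):
        self.requests = 0
        self.assets = {"logo.txt": b"static asset"}

    def serve(self, name):
        self.requests += 1
        if name == "stats":
            # A /proc-like file computed from the parent's current state.
            return str(self.requests).encode()
        return self.assets[name]


parent = ParentState()
parent_ns = Namespace()
parent_ns.mount("/srv/parent", parent.serve)

# The child inherits the view and talks to the parent with plain reads.
child_ns = parent_ns.fork()
logo = child_ns.read("/srv/parent/logo.txt")
stats = child_ns.read("/srv/parent/stats")
```

The child needs no socket or pipe protocol here; the only interface is a path and a read.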

I think this is a good opportunity to expand beyond some of the limitations of Unix/POSIX, and maybe Plan 9 can serve as a starting point for this with its per-process namespaces, bind and mount semantics.

sunfishcode commented 1 year ago

WASI is definitely aspiring to reach beyond the limitations of Unix/POSIX. But we want to do so in a way that takes advantage of Wasm's unique properties.

One of Wasm's unique properties is that it has an ahead-of-time validation step, with a type system. In the component model, we're adding handle types that allow components to describe their resource needs in the type system. If we build a system where parents can pass handles to their children, then children won't need to inherit anything from their parents in order to be able to resolve the resources passed to them. Having a model where the default for children is to inherit nothing, rather than defaulting to inheriting everything, will make it easier to follow the principle of least privilege.
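As a loose runtime analogy for this idea, the sketch below has a child declare the handle types it needs, and the parent's `instantiate` call checks the supplied handles against that declaration before any child code runs. The names `Component`, `Directory`, `OutputStream`, and `instantiate` are hypothetical; the actual component model expresses this in Wasm types that are validated ahead of time, not with Python `isinstance` checks.

```python
class Directory:
    def __init__(self, files):
        self.files = files


class OutputStream:
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


class Component:
    """A child program together with the handle types it declares it needs."""

    def __init__(self, name, imports, run):
        self.name = name
        self.imports = imports  # parameter name -> required handle type
        self.run = run


def instantiate(component, handles):
    """Check handles against the declared imports, then run the child.

    The child receives exactly the handles passed in and nothing else.
    """
    if set(handles) != set(component.imports):
        raise TypeError(f"{component.name}: handles do not match imports")
    for name, handle_type in component.imports.items():
        if not isinstance(handles[name], handle_type):
            raise TypeError(f"{component.name}: {name} has the wrong type")
    return component.run(**handles)


def list_dir(src, out):
    for name in sorted(src.files):
        out.write(name)
    return 0


lister = Component("lister", {"src": Directory, "out": OutputStream}, list_dir)
out = OutputStream()
status = instantiate(
    lister, {"src": Directory({"b.rs": b"", "a.rs": b""}), "out": out}
)
```

A child instantiated this way can reach only `src` and `out`, which is the "inherit nothing by default" behavior the paragraph argues for.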

Plan 9 envisioned that system administrators writing (untyped) shell scripts would be a major audience for the operating system. In WASI, system administration is not in the audience at all; the audience is developers, writing application and library code, in guests and in hosts, in a wide variety of source languages. And for developers, we can generate friendlier and more robust bindings if we have typed interfaces that know which arguments are handles, tuples, lists, numbers, and so on, than we could for interfaces where everything is strings that are passed around at runtime.

sunfishcode commented 1 year ago

And to be sure, WASI applications and libraries meant to be used by system administrators, to do system administration tasks, are certainly already in the audience today.

supertxtnet commented 1 year ago

> Plan 9 envisioned that system administrators writing (untyped) shell scripts would be a major audience for the operating system. In WASI, system administration is not in the audience at all; the audience is developers, writing application and library code, in guests and in hosts, in a wide variety of source languages. And for developers, we can generate friendlier and more robust bindings if we have typed interfaces that know which arguments are handles, tuples, lists, numbers, and so on, than we could for interfaces where everything is strings that are passed around at runtime.

In a world of devops these audiences are largely the same. So, the question is how do we reconcile this? Files come from the outside world untyped, but WASI allows one to open/create them. Stdin/stdout/stderr are nearly the same. Will these go away too once the fully typed system arrives? The established conventions are also very strong here. Nearly every programming language has a built-in API for working with files, file descriptors and file systems.

Plan 9 made the distinction of stronger typing (for the time) within a program, but weaker typing outside via pipes, files, and filesystems. Type enforcement is there when it comes to the marshaling/unmarshaling of data across process boundaries and within protocols like 9p. It's precisely that kind of flexibility that helps to bridge the devops gap, allowing looser couplings in some cases for flexibility and stronger ones in others for criticality. It's the loose couplings that give one a greater choice of what fills the gaps, yes, even including various forms of scripting.

So, in this case I think the question is whether we want looser coupling, where a parent WASI process can invoke an arbitrary program using files/namespaces as the IPC, as more of a process model. Or do we force absolutely everything in the WASI world into more of a thread-like model with shared memory and very strong type consistency checking? Is there room for both process and thread communication concepts here?

Personally, I feel that insisting on strongly typing all interfaces between units of execution can limit versatility. Look at how mainframes (S/360) made every file into rigid records with all sorts of metadata that had to be understood by all parties. Or look at how difficult CORBA was to work with, given its IDL and its own ideas about what a string should be, regardless of how unergonomic that was in certain programming languages.

sbc100 commented 1 year ago

Regardless of whether we want to prefer / suggest that folks use more strongly typed things where possible, we do support file handles in WASI, at least for now, and it seems like the current design is well suited to giving the parent full control of what it passes to its child.

Once we have some kind of spawn API, it will almost certainly give you full control of the file handles that the child has access to on startup (what the child sees as its pre-opens), so it should be simple to spawn things with completely new/isolated views of a filesystem or, if you prefer, pass on the pre-opens that were given to you by your parent module.

sunfishcode commented 1 year ago

@supertxtnet

The Wit IDL we're using is specifically designed to expose strings using the existing source-language string types, so you get a regular string that you can immediately use, rather than a separate string type, like a CORBA string back in the day.

There's more than one way to do loose coupling. Typed interfaces in the Unix and Plan 9 era often meant C header files and C ABIs, which are rigid, and consequently people turned to dynamic typing when they needed flexibility. Wit differs from C header files in many respects, including containing the information needed to automatically generate adapters between things that are close, and making it easy to write custom adapters when things aren't close. You can get auto-generated high-level typed bindings for both sides and you can jump straight to writing the logic to bridge them. So it approaches loose coupling in a different way.

WASI has typed interfaces, but at the same time, it also doesn't (logically) share memory between components. So it effectively mixes elements of both the "process" and "thread" models as you describe them. Wasm's call stack is trusted and outside the address space, so it can make calls without caller and callee having access to any shared memory. So we aren't trapped by tight coupling of shared memory layouts, and we can still have typed interfaces.
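A small analogy may help with the "typed call, no shared memory" combination. In the sketch below, a hypothetical `call_component` helper copies arguments and results across the boundary, so the callee never holds a reference into the caller's data. This is only a Python approximation under that assumption; it does not model how Wasm runtimes actually move values between component memories.

```python
import copy


def call_component(func, *args):
    """Call across a pretend component boundary.

    Arguments and results are copied, so the two sides never share
    mutable state.
    """
    result = func(*copy.deepcopy(args))
    return copy.deepcopy(result)


def callee(items: list) -> int:
    items.append(99)  # mutates only the callee's private copy
    return sum(items)


data = [1, 2, 3]
total = call_component(callee, data)
# `data` is still [1, 2, 3]; the callee saw and changed only its own copy.
```

The call still has a typed signature (a list in, an integer out), but neither side depends on the other's memory layout.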