WebAssembly / design

WebAssembly Design Documents
http://webassembly.org
Apache License 2.0

Proposal: Module Execution Hints #1448

Open Snapstromegon opened 2 years ago

Snapstromegon commented 2 years ago

Abstract

The fundamental idea is to give runtimes the ability to execute a module on more specialized hardware or with different tooling for higher performance. To achieve this, a module might contain "execution hints" in some form. To keep compatibility, a runtime must always be able to ignore such hints and still run the module.

Relation to other proposals

This proposal is related to, but not the same as #273 and #1050. It also depends on the Module Linking Proposal, as that's required for this to make sense.

Problem

Many systems executing wasm have specialized hardware that can accelerate code (e.g. GPUs or FPGAs). WASM currently takes no advantage of this. Earlier proposals tried to resolve this, but they either broke compatibility or targeted exactly one type of hardware.

Proposal

Allow WASM modules to include "Module Execution Hints" which provide the executing environment "hints" on how to speed up module executions.

What is an "execution hint"?

A "hint" is a marker in a module to signal that the module benefits from a special execution environment and might have in turn reduced its instruction set for compatibility. Without thinking about it too much, I would've chosen one (short) string as a marker. It might be reasonable to allow specifying multiple markers, although I did not spend much time thinking about that.

Are "execution hints" standardized?

No.

It might be good to add some "recommended execution hints" like "GPU" or "ML", but in general the spec shouldn't require hints to be present, so it's more extensible in the future. Also the compatibility features of this proposal allow for "misinterpretation" of execution hints.

Why add the marker to a module and not e.g. a function?

Although markers don't have to mean that a module should be executed on different hardware, I see that as one of their main benefits. Since such hardware often has memory separate from the rest of the execution environment, and a module has its own memory which is more easily separated, it's just easier to extract a whole module. Switching toolchains (which might be required) is probably also easier to implement that way.

What if a "hint" isn't available or unknown?

An "execution hint" is not required to be honored by the execution environment. Using a hint doesn't allow the module to start using previously incompatible features (e.g. a hint "big_ints" doesn't allow you to start using i128 or f128). A module with execution hints should always produce the same execution result whether the hints are ignored or used. That way an environment can choose to ignore them without breaking the module (a module should not [need to] know whether or not "execution hints" are active).
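The fallback rule above can be sketched as a small backend selector. This is only an illustration under assumptions: the hint names, the `select_backend` helper, and the stand-in backends are hypothetical and not part of any Wasm engine or specification.

```python
from typing import Callable, Dict, Iterable

# A "backend" here is just a callable that runs the module's entry point.
Backend = Callable[[int], int]


def run_default(x: int) -> int:
    """Stand-in for the normal execution path."""
    return x * x


def run_gpu(x: int) -> int:
    """Stand-in for an accelerated path; it must compute the same result."""
    return x * x


def select_backend(
    hints: Iterable[str],
    available: Dict[str, Backend],
    default: Backend,
) -> Backend:
    """Return the first backend matching a hint, else the default.

    Unknown or unavailable hints are silently ignored, so a module
    never fails to run because of a hint.
    """
    for hint in hints:
        backend = available.get(hint)
        if backend is not None:
            return backend
    return default


chosen = select_backend(["FPGA", "GPU"], {"GPU": run_gpu}, run_default)
fallback = select_backend(["FPGA"], {}, run_default)
```

Both `chosen` and `fallback` compute the same value for the same input, which is the property the proposal asks for: the module cannot observe which path was taken.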

Where are module hints placed?

Module hints should come early in the wasm file. In fact so early, that it's still cheap to switch the toolchain processing the binary, so the "happy path" is as fast as it can be.
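One conceivable encoding, sketched here purely as an assumption, is a custom section placed directly after the 8-byte module header. Custom sections (section id 0) already exist in the binary format and engines that do not recognize them skip them. The section name `execution-hints` and the payload layout (a count followed by length-prefixed strings) are hypothetical.

```python
from typing import List, Tuple

WASM_HEADER = b"\x00asm" + (1).to_bytes(4, "little")  # magic + version 1
HINT_SECTION_NAME = b"execution-hints"  # assumed name, not standardized


def encode_u32_leb128(value: int) -> bytes:
    """Encode an unsigned integer as LEB128, as the Wasm binary format does."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_u32_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode LEB128 at pos; return (value, position after the value)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_name(name: bytes) -> bytes:
    """Length-prefixed byte string."""
    return encode_u32_leb128(len(name)) + name


def build_module_with_hints(hints: List[str]) -> bytes:
    """Build a module containing only the header and the hint section."""
    payload = encode_name(HINT_SECTION_NAME) + encode_u32_leb128(len(hints))
    for hint in hints:
        payload += encode_name(hint.encode("utf-8"))
    return WASM_HEADER + b"\x00" + encode_u32_leb128(len(payload)) + payload


def read_leading_hints(module: bytes) -> List[str]:
    """Read hints only if they are the very first section; else return []."""
    if module[:8] != WASM_HEADER:
        raise ValueError("not a version 1 wasm binary")
    pos = 8
    if pos >= len(module) or module[pos] != 0:
        return []  # no section, or first section is not a custom section
    _size, pos = decode_u32_leb128(module, pos + 1)
    name_len, pos = decode_u32_leb128(module, pos)
    name = module[pos:pos + name_len]
    pos += name_len
    if name != HINT_SECTION_NAME:
        return []
    count, pos = decode_u32_leb128(module, pos)
    hints = []
    for _ in range(count):
        length, pos = decode_u32_leb128(module, pos)
        hints.append(module[pos:pos + length].decode("utf-8"))
        pos += length
    return hints
```

Because the reader inspects only the first section, a runtime can decide which toolchain to use after reading a few dozen bytes, before any function body has been decoded.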

How can "execution hints" increase performance?

I personally believe that WASM is generic enough, that it's possible to run a lot of wasm modules on specialized hardware, especially if you reduce the possible instructions to match the hardware.

If e.g. a game engine knows that some code is probably best executed on a GPU, it might add a marker for this to some specific modules, so the application can take advantage of hardware acceleration, if possible.

Even more important, if you have some special FPGAs, which might be really fast, but only understand i32/f32, you might add an "execution hint" for some module to run on said FPGA. As the module author, you also restrict yourself (as a contract between the author and the execution environment) to not use i64/f64. If the author uses them anyway, the module should fall back to "normal" execution without the hints.
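The contract described above could be checked roughly as follows. This is a minimal sketch: the hint name `FPGA32` and the table of allowed value types are hypothetical, and a real engine would collect the used types while validating the module rather than receive them as a list.

```python
from typing import Iterable

# Hypothetical table: which value types each hint's target can execute.
ALLOWED_VALUE_TYPES = {
    "FPGA32": frozenset({"i32", "f32"}),
}


def choose_execution(hint: str, used_value_types: Iterable[str]) -> str:
    """Return the hint if the module keeps its side of the contract.

    Returns "default" when the hint is unknown or when the module uses a
    value type that the hinted target cannot handle.
    """
    allowed = ALLOWED_VALUE_TYPES.get(hint)
    if allowed is None:
        return "default"  # unknown hint: ignore it
    if set(used_value_types) <= allowed:
        return hint  # contract honored: accelerated path is possible
    return "default"  # contract broken: fall back silently


honored = choose_execution("FPGA32", ["i32", "f32", "i32"])
broken = choose_execution("FPGA32", ["i32", "i64"])
```

In this sketch a broken contract is not an error; the module simply runs on the default path, as the proposal requires.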

So how could a module with execution hints work?

[Image: "meh drawio" diagram, not reproduced]

Considerations

Compilation Performance

If an environment starts compiling a large wasm module on the assumption that it can support an "execution hint" it found, but then needs to downgrade at some point, the time spent compiling up to that point is wasted and the work probably needs to be done again. This will impact execution time and maybe also memory: the wasm binary can't be compiled in a streaming fashion while discarding the parts already processed, because those parts have to be kept for a possible downgrade.

Privacy

Exposing the availability of a GPU might not reveal much, but exposing e.g. FPGAs might allow for fingerprinting on the web, so browser environments in particular might want to not support this feature, or to support it only with a heavily reduced feature set.

Included modules

If Module A imports Module B which imports Module C and only Module B has a marker for e.g. GPU execution, it might be actually faster to just execute everything in the "default execution flow" instead of moving the execution between modules and hardware.
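A toy cost model makes this trade-off concrete. Everything here is an assumption for illustration: the function name, the parameters, and the numbers are made up, and a real runtime would have to measure or estimate these costs itself.

```python
def should_offload(
    default_time_ms: float,
    accelerated_time_ms: float,
    transfer_time_ms: float,
    boundary_crossings: int,
) -> bool:
    """Offload only if acceleration outweighs the cost of moving data.

    Each call across the module boundary is assumed to pay one transfer
    between host memory and the specialized hardware.
    """
    offloaded_total = accelerated_time_ms + transfer_time_ms * boundary_crossings
    return offloaded_total < default_time_ms


# Module B is 5x faster on the accelerator, called 4 times from Module A.
cheap_transfers = should_offload(100.0, 20.0, 5.0, 4)   # 20 + 20 = 40 < 100
costly_transfers = should_offload(100.0, 20.0, 30.0, 4)  # 20 + 120 = 140 > 100
```

With cheap transfers the hint pays off; with expensive transfers the same hint is better ignored, which the proposal explicitly permits.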

Why not standardize a list of "execution hints"?

Standardizing a set of execution hints will probably

Snapstromegon commented 2 years ago

Hi all, this is my first contribution here and I'm not completely sure if this might have already been discussed elsewhere, but I had this in my mind and found nothing similar.

I know that this is a huge proposal and that there are other things with higher priority right now in this space, nonetheless I wanted to bring this to the attention of the community, even if the only thing that comes from this is me learning something new.

tlively commented 2 years ago

Thanks for the write up! Other hinting proposals such as branch hinting and tracing are using a standard framework for attaching hints to functions and instructions, which is a little different from your proposal to attach hints to full modules.

Are there any Wasm engines that execute on specialized hardware like GPUs or FPGAs? Until we have such engines, this proposal won't be very useful in practice. None of the Web engines are planning to support Wasm execution on specialized hardware to my knowledge.

Snapstromegon commented 2 years ago

I saw the branch hinting proposal, but because of the reasons I described in the writeup, I believe that hinting on a module level probably is easier to implement and also makes more architectural sense - although I am not a contributor to the bigger WASM engines, so I might be wrong here.

I also don't know if such an engine exists except for the spec incompatible wasmachine linked in #1050. I also believe that this is a chicken and egg situation, since it's not really feasible to push some code to the GPU by just "guessing" as a runtime (since no way of marking exists at the moment).

The idea started as WASI specific, but then I thought that this might also be helpful in the browser.

Although I have a hard time coming up with use cases other than specialized hardware, this proposal could also be used to signal other things to a runtime.

rossberg commented 2 years ago

As a general principle, all modes or attributes should actually be orthogonal to modules. That is, no properties should be attached to modules as a whole, since that would break some basic modularity principles in Wasm. For example, there are a number of compilers, tool chains, and deployment environments that depend on the ability to routinely merge modules. That would no longer work if those can have mutually exclusive properties.

Think of Wasm modules as merely bags of definitions. All semantics is in the definitions.

Snapstromegon commented 2 years ago

@rossberg This sounds interesting and is an understanding of WASM modules which I didn't have before. I still think that doing this on a module basis is the best approach (based on the semantics currently available in WASM).

Because this proposal breaks with some principles (although I didn't know that runtimes depend on the ability to merge), it's always safe to just ignore the hints.

I bound this proposal to modules so that globals and data segments are initialized in the correct location. Sadly, I have no better idea for how to bind a set of functions and data so that it could be extracted to run e.g. on a GPU.

rossberg commented 2 years ago

Well, you can always introduce some explicit grouping mechanism in the hints themselves if necessary.

Even if you could drop incompatible hints in modular program transformations, that seems to partially defeat their purpose. It's preferable to have a mechanism that is compatible with common modular transformations.
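As an illustration of what such an explicit grouping might look like, the sketch below attaches a hint to a list of function indices instead of to the module, so that merging two modules only requires re-indexing. The `Module` class, the `hint_groups` field, and the `merge` helper are hypothetical, and the merge is heavily simplified (real merging also has to resolve imports, exports, memories, and so on).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Module:
    """A module as a bag of definitions; hints name groups of functions."""
    functions: List[str]
    hint_groups: Dict[str, List[int]] = field(default_factory=dict)


def merge(a: Module, b: Module) -> Module:
    """Concatenate two modules and shift b's function indices.

    Hints survive the merge because they refer to definitions,
    not to the module as a whole.
    """
    offset = len(a.functions)
    groups = {hint: list(indices) for hint, indices in a.hint_groups.items()}
    for hint, indices in b.hint_groups.items():
        groups.setdefault(hint, []).extend(i + offset for i in indices)
    return Module(a.functions + b.functions, groups)


app = Module(["main", "log"])
kernels = Module(["matmul", "conv"], {"GPU": [0, 1]})
merged = merge(app, kernels)
gpu_functions = [merged.functions[i] for i in merged.hint_groups["GPU"]]
```

After the merge, the "GPU" group still points at `matmul` and `conv`, now at indices 2 and 3, so the hint remains meaningful in a single-module deployment.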

> (although I didn't know that runtimes depend on the ability to merge)

Some non-Web environments forego the need for defining a linking and name resolution semantics by only allowing single-module applications, so that linking has to be performed off-line by merging.