WebAssembly / WASI

WebAssembly System Interface
Other
4.75k stars 243 forks source link

Remove the `_initialize` function from the definition of reactors. #497

Closed sunfishcode closed 1 year ago

sunfishcode commented 1 year ago

The _initialize function was added to the concept of reactors in order to implement static initializers, such as in C++. It requires the outside environment to call the _initialize function before calling any other export.

However, this differs from how WebAssembly modules are used on the Web and other embedders outside of WASI. WebAssembly embedders in other environments are accustomed to just instantiating WebAssembly modules and accessing their exports.

One option would be to teach more environments how to load WASI reactors, specifically, to have them recognize and call the _initialize function autmatically. But at a high level, I think this would move us in the wrong direction. This would mean that more things need to treat WASI modules differently from regular WebAssembly modules.

Overall, I think it's better to work toward making WASI modules just be regular WebAssembly modules that just happen to import WASI APIs instead of other APIs.

I've now posted a patch which implements static initializers without using an _initialize function.

sbc100 commented 1 year ago

I'm not sure what you mean by "regular" wasm module here. Emscripten-built modules are very common (maybe the most common) and they currently require the calling _initialize or _start before calling any other exports.

If a most has a lot exports maybe the overhead of adding this check to all of then could add up, but in terms of code size and performance? Perhaps if we add this feature to the linker there should be a way to opt in or out of it?

devsnek commented 1 year ago

one downside here is that the browser js api forces start to be called during instantiate, meaning you can't have any cycles. _initialize was a nice way to work around that. not that i think wasi needs to fix this browser js thing, just mentioning it.

sunfishcode commented 1 year ago

@sbc100

I'm not sure what you mean by "regular" wasm module here. Emscripten-built modules are very common (maybe the most common) and they currently require the calling _initialize or _start before calling any other exports.

Yes, I was imprecise there. I mean to say wasm32-unknown-unknown modules. Emscripten-built modules typically are expected to be used with Emscripten's generated JS code. But many users doing wasm32-unknown-unknown expect to use their own JS glue, or their own embedding.

If a most has a lot exports maybe the overhead of adding this check to all of then could add up, but in terms of code size and performance? Perhaps if we add this feature to the linker there should be a way to opt in or out of it?

That's true. I'm still thinking about what makes sense here.

@devsnek Do you know of users using _initialize for purposes other than C/C++ constructors?

titzer commented 1 year ago

Overall, I think it's better to work toward making WASI modules just be regular WebAssembly modules that just happen to import WASI APIs instead of other APIs.

I generally agree with this sentiment. The fewer WASI-specific conventions around how modules are used (such as this issue), the greater the interoperability, particularly if WASI is primarily an API. If we find we need new hooks in the lifetime of a module, I think it's best if we can reuse or if necessary, introduce minimal core Wasm features to accomplish them, as other platforms may end up needing similar things.

kripken commented 1 year ago

@devsnek

I think that's a serious issue, unless I'm missing something. That is, if a static initializer needs to access an import and that import calls an export (edit: or accesses an exported memory or table), then _initialize is the workaround to enable that. Would such reentrant static initializers simply be disallowed in WASI? (In emscripten at least they do exist today, but I'm not sure about WASI)

sbc100 commented 1 year ago

@devsnek @kripken, this issue, IIUC, has nothing to do with the use of the wasm start section. This proposal is to move from an explicit _initialize export to be automatic "initialize on first call" technique where each export is wrapped in a conditional that checks if initialization has happened yet (the linker automatically generates these wrappers). Neither of these techniques uses the wasm start section (for the reasons you mention).

devsnek commented 1 year ago

Ah i suppose i should have read the llvm patch, thanks for clarifying @sbc100. This sounds good!

sunfishcode commented 1 year ago

@sbc100 The way the current patch works, if the code being linked contains a call to __wasm_call_ctors, the linker doesn't add any wrappers. So one way to opt out is to call this function yourself. wasm-ld will interpret that as you taking responsibility for making sure that it gets called at the appropriate time.

For some users, another way to opt out would be to avoid using static constructors. They're already discouraged in many places already, due to startup time concerns.

sunfishcode commented 1 year ago

I need to do more work to figure out if this proposal is still needed, so closing this PR for now.