WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly
Apache License 2.0
7.52k stars 745 forks source link

Module splitting and multiple tables #7047

Open dschuff opened 2 weeks ago

dschuff commented 2 weeks ago

I discovered a couple of things while working on some unrelated stuff (enabling bulk memory and nontrapping-fp by default). While doing that I wrote #7043 to make testing on the emscripten side easier and to make the current default behavior consistent with what will be the default when we enable bigint by default (and thus stop running wasm-emscripten-finalize by default). Namely, linked binaries will contain target_feature sections by default.

I discovered that the presence of that section affects the behavior of module splitting (and probably beyond): features.hasReferenceTypes() and friends now returns true, since LLVM is now enabling reference types. This somewhat weird by itself, since the presence of the section is what controls this, and not whether e.g. there are actually multiple tables (or exnref etc) or flag/configuration, and because emscripten has historically stripped that section most (but not all!) of the time. Probably we should do something about that, but I'm not 100% sure what; maybe if the feature section is missing, we could set those feature flags when we encounter module properties that imply them? or maybe fall back to which features are enabled on the command line? maybe that will depend on the particular use of features.hasFoo(), I'm not sure how pervasive those uses are but if they are common it seems like we should maybe be more robust.

But more to the immediate point, #7043 also caused the roll into emscripten to fail because emscripten's use of module splitting doesn't seem to support multiple tables (there is a JS loading failure I haven't exactly diagnosed yet). I think this is also going to block enabling the features by default, so we will either need to fix emscripten's loading JS, or make a way for emscripten to disable multiple tables for now, so we can separate the problem of feature enablement from updating for multiple tables.

kripken commented 2 weeks ago

What might be happening is that wasm-split is run late, and we have already stripped or not stripped the section, and so it notices the difference. But I think we should make it ignore that change. Emscripten detects the features early on, before it optimizes or gets to any stripping, and then uses those features through the pipeline - probably it should do the same with how it calls wasm-split? That is, it should tell that tool what features to use based on what it knows, rather than the tool detecting the features from the section.

dschuff commented 2 weeks ago

That could make sense; it would require wasm-split (and Binaryen more generally) to accept a new form of argument. Currently there are the generic flags such as -all and --enable-reference-types that IIUC only affect the validator, and everything else goes based on features.has. We'd need something like --use-reference-types, or just to extend the meaning of --enable-reference-types to basically populate/override features.has. It would be simpler for emscripten to just not strip the target_features section in that case (and then have split-module itself do that) but it does seem more generic and maybe robust to unify the feature enablement flags with the features section.

Separately I'm also looking at fixing wasm-split's multi-table output to work with emscripten's JS, since we might as well take advantage of multiple tables.

tlively commented 1 week ago

The enabled features should always be the full set of features that the module is allowed to use, whether or not it actually does. In other words, --enable-reference-types and the other feature flags already populate features on the module.

As @kripken mentioned, Emscripten reads the target features section and turns those into a set of feature flags to pass to Binaryen, so in theory the target features section should already be in sync with the feature flags, whether or not the target features section is stripped.