Open dschuff opened 2 weeks ago
What might be happening is that wasm-split
is run late, and we have already stripped or not stripped the section, and so it notices the difference. But I think we should make it ignore that change. Emscripten detects the features early on, before it optimizes or gets to any stripping, and then uses those features through the pipeline - probably it should do the same with how it calls wasm-split
? That is, it should tell that tool what features to use based on what it knows, rather than the tool detecting the features from the section.
That could make sense; it would require wasm-split (and Binaryen more generally) to accept a new form of argument. Currently there are the generic flags such as -all
and --enable-reference-types
that IIUC only affect the validator, and everything else goes based on features.has
. We'd need something like --use-reference-types
, or just to extend the meaning of --enable-reference-types
to basically populate/override features.has
.
It would be simpler for emscripten to just not strip the target_features section in that case (and then have split-module itself do that) but it does seem more generic and maybe robust to unify the feature enablement flags with the features section.
Separately I'm also looking at fixing wasm-split's multi-table output to work with emscripten's JS, since we might as well take advantage of multiple tables.
The enabled features should always be the full set of features that the module is allowed to use, whether or not it actually does. In other words, --enable-reference-types
and the other feature flags already populate features
on the module.
As @kripken mentioned, Emscripten reads the target features section and turns those into a set of feature flags to pass to Binaryen, so in theory the target features section should already be in sync with the feature flags, whether or not the target features section is stripped.
I discovered a couple of things while working on some unrelated stuff (enabling bulk memory and nontrapping-fp by default). While doing that I wrote #7043 to make testing on the emscripten side easier and to make the current default behavior consistent with what will be the default when we enable bigint by default (and thus stop running wasm-emscripten-finalize by default). Namely, linked binaries will contain target_feature sections by default.
I discovered that the presence of that section affects the behavior of module splitting (and probably beyond):
features.hasReferenceTypes()
and friends now returns true, since LLVM is now enabling reference types. This somewhat weird by itself, since the presence of the section is what controls this, and not whether e.g. there are actually multiple tables (orexnref
etc) or flag/configuration, and because emscripten has historically stripped that section most (but not all!) of the time. Probably we should do something about that, but I'm not 100% sure what; maybe if the feature section is missing, we could set those feature flags when we encounter module properties that imply them? or maybe fall back to which features are enabled on the command line? maybe that will depend on the particular use offeatures.hasFoo()
, I'm not sure how pervasive those uses are but if they are common it seems like we should maybe be more robust.But more to the immediate point, #7043 also caused the roll into emscripten to fail because emscripten's use of module splitting doesn't seem to support multiple tables (there is a JS loading failure I haven't exactly diagnosed yet). I think this is also going to block enabling the features by default, so we will either need to fix emscripten's loading JS, or make a way for emscripten to disable multiple tables for now, so we can separate the problem of feature enablement from updating for multiple tables.