WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly
Apache License 2.0

A question about BINARYEN_EXTRA_PASSES with --no-validation #4167

Open andrewevstyukhin opened 2 years ago

andrewevstyukhin commented 2 years ago

Hi there!

I successfully use the linker option -s BINARYEN_EXTRA_PASSES=--no-validation to greatly speed up linking in CI.

[PassRunner] running passes
[PassRunner] running pass: generate-i64-dyncalls... 0.177513 seconds.
[PassRunner] (validating)
[PassRunner] running pass: legalize-js-interface... 0.0777418 seconds.
[PassRunner] (validating)
[PassRunner] running pass: strip-target-features... 0.0230794 seconds.
[PassRunner] (validating)
[PassRunner] passes took 0.278334 seconds.
[PassRunner] (final validation)
[PassRunner] running passes
[PassRunner] running pass: strip-dwarf... 0.0131788 seconds.
[PassRunner] running pass: post-emscripten... 0.48432 seconds.
[PassRunner] running pass: duplicate-function-elimination... 4.06135 seconds.
[PassRunner] running pass: memory-packing... 0.0158605 seconds.
[PassRunner] running pass: once-reduction... 0.130406 seconds.
[PassRunner] running pass: ssa-nomerge... 1.68847 seconds.
[PassRunner] running pass: dce... 0.79336 seconds.

Would it be possible to also check for --no-validation in the finalize_wasm function in upstream\emscripten\emscripten.py?

Example of proof:

if settings.DEBUG_LEVEL >= 3:
  args.append('--dwarf')
args.append('--no-validation') # my trick
stdout = building.run_binaryen_command('wasm-emscripten-finalize',

with output:

[100%] Linking CXX executable Example.html
[PassRunner] running passes
[PassRunner] running pass: generate-i64-dyncalls... 0.177469 seconds.
[PassRunner] running pass: legalize-js-interface... 0.066068 seconds.
[PassRunner] running pass: strip-target-features... 0.0122775 seconds.
[PassRunner] passes took 0.255815 seconds.
[PassRunner] running passes

Anyway we always perform validation in this required step: %EMSDK%/upstream/bin/wasm-opt --strip-debug Example.wasm -o Example.wasm
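The per-pass timings in the logs above can be totalled with a short script. This is a minimal sketch, assuming the log lines keep the `[PassRunner] running pass: NAME... N seconds.` shape shown in this thread. The names `pass_timings` and `slowest` are made up for this illustration and are not part of Binaryen or Emscripten.

```python
import re
from collections import defaultdict

# Matches entries such as:
#   [PassRunner] running pass: dce... 0.79336 seconds.
# The pattern is an assumption based on the log output quoted in this issue.
PASS_LINE = re.compile(
    r"\[PassRunner\] running pass: ([\w-]+)\.\.\. ([0-9.eE+-]+) seconds\."
)


def pass_timings(log_text):
    """Return {pass_name: total_seconds} for every pass entry in log_text."""
    totals = defaultdict(float)
    # findall works whether the entries are on separate lines or on one line.
    for name, seconds in PASS_LINE.findall(log_text):
        totals[name] += float(seconds)
    return dict(totals)


def slowest(log_text, n=3):
    """Return the n passes with the largest total time, slowest first."""
    timings = pass_timings(log_text)
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:n]


if __name__ == "__main__":
    import sys

    # Usage: pipe the linker output into this script.
    for name, seconds in slowest(sys.stdin.read()):
        print(f"{seconds:10.4f}s  {name}")
```

Comparing the totals from a run with validation against a run with --no-validation gives a rough picture of where the time goes, although the "(validating)" steps themselves are not timed in the log.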

kripken commented 2 years ago

Hmm, that output

[PassRunner] running passes

suggests that you have BINARYEN_PASS_DEBUG set in the env. Is that intentional? It will slow down linking quite a lot.

That env var + validation is indeed very very slow, as it validates after each pass, and does extra "heavy" validation too. But if you remove that env var, the time spent in validation should be barely noticeable. I think for that reason this hasn't come up before.

andrewevstyukhin commented 2 years ago

Without BINARYEN_PASS_DEBUG=1 I can't measure anything. The always-enabled validate default raises a number of doubts:

struct PassOptions {
  // Run passes in debug mode, doing extra validation and timing checks.
  bool debug = false;
  // Whether to run the validator to check for errors.
  bool validate = true;
  // When validating validate globally and not just locally
  bool validateGlobally = false;

// BINARYEN_PASS_DEBUG is a convenient commandline way to log out the toplevel
// passes, their times, and validate between each pass.
// (we don't recurse pass debug into sub-passes, as it
// doesn't help anyhow and also is bad for e.g. printing
// which is a pass)
// this method returns whether we are in passDebug mode, and which value:
//  1: log out each pass that we run, and validate in between (can pass
//     --no-validation to skip validation).
//  2: like 1, and also save the last pass's output, so if breakage happens we
//     can print a useful error. also logs out names of nested passes.
//  3: like 1, and also dumps out byn-* files for each pass as it is run.

Clang has an excellent option, -fproc-stat-report. Optimization is generally impossible without measurement, and a measurement that significantly changes behaviour is pointless. Time and memory metrics are very useful for debugging CI crashes caused by out-of-memory conditions. Widely used cloud servers have many logical CPUs with relatively limited memory per thread.
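As a rough illustration of the kind of time and memory metric meant here, the sketch below wraps a single tool invocation and reports wall time and peak resident memory. It is an assumption-laden example, not something Emscripten or Binaryen provides: `measure` is a hypothetical helper, it relies on Python's Unix-only `resource` module, and `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and return (return_code, wall_seconds, peak_rss).

    peak_rss is ru_maxrss for RUSAGE_CHILDREN, i.e. the largest resident set
    size among all child processes this Python process has waited for so far.
    It is therefore most meaningful when one command is measured per run.
    """
    start = time.perf_counter()
    proc = subprocess.run(cmd)
    elapsed = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return proc.returncode, elapsed, peak


if __name__ == "__main__":
    # Usage: python measure.py wasm-opt --strip-debug Example.wasm -o Example.wasm
    code, seconds, peak = measure(sys.argv[1:])
    print(f"exit={code} wall={seconds:.3f}s peak_rss={peak}", file=sys.stderr)
    sys.exit(code)
```

Because the wrapper only observes the child from outside, it does not change what the tool does, which is the property asked for above.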

kripken commented 2 years ago

Oh, I'm not sure I follow you then. What are you measuring - individual binaryen passes, in order to optimize one of them? Then pass-debug is indeed the way to do so. Or, if you want to measure the entire binaryen execution, EMCC_DEBUG=1 would do that much more efficiently for example.

andrewevstyukhin commented 2 years ago

BINARYEN_PASS_DEBUG=1 can show the precise location of a crash caused by an out-of-memory condition on CI. Currently I need wasm-emscripten-finalize without the validation overhead. IMHO EMCC_DEBUG=1 is too heavy (costly) for production use.

kripken commented 2 years ago

I see, thanks @andrewevstyukhin

There isn't a way to disable validation in wasm-emscripten-finalize atm. I think that would require adding a new flag in emscripten for it. I think that would make sense, basically a flag that gets passed to all binaryen tools, and is applied in building.run_binaryen_command probably. A PR would be welcome.
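The idea of one flag that reaches every Binaryen tool could look roughly like the sketch below. It is not Emscripten's actual building.run_binaryen_command: `binaryen_args` and its `no_validation` parameter are hypothetical names used only to show the shape of the change. The --no-validation flag itself is the one already discussed in this thread.

```python
def binaryen_args(tool, args, no_validation=False):
    """Build the command line for a Binaryen tool.

    no_validation stands in for a hypothetical Emscripten setting; when it is
    set, --no-validation is appended once, whichever tool is being run.
    """
    cmd = [tool] + list(args)
    if no_validation and "--no-validation" not in cmd:
        cmd.append("--no-validation")
    return cmd


# The same switch applies to every tool that goes through the helper.
print(binaryen_args("wasm-emscripten-finalize", ["--dwarf"], no_validation=True))
print(binaryen_args("wasm-opt", ["--strip-debug"], no_validation=True))
```

Putting the check in a single helper means individual call sites such as finalize_wasm would not need their own handling of the flag.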

andrewevstyukhin commented 2 years ago

Thanks! building.run_binaryen_command is a good place to add a new common parameter; it is the first function I changed. Do you agree with naming the shell parameter --no-validation?

kripken commented 2 years ago

It looks like wasm-emscripten-finalize already has a --no-validation parameter that it accepts, so I don't think we need to add anything for that.

sbc100 commented 2 years ago

How about, instead of trying to extend the meaning of emscripten's BINARYEN_EXTRA_PASSES setting (which is designed to control which opts are run), we extend BINARYEN_PASS_DEBUG to be more like what you want? How about BINARYEN_PASS_DEBUG=fast or BINARYEN_PASS_DEBUG=summary? Then no emscripten changes would be needed at all.

andrewevstyukhin commented 2 years ago

I think many solutions exist. BINARYEN_PASS_DEBUG=summary sounds good, but if it disables the default setting bool validate = true; things stay weird.

sbc100 commented 2 years ago

My understanding is that you don't actually want to disable all validation. You just want BINARYEN_PASS_DEBUG to do the same amount of validation as you get under normal circumstances, so that BINARYEN_PASS_DEBUG doesn't slow down execution with extra validation. Is that right? Another way of putting it: I don't think you want to actually disable the validation that happens under normal circumstances (without BINARYEN_PASS_DEBUG). I think the only reason to do that would be if you actually had an invalid module.

In that case I think an alternative/configurable version of BINARYEN_PASS_DEBUG seems like the way to go.

andrewevstyukhin commented 2 years ago

I prefer no validation. At the end I run $EMSDK/upstream/bin/wasm-opt --strip-debug Example.wasm -o Example.wasm with full validation (BINARYEN_PASS_DEBUG=1), and I don't want to waste time in the intermediate steps.

sbc100 commented 2 years ago

If you really want to disable validation in binaryen I don't think BINARYEN_EXTRA_PASSES is the way to do it. I would suggest a completely new option for that.

However, I'm hoping that there would be no need to do that if what @kripken says is true: "if you remove that env var, the time spent in validation should be barely noticeable."

andrewevstyukhin commented 2 years ago

I think a new shell option such as --no-validation could be very helpful. The current solution is not ideal. If you print validation timings, then I can say something about how noticeable the cost is.