brson / wasm-opt-rs

Rust bindings for Binaryen's wasm-opt
Apache License 2.0
61 stars 10 forks source link

Non-equal wasm-opt results #150

Closed dzfrias closed 1 year ago

dzfrias commented 1 year ago

First off, thanks for the great crate and API! There are a few conformance issues though... using OptimizationOptions::new_optimize_for_size_aggressively() and wasm-opt -Oz (from the main binaryen project) do not have the same outputs.

To reproduce, I wrote this program in tinygo:

package main

import "fmt"

func main() {
    fmt.Println("hello world")
}

In main.go.

I compiled it with: tinygo build -o out.wasm -target wasm main.go. Trying to optimize for size aggressively results in different file sizes.

I am on wasm-opt v114 and using the 0.114.0 version of this crate.

dzfrias commented 1 year ago

I've been doing some digging around the binaryen and this crate's source. I'm not sure if this helps, but this code is not run by wasm-opt-rs, and is part of the CLI of wasm-opt from binaryen.

https://github.com/WebAssembly/binaryen/blob/0fc7b883d373924717ceccf71a6a759c97a8fb08/src/tools/optimization-options.h#L347

dzfrias commented 1 year ago

I'm doing some experiments with the C++ layer, and it seems like even the most minimal configuration of the PassRunner does not work...

void hopefullyWorks(Module& wasm) {
  auto passRunner = wasm::PassRunner(&wasm);
  passRunner.options = wasm::PassOptions::getWithDefaultOptimizationOptions();
  passRunner.options.shrinkLevel = 2;
  passRunner.options.optimizeLevel = 2;
  passRunner.options.debugInfo = false;
  passRunner.addDefaultOptimizationPasses();
  passRunner.run();
}

This is a function I call directly from my tests. This is almost exactly the same as binaryen-rs' implementation, which works but is a bit outdated (and doesn't support the breadth of options this crate has).

With this, my next thoughts are to look experiment with the module reading functionality. This is my current testing code (from Rust):

pub fn imagine(path: impl AsRef<Path>, out: impl AsRef<Path>) {
    let mut m = Module::new();
    let mut feature_set = BaseFeatureSet::new();
    feature_set.set(BaseFeature::SignExt, true);
    feature_set.set(BaseFeature::MutableGlobals, true);
    m.apply_features(feature_set, BaseFeatureSet::new());

    {
        let mut reader = ModuleReader::new();
        reader.set_dwarf(false);
        reader.read(path.as_ref(), &mut m, None).unwrap();
    }

    hopefully_works(&mut m);

    {
        let mut writer = ModuleWriter::new();
        writer.set_debug_info(false);
        writer.write_binary(&mut m, out.as_ref()).unwrap();
    }
}

To be honest, I'm not sure what else would work anymore, as I thought the initial problem was with PassRunner. As I no longer think that's the case, I'll have to turn towards ModuleReader and ModuleWriter...

dzfrias commented 1 year ago
wasm::Module wasm;
wasm.features.enable(wasm::FeatureSet::Default);
wasm.features.disable(wasm::FeatureSet::None);

wasm::ModuleReader reader;
reader.setDWARF(false);
reader.setProfile(wasm::IRProfile::Normal);
std::string inputSourceMapFilename;
reader.read(infile, wasm, inputSourceMapFilename);

auto passRunner = wasm::PassRunner(&wasm, wasm::PassOptions::getWithDefaultOptimizationOptions());
passRunner.options.shrinkLevel = 2;
passRunner.options.optimizeLevel = 2;
passRunner.options.debugInfo = false;
passRunner.addDefaultOptimizationPasses();
passRunner.run();

wasm::ModuleWriter writer;
writer.setBinary(true);
writer.setDebugInfo(false);
writer.write(wasm, outfile);

I got this example code to work when I put it in the binaryen codebase, not in the shims.h file in this repository. I'm not quite sure what the difference is... is it possible that the preprocessing work in the build.rs script is messing something up?

dzfrias commented 1 year ago

After some more searching for the root cause of the issue, I believe it lies in the build.rs for wasm-opt-sys. binaryen's CMakeLists.txt defines BUILD_LLVM_DWARF, enables code that modifies some DWARF info that a binary may have at Module read time. Since tinygo builds with DWARF info automatically, the relevant custom sections were present in the output binary, and thus were modified by wasm-opt. And as the custom build setup defined by this repository does not set BUILD_LLVM_DWARF, nothing happens for wasm-opt-rs.

I'll make a PR to fix this sometime soon, but I think build-based edge cases like this in general can be prevented in the future by watching the macro definitions propagated by the upstream binaryen CMake configuration.

Lastly, I'm currently investigating if the DWARF modifications of interest are actually intended by the main wasm-opt. They're quite extreme, as they strip all debug line info from the input wasm file. This doesn't seem to be the intended behavior based on the binaryen PR that initially added the code that messes with DWARF-generated custom sections, but I could definitely be mistaken. I'll keep investigating that, and maybe a fix will happen upstream that this repository should pull.

dzfrias commented 1 year ago

I'll close this issue after the merging of #151!