Closed: bjorn3 closed this issue 2 years ago
By the way, when trying to verify the "Lightweight headless mode" claim, I'm getting 4.8MB (2.8MB with thin LTO) for the engine-headless example, adapted to skip the compilation step and then compiled with `cargo build --no-default-features --example engine-headless --release --features "wasmer/sys wasmer-workspace/wasmer-engine-dylib"`.
I'm getting 5.2MB (3.4MB with LTO) for the serialize example, adapted to skip the compilation step and then compiled with `cargo run --example serialize --no-default-features --release`.
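For reference, a sketch of how such binary sizes can be checked. The cargo invocations are the ones quoted above; the `size_mb` helper and the synthetic demo file are purely illustrative and not part of either project:

```shell
# Print a file's size in MB with one decimal place (illustrative helper).
size_mb() {
  awk -v b="$(stat -c %s "$1")" 'BEGIN { printf "%.1fMB\n", b / (1024 * 1024) }'
}

# In a wasmer checkout, the build command from this thread would be something like:
#   cargo build --no-default-features --example engine-headless --release \
#     --features "wasmer/sys wasmer-workspace/wasmer-engine-dylib"
#   size_mb target/release/examples/engine-headless

# Demonstration on a synthetic 3 MB file:
dd if=/dev/zero of=/tmp/demo.bin bs=1M count=3 2>/dev/null
size_mb /tmp/demo.bin
```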
While Wasmer produces smaller headless binaries, the difference is significantly smaller than the comparison page claims. I suspect the page has never been updated since Wasmtime added headless support. I also noticed that the headless binary I produced using Wasmer was almost 3x as big as the claimed size; I'm not sure why that is.
(Note: everything was compiled using the same rustc version, rustc 1.61.0 (fe5b13d68 2022-05-18).)
Thanks for opening the issue!
The startup speed was measured with Wasmtime 0.2X I believe (it was about a year and a half ago so unfortunately I don't really remember the exact version used).
> My best guess is that it compares pre-compiled object files for Wasmer with just-in-time compiled code for Wasmtime. This would not be a fair comparison given that Wasmtime also supports just-in-time compiled code.
Indeed, that would not be fair. To give you more context, let me confirm that we compared just startup speed, in an apples-to-apples comparison. In general, both wasmer and wasmtime cached the compiled objects at the time of testing (which I assume still holds true today).
We measured something similar to the following (note that each command was run twice, to let the runtime cache the artifact and not need to recompile it again):
```
wasmer run xyz.wasm --llvm --native # today it would have been --llvm --dylib
wasmtime run xyz.wasm
```
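A minimal sketch of that run-twice measurement. Here `/bin/true` stands in for `wasmer run xyz.wasm --llvm --dylib` or `wasmtime run xyz.wasm`; the `time_ms` helper is illustrative and not part of either CLI:

```shell
# Time a command's wall-clock runtime in milliseconds (GNU date).
time_ms() {
  local start end
  start=$(date +%s%N)            # nanoseconds since epoch
  "$@" >/dev/null 2>&1
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
}

time_ms /bin/true >/dev/null     # first run: lets a real runtime compile and cache the module
warm=$(time_ms /bin/true)        # second run: measures warm (cached) startup only
echo "warm startup: ${warm}ms"
```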
In the case of wasmer, the startup difference was mainly caused by using the native `dlopen` under the hood, vs the custom artifact format that wasmtime used (a similar strategy to what Lucet used to do, and why Lucet was way faster at starting than wasmtime, on par with Wasmer). wasmtime's strategy was way slower to load at the time of measurement (not sure about the latest version, since we haven't measured it again).
> By the way when trying to verify the "Lightweight headless mode" claim
About the headless mode: Wasmer headless with only the native/dylib engine was just 800Kb at the time of measurement.
Right now, the "non-optimized version" of Wasmer headless is 1.6Mb (you can download it from https://github.com/wasmerio/wasmer/releases/download/2.3.0/wasmer-linux-amd64.tar.gz as bin/wasmer-headless, or build it by running `make build-wasmer-headless-minimal` from the makefile). Note that the file size difference (1.6Mb vs 800Kb) is due to having to include the custom "universal" engine. Once that engine is not included, sizes should be on the order of Kbs (~800Kb).
As for Wasmtime, they didn't provide any headless binary, nor did they support the `--no-default-features` flags that you used today, so we simply used what was available at the time.
Hope this clarifies your questions! Closing the ticket :)
Thanks for the reply.
> Indeed, that would not be fair. To give you more context, let me confirm that we compared just startup speed, in an apples-to-apples comparison.
:+1:
> We measured something similar to the following (note that each command was run twice, to let the runtime cache the artifact and not need to recompile it again):
Was this a small or a big wasm module? And do you happen to remember which wasm module was used exactly? I want to try and see how much wasmtime has improved since.
> As for Wasmtime, they didn't provide any headless binary, nor did they support the `--no-default-features` flags that you used today, so we simply used what was available at the time.
Makes sense.
Would you accept a PR updating the results at https://wasmer.io/wasmer-vs-wasmtime for the latest Wasmer and Wasmtime versions? I will mention the exact benchmarks I have used in that case.
I'd love to see the page updated with the benchmarks used and more information. Stating 2x speed and 1000x startup without any context is a little vague and difficult to believe.
https://wasmer.io/wasmer-vs-wasmtime lists a couple of claims about why wasmer is better than wasmtime. The "Flexible compiler support" and "Favorite language integration" claims are easy to verify as true. However, the "Startup speed" and "Execution speed" claims are not verifiable without pointing to the benchmarks that gave those results. The "Execution speed" claim I can believe when comparing LLVM for Wasmer with Cranelift for Wasmtime. The "Startup speed" claim, however, asserts such a huge perf difference that I want to see the benchmark it was based on to verify it for myself. I have a feeling it is an apples vs pears comparison. My best guess is that it compares pre-compiled object files for Wasmer with just-in-time compiled code for Wasmtime. This would not be a fair comparison given that Wasmtime also supports just-in-time compiled code.