emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

Migrating library from Emscripten 1.38 to 2.x or 3.x causes significant performance degradation #18517

Open nrabinowitz opened 1 year ago

nrabinowitz commented 1 year ago

I maintain the h3-js library, which uses Emscripten to transpile the H3 core library to a standalone JavaScript package. Our current toolchain uses Emscripten 1.38, via the Docker container trzeci/emscripten:sdk-tag-1.38.43-64bit (now deprecated). We are using pure JS output, not WASM.

We would like to upgrade to the latest Emscripten version. I tried this with a 2.x version but noticed performance degradation, and I recently tried again with version 3.1.30 and saw the same issues. Some of the library's functions ran at a speed comparable to the current version, but many, especially our core functions, saw a slowdown of more than 10x (benchmarks below use Benchmark.js on Node 12):

Emscripten 1.38.43

latLngToCell x 458,094 ops/sec ±1.17% (90 runs sampled)
cellToLatLng x 1,200,561 ops/sec ±0.60% (87 runs sampled)

Emscripten 3.1.30

latLngToCell x 36,071 ops/sec ±0.61% (89 runs sampled)
cellToLatLng x 101,367 ops/sec ±1.29% (92 runs sampled)

Even when the library is converted to WASM, instead of JS, the slowdown is comparable. The functions in question are fairly mathematically intensive, using trig functions over 64-bit doubles.

Do you have any ideas or suggestions to fix the performance issue here? I'm hoping there's just a compilation flag I'm missing, but it's also possible that something changed in Emscripten that requires more attention. Thanks for your help.

Related discussion: https://github.com/uber/h3-js/issues/163

4nthonylin commented 1 year ago

Adding in some benchmarks for wasm!

General flags used to compile libh3.js

emcc -O3 -I ../include *.c -DH3_HAVE_VLA --memory-init-file 0 \
    --profiling \
    -s WASM=0 \
    -s WASM_ASYNC_COMPILATION=0 \
    -s INVOKE_RUN=0 \
    -s MODULARIZE=1 \
    -s EXPORT_NAME="'libh3'" \
    -s FILESYSTEM=0 \
    -s NODEJS_CATCH_EXIT=0 \
    -s NODEJS_CATCH_REJECTION=0 \
    -s TOTAL_MEMORY=33554432 \
    -s ALLOW_MEMORY_GROWTH=1 \
    -s WARN_UNALIGNED=1 \
    -s EXPORTED_FUNCTIONS=$bound_functions \
    -s EXTRA_EXPORTED_RUNTIME_METHODS='["cwrap", "getValue", "setValue"]' \
    "$@"

Emscripten 3.1.30 WASM=0

latLngToCell x 54,759 ops/sec ±0.54% (88 runs sampled)

Emscripten 3.1.30 WASM=1

latLngToCell x 210,094 ops/sec ±0.49% (91 runs sampled)

Emscripten 1.38.43 WASM=0

latLngToCell x 756,821 ops/sec ±0.47% (88 runs sampled)

As @nrabinowitz mentioned, there are significant performance regressions in h3-js workflows that rely on latLngToCell. WASM=1 does result in a ~3.8x speedup over WASM=0; however, the old version compiled by emscripten:1.38.43 is still 3.6x faster than the WASM bundle.
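The ratios quoted above follow directly from the latLngToCell figures in this comment. A quick check, with variable and function names chosen only for illustration:

```javascript
// latLngToCell throughput in ops/sec, copied from the results in this comment.
const opsPerSec = {
  "3.1.30 WASM=0": 54759,
  "3.1.30 WASM=1": 210094,
  "1.38.43 WASM=0": 756821,
};

// How many times faster build `a` is than build `b`.
function speedup(a, b) {
  return opsPerSec[a] / opsPerSec[b];
}

const wasmVsJs = speedup("3.1.30 WASM=1", "3.1.30 WASM=0");
const oldVsWasm = speedup("1.38.43 WASM=0", "3.1.30 WASM=1");
const oldVsJs = speedup("1.38.43 WASM=0", "3.1.30 WASM=0");

console.log(wasmVsJs.toFixed(1));  // 3.8
console.log(oldVsWasm.toFixed(1)); // 3.6
console.log(oldVsJs.toFixed(1));   // 13.8
```

The last figure shows that the gap between the two WASM=0 builds in this run is closer to 14x.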

Attached below is a stack trace with timing for emscripten:3.1.30 WASM=0 image

And emscripten:1.38.43 image

Note that the stack traces aren't an apples-to-apples comparison, since Benchmark.js does some statistical analysis. They are only meant to give a rough outline of what is taking a long time.

kripken commented 1 year ago

The slowdown with WASM=0 does make sense. Older versions emitted asm.js, which could be a lot faster than the normal JS that modern wasm2js emits.
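For readers unfamiliar with asm.js, the sketch below shows the style of code involved: a module marked "use asm" whose values carry explicit type coercions (`+x` for doubles, `x | 0` for 32-bit integers), which engines with asm.js support can validate and optimize. It is hand-written for illustration; the module and function names are made up and are not taken from h3-js or from Emscripten's actual output.

```javascript
// Hand-written illustration of asm.js style; not real Emscripten output.
function ExampleAsmModule(stdlib) {
  "use asm";
  var sin = stdlib.Math.sin; // imported from the standard library object

  // Doubles are annotated with unary plus.
  function sine(x) {
    x = +x;
    return +sin(x);
  }

  // 32-bit integers are annotated with `| 0`.
  function add(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0;
  }

  return { sine: sine, add: add };
}

// The module is still ordinary JavaScript, so it runs in any engine.
const example = ExampleAsmModule(globalThis);
console.log(example.add(2, 3)); // 5
console.log(example.sine(0));   // 0
```

Because every value has a statically known type, an engine can compile such a module without the speculation that normal JS needs, which is one plausible reason the 1.38 output behaved differently.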

Any significant slowdown with WASM=1 is very surprising, however. If there's nothing obvious in those stack traces, you can bisect to see whether this is caused by a specific change.
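The bisection suggested above amounts to a binary search over an ordered list of compiler versions. A minimal sketch follows, assuming the regression persists in every version after the one that introduced it. The version labels and the `isSlow` predicate are hypothetical placeholders; in practice `isSlow` would build the library with the given version and run the latLngToCell benchmark.

```javascript
// Return the index of the first version for which isSlow(version) is true.
// Assumes versions are ordered oldest to newest, the last one is slow, and
// every version after the first slow one is also slow.
function findFirstSlow(versions, isSlow) {
  let lo = 0;                   // every version before lo is known to be fast
  let hi = versions.length - 1; // versions[hi] is known to be slow
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isSlow(versions[mid])) {
      hi = mid;     // regression is at mid or earlier
    } else {
      lo = mid + 1; // regression is after mid
    }
  }
  return lo;
}

// Hypothetical stand-in data: pretend the regression landed in "v5".
const versions = ["v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8"];
const isSlow = (v) => versions.indexOf(v) >= 4;

const firstSlow = versions[findFirstSlow(versions, isSlow)];
console.log(firstSlow); // v5
```

With n candidate versions this needs about log2(n) builds and benchmark runs instead of n.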