microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

[Build] [WebGPU] Failed to build without "--enable_wasm_threads" #21748

Closed gyagp closed 1 month ago

gyagp commented 1 month ago

Describe the issue

The latest code fails to build WebGPU without "--enable_wasm_threads". If we add "--enable_wasm_threads" (the whole command becomes "build.bat --config Release --build_wasm --skip_tests --parallel --skip_submodule_sync --disable_wasm_exception_catching --use_jsep --use_webnn --target onnxruntime_webassembly --enable_wasm_simd --enable_wasm_threads"), the build succeeds.

Urgency

This should impact the release.

Target platform

Windows

Build script

build.bat --config Release --build_wasm --skip_tests --parallel --skip_submodule_sync --disable_wasm_exception_catching --use_jsep --use_webnn --target onnxruntime_webassembly --enable_wasm_simd

Error / output

FAILED: CMakeFiles/onnxruntimemlas.dir/D/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp.o
D:\workspace\project\tmp\onnxruntime\cmake\external\emsdk\upstream\emscripten\em++.bat -DBUILD_MLAS_NO_ONNXRUNTIME -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DNSYNC_ATOMIC_CPP11 -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DUSE_JSEP=1 -DUSE_WEBNN=1 -ID:/workspace/project/tmp/onnxruntime/build/Windows/Release/_deps/utf8_range-src -ID:/workspace/project/tmp/onnxruntime/include/onnxruntime -ID:/workspace/project/tmp/onnxruntime/include/onnxruntime/core/session -ID:/workspace/project/tmp/onnxruntime/build/Windows/Release/_deps/google_nsync-src/public -ID:/workspace/project/tmp/onnxruntime/build/Windows/Release -ID:/workspace/project/tmp/onnxruntime/onnxruntime -ID:/workspace/project/tmp/onnxruntime/build/Windows/Release/_deps/abseil_cpp-src -ID:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/inc -ID:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib -ID:/workspace/project/tmp/onnxruntime/build/Windows/Release/_deps/gsl-src/include -ffunction-sections -fdata-sections -flto -msimd128 -O3 -DNDEBUG -std=gnu++17 -fPIC -Wall -Wextra -Wno-unused-parameter -Wno-deprecated-copy -Wno-tautological-pointer-compare -Wno-ambiguous-reversed-operator -Wno-deprecated-anon-enum-enum-conversion -Wno-undefined-var-template -Wno-deprecated-builtins -Wshorten-64-to-32 -Werror -MD -MT CMakeFiles/onnxruntimemlas.dir/D/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp.o -MF CMakeFiles\onnxruntimemlas.dir\D\workspace\project\tmp\onnxruntime\onnxruntime\core\mlas\lib\q4_dq.cpp.o.d -o CMakeFiles/onnxruntimemlas.dir/D/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp.o -c D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:751:9: error: use of undeclared identifier 'ORT_ENFORCE'
  751 |         ORT_ENFORCE(zero_points || signed_quant, "Unsigned quant with no zero points is not supported.");
      |         ^
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:798:9: error: use of undeclared identifier 'ORT_ENFORCE'
  798 |         ORT_ENFORCE(
      |         ^
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:832:9: error: use of undeclared identifier 'ORT_ENFORCE'
  832 |         ORT_ENFORCE(columns % 2 == 0, "Columns must be multiple of 2.");
      |         ^
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:846:17: error: use of undeclared identifier 'ORT_ENFORCE'
  846 |                 ORT_ENFORCE(buffer_size == thread_blk_size, "buffer size must be equal to thread block size.");
      |                 ^
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:1099:9: error: use of undeclared identifier 'ORT_ENFORCE'
 1099 |         ORT_ENFORCE(columns % 2 == 0, "Columns must be multiple of 2");
      |         ^
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:1760:9: error: use of undeclared identifier 'ORT_THROW'
 1760 |         ORT_THROW("Row-wise MlasQDQQuantizeBlockwise is not implemented");
      |         ^
D:/workspace/project/tmp/onnxruntime/onnxruntime/core/mlas/lib/q4_dq.cpp:1812:9: error: use of undeclared identifier 'ORT_THROW'
 1812 |         ORT_THROW("Row-wise MlasQDQTransposeBlockwiseQuantized is not implemented");
      |         ^
7 errors generated.

Visual Studio Version

No response

GCC / Compiler Version

No response

gyagp commented 1 month ago

FYI, @fs-eire @guschmue @qjia7

fs-eire commented 1 month ago

Currently we no longer build WebAssembly without SIMD and multi-threading. There are 2 benefits to doing this:

What specific use case do you have that requires building without "--enable_wasm_threads"?

gyagp commented 1 month ago

To support WASM threads, my understanding is that websites need to be cross-origin isolated (https://web.dev/coop-coep/). I think many websites are not cross-origin isolated today.
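
For context, cross-origin isolation is opted into with two response headers on the top-level document. The following is a minimal sketch for a Node.js server, not something taken from this thread: `ISOLATION_HEADERS` and `createIsolatedServer` are hypothetical names, and the header values are the standard COOP/COEP pair described at the web.dev link above.

```javascript
const http = require('node:http');

// The two response headers that opt a page into cross-origin isolation.
// (Hypothetical constant name, used only for this illustration.)
const ISOLATION_HEADERS = {
  'Cross-Origin-Opener-Policy': 'same-origin',
  'Cross-Origin-Embedder-Policy': 'require-corp',
};

// Build (but do not start) an HTTP server that serves `body` as HTML
// with the isolation headers attached to every response.
function createIsolatedServer(body) {
  return http.createServer((req, res) => {
    res.writeHead(200, { 'Content-Type': 'text/html', ...ISOLATION_HEADERS });
    res.end(body);
  });
}
```

Calling `createIsolatedServer('<h1>hello</h1>').listen(8080)` would serve a page on which `crossOriginIsolated` should report `true` in browsers that support it. Sites that embed third-party resources without CORS or CORP headers often cannot adopt these headers easily, which is the concern raised here.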

fs-eire commented 1 month ago

The multi-threaded feature needs the following 2 features to work:

  1. The "Threads" WebAssembly feature. This requires support from the environment; otherwise the WebAssembly file will fail to load. According to https://webassembly.org/features/ (see the "Threads" row), all mainstream environments now support this.
  2. The environment's security check (cross-origin isolation) for enabling SharedArrayBuffer. Usually we can check this with typeof SharedArrayBuffer !== 'undefined'. However, the actual behavior is a little different: even in an insecure environment, the following code always gives us a reference to SharedArrayBuffer:

    var SharedArrayBuffer = new WebAssembly.Memory({'initial': 0, 'maximum': 0, 'shared': true}).buffer.constructor

    (see https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/wasm/pre.js#L32-L52)

    Although it is still possible to create an instance of SharedArrayBuffer, it cannot be shared correctly between threads (postMessage will throw), so the multi-threading feature does not actually work.

The ONNX Runtime Web build, however, has a runtime check for this. If the environment is not cross-origin isolated, it will not spawn workers even if numThreads > 1. An instance of SharedArrayBuffer is still created, but it is not shared with any workers. In this case, only feature (1) is required.
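
A minimal sketch of what such a runtime check could look like, assuming the intent is to fall back to a single thread whenever the page is not cross-origin isolated. This is not ONNX Runtime Web's actual code: `getSharedArrayBufferCtor` and `resolveNumThreads` are hypothetical names, and the sketch relies only on the standard `crossOriginIsolated` global and the `WebAssembly.Memory` trick quoted above.

```javascript
// Return the SharedArrayBuffer constructor. If the global is hidden,
// recover it from a shared WebAssembly.Memory, as in the snippet above.
// Note: obtaining the constructor does NOT mean buffers can be shared
// with workers; that still requires cross-origin isolation.
function getSharedArrayBufferCtor() {
  if (typeof SharedArrayBuffer !== 'undefined') {
    return SharedArrayBuffer;
  }
  return new WebAssembly.Memory({ initial: 0, maximum: 0, shared: true })
    .buffer.constructor;
}

// Decide how many threads to actually use (hypothetical helper).
// `isolated` defaults to the environment's crossOriginIsolated flag,
// which is undefined (treated as false) where the flag does not exist.
function resolveNumThreads(requested, isolated = globalThis.crossOriginIsolated === true) {
  if (!Number.isInteger(requested) || requested < 1) {
    return 1; // invalid request: run single-threaded
  }
  // Without isolation, workers cannot receive the shared memory,
  // so do not spawn any and run on the calling thread only.
  return isolated ? requested : 1;
}
```

With this shape, a caller can always load the threaded WebAssembly binary and pass `resolveNumThreads(numThreads)` as the worker count; on a non-isolated page the result is 1, so only the "Threads" feature from item 1 is needed.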

gyagp commented 1 month ago

Thanks for the details and this is a great simplification. Let me close this issue. Maybe we can later update the build script and generated file names (like ort-wasm-simd-threaded.wasm) to remove "simd" and "threaded".

csukuangfj commented 3 weeks ago

@gyagp

In case you want the SIMD-enabled version but without thread support, please see https://github.com/csukuangfj/onnxruntime-libs/releases/tag/v1.19.0

To fix the build error you posted, you need to change https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/core/mlas/lib/q4common.h by adding

#include "core/common/common.h"

to it. Then everything should work as expected.

gyagp commented 3 weeks ago

Thanks for the info! Could you please upload a PR to fix this?

csukuangfj commented 3 weeks ago

@gyagp

Please see #21786