Open cmario opened 2 months ago
I have the same issue with version 1.16.3 and emsdk 3.1.44.
My build command is:
./build.sh --config Debug --enable_wasm_simd --emsdk_version=3.1.44 --build_wasm_static_lib --enable_wasm_exception_throwing_override --enable_wasm_threads --enable_wasm_api_exception_catching --skip_tests
Error:
@cmario - we have noticed that the function MlasSgemmOperation consumes a lot of stack memory. Since wasm allocates only 5MB for the stack by default, it fails there. You can try adding the following flag and see if it solves your issue; it did help us:
-s TOTAL_STACK=10MB
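For context, a sketch of where that setting goes when linking your own module against the static ORT library (file and library names here are placeholders; note that since Emscripten 3.1.27 the canonical name is STACK_SIZE, with TOTAL_STACK kept as a legacy alias):

```shell
# Linking step for a custom WASM module (names are assumed, adjust to your build).
# Raises the default 5MB wasm stack to 10MB.
emcc app.o libonnxruntime_webassembly.a \
  -s TOTAL_STACK=10MB \
  -s ALLOW_MEMORY_GROWTH=1 \
  -o myModule.js
```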
@YoniGBinahAi Thank you very much for your feedback; increasing the total stack to 10MB resolves the issue.
Hi @YoniGBinahAi, I noticed that your build command includes thread support. Are you actually enabling ORT's multi-threading mode? In my case, enabling multi-threading always results in the following error:
RuntimeError: table index is out of bounds
at myModule.wasm.onnxruntime::concurrency::ThreadPool::DegreeOfParallelism(onnxruntime::concurrency::ThreadPool const*) (https://localhost:8000/myModule.wasm:wasm-function[7272]:0x2a4f94)
at myModule.wasm.MlasConvPrepare(MLAS_CONV_PARAMETERS*, unsigned long, unsigned long, unsigned long, unsigned long, long long const*, long long const*, long long const*, long long const*, long long const*, long long const*, unsigned long, MLAS_ACTIVATION const*, unsigned long*, float, onnxruntime::concurrency::ThreadPool*) (https://localhost:8000/myModule.wasm:wasm-function[11360]:0x46ba65)
at myModule.wasm.onnxruntime::Conv<float>::Compute(onnxruntime::OpKernelContext*) const (https://localhost:8000/myModule.wasm:wasm-function[11350]:0x4684de)
at myModule.wasm.onnxruntime::LaunchKernelStep::Execute(onnxruntime::StreamExecutionContext&, unsigned long, onnxruntime::SessionScope&, bool const&, bool&) (https://localhost:8000/myModule.wasm:wasm-function[8572]:0x33e190)
at myModule.wasm.onnxruntime::RunSince(unsigned long, onnxruntime::StreamExecutionContext&, onnxruntime::SessionScope&, bool const&, unsigned long) (https://localhost:8000/myModule.wasm:wasm-function[8581]:0x3418af)
at myModule.wasm.onnxruntime::ExecuteThePlan(onnxruntime::SessionState const&, gsl::span<int const, 4294967295ul>, gsl::span<OrtValue const, 4294967295ul>, gsl::span<int const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>&, std::__2::unordered_map<unsigned long, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__2::hash<unsigned long>, std::__2::equal_to<unsigned long>, std::__2::allocator<std::__2::pair<unsigned long const, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>>>> const&, onnxruntime::logging::Logger const&, bool const&, bool, bool) (https://localhost:8000/myModule.wasm:wasm-function[8141]:0x2d9b2a)
at myModule.wasm.onnxruntime::utils::ExecuteGraphImpl(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager const&, gsl::span<OrtValue const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>&, std::__2::unordered_map<unsigned long, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>, std::__2::hash<unsigned long>, std::__2::equal_to<unsigned long>, std::__2::allocator<std::__2::pair<unsigned long const, std::__2::function<onnxruntime::common::Status (onnxruntime::TensorShape const&, OrtDevice const&, OrtValue&, bool&)>>>> const&, ExecutionMode, bool const&, onnxruntime::logging::Logger const&, bool, onnxruntime::Stream*) (https://localhost:8000/myModule.wasm:wasm-function[8140]:0x2d704f)
at myModule.wasm.onnxruntime::utils::ExecuteGraph(onnxruntime::SessionState const&, onnxruntime::FeedsFetchesManager&, gsl::span<OrtValue const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>&, ExecutionMode, OrtRunOptions const&, onnxruntime::logging::Logger const&) (https://localhost:8000/myModule.wasm:wasm-function[8145]:0x2db33c)
at myModule.wasm.onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const, 4294967295ul>, gsl::span<OrtValue const, 4294967295ul>, gsl::span<std::__2::basic_string<char, std::__2::char_traits<char>, std::__2::allocator<char>> const, 4294967295ul>, std::__2::vector<OrtValue, std::__2::allocator<OrtValue>>*, std::__2::vector<OrtDevice, std::__2::allocator<OrtDevice>> const*) (https://localhost:8000/myModule.wasm:wasm-function[19658]:0xa5a58a)
at myModule.wasm.onnxruntime::InferenceSession::Run(OrtRunOptions const&, gsl::span<char const* const, 4294967295ul>, gsl::span<OrtValue const* const, 4294967295ul>, gsl::span<char const* const, 4294967295ul>, gsl::span<OrtValue*, 4294967295ul>) (https://localhost:8000/myModule.wasm:wasm-function[7098]:0x288eaa)
Here is how I create the session:
int num_threads = 2;

// Set up a global thread pool through the C API. Each Ort::GetApi() call
// returns an OrtStatus* that should be checked; checks are omitted here for
// brevity.
OrtThreadingOptions *tp_options = nullptr;
Ort::GetApi().CreateThreadingOptions(&tp_options);
Ort::GetApi().SetGlobalIntraOpNumThreads(tp_options, num_threads);
Ort::GetApi().SetGlobalInterOpNumThreads(tp_options, 1);
Ort::GetApi().SetGlobalSpinControl(tp_options, 0);

OrtEnv *g_env = nullptr;
Ort::GetApi().CreateEnvWithGlobalThreadPools(ORT_LOGGING_LEVEL_WARNING, "Default", tp_options, &g_env);

// Session options: share the global thread pool and load the ORT-format
// model directly from the in-memory buffer.
Ort::SessionOptions sessionOptions;
sessionOptions.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
sessionOptions.DisableCpuMemArena();
sessionOptions.DisableMemPattern();
sessionOptions.DisableProfiling();
sessionOptions.DisablePerSessionThreads();
sessionOptions.SetExecutionMode(ORT_SEQUENTIAL);
sessionOptions.SetIntraOpNumThreads(num_threads);
sessionOptions.SetInterOpNumThreads(1);
sessionOptions.AddConfigEntry("session.load_model_format", "ORT");
sessionOptions.AddConfigEntry("session.use_ort_model_bytes_directly", "1");
sessionOptions.AddConfigEntry("session.intra_op.allow_spinning", "0");
sessionOptions.AddConfigEntry("session.inter_op.allow_spinning", "0");

session_ = Ort::Session(Ort::Env(g_env), modelData.data(), modelData.size(), sessionOptions);
It works fine when I set num_threads to 1.
I’d appreciate any feedback.
Thank you, Mario
Describe the issue
Hello,
I am exploring the use of ONNX, with a particular focus on the ORT model format for web applications. I developed a basic WASM module to perform inference using a UNET-like semantic segmentation model. However, the inference process throws an exception, which I have detailed below. Please note that the same code runs without issues outside of the WASM module.
I built the ONNX runtime for web with the following command:
I built the WASM module with the following command:
When running the inference I get the following error:
With SAFE_HEAP=0:
With SAFE_HEAP=1:
Best regards, Mario
To reproduce
Here is the code I used to test the ORT model:
Urgency
No response
ONNX Runtime Installation
Built from Source
ONNX Runtime Version or Commit ID
v1.17.1
Execution Provider
'wasm'/'cpu' (WebAssembly CPU)