emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.84k stars 3.31k forks source link

Fail to emscripten_sleep(0) after emscripten_webgl_commit_frame() with -sOFFSCREENCANVAS_SUPPORT -sASYNCIFY -sASSERTIONS #16983

Open chenzx opened 2 years ago

chenzx commented 2 years ago

Please include the following in your bug report:

Version of emscripten/emsdk: emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.10 (c3fe57af0504fe24fc4ba697feb8c204f3c80022) clang version 15.0.0 (https://github.com/llvm/llvm-project 8bc29d14273b05b05d5a56e34c07948dc2c770d3) Target: wasm32-unknown-emscripten Thread model: posix

I'm trying to fix the high CPU load in -sOFFSCREEN_FRAMEBUFFER mode with render pthread, since my render thread use a while loop without separate frame render iteration callback function, i want to first have a try with emscripten_sleep(0) after emscripten_webgl_commit_frame() solution, but failed:

Uncaught RuntimeError: memory access out of bounds
    at _emscripten_thread_sync_code (dynlink.c:239)
    at _emscripten_yield (emscripten_yield.c:45)
    at emscripten_futex_wait (emscripten_futex_wait.c:116)
    at __timedwait_cp (__timedwait.c:93)
    at __timedwait (__timedwait.c:116)
    at __pthread_mutex_timedlock (pthread_mutex_timedlock.c:89)
    at __pthread_mutex_lock (pthread_mutex_lock.c:9)
    at dlmalloc (dlmalloc.c:4638)
    at operator new(unsigned operator new(unsigned long) (new.cpp:67)

I dont know why this happened, since there is only 1 place I use emscripten_sleep(0) after emscripten_webgl_commit_frame() trick to yield to js main thread:

void CommGLContextImpl::commit() {
#ifdef __EMSCRIPTEN__
    EMSCRIPTEN_RESULT r = emscripten_webgl_commit_frame();
    assert(r == EMSCRIPTEN_RESULT_SUCCESS);

    //see https://github.com/emscripten-core/emscripten/issues/10309
    // "explicit swapping will be restricted to the OFFSCREEN_FRAMEBUFFER feature - and in OFFSCREEN_CANVAS, to swap one will need to yield back to browser event loop"
    //also see https://emscripten.org/docs/porting/emscripten-runtime-environment.html
    // "Another option is to use Asyncify which will rewrite the program so that it can return to the browser’s main event loop by just calling emscripten_sleep()."
    emscripten_sleep(0); //needs -sASYNCIFY
#else

I had enabled emcc linker flags: -O3 -g -sOFFSCREENCANVAS_SUPPORT -sASYNCIFY -sASSERTIONS ...

chenzx commented 2 years ago

Try increase async stack size, since default 4096 is too small: -sASYNCIFY_STACK_SIZE=16777216 Then:

Uncaught RuntimeError: memory access out of bounds
    at _emscripten_thread_sync_code (dynlink.c:239)
    at _emscripten_yield (emscripten_yield.c:45)
    at emscripten_futex_wait (emscripten_futex_wait.c:116)
    at emscripten_wait_for_call_v (library_pthread.c:333)
    at emscripten_sync_run_in_main_thread (library_pthread.c:402)
    at emscripten_run_in_main_runtime_thread_js (library_pthread.c:547)
    at ret.<computed> (agilengine_ui_preview.js:6611:35)
    at agilengine_ui_preview.js:1813:22
    at agilengine_ui_preview.js:10545:16
    at withStackSave (agilengine_ui_preview.js:2354:17)
chenzx commented 2 years ago

Increase to ``, got a new abort error:

Uncaught RuntimeError: Aborted(native code called abort())
    at abort (agilengine_ui_preview.js:1768:11)
    at _abort (agilengine_ui_preview.js:8063:7)
    at imports.<computed> (agilengine_ui_preview.js:6575:35)
    at std::__2::__throw_system_error(int, char std::__2::__throw_system_error(int, char const*) (system_error.cpp:291)
    at std::__2::chrono::__libcpp_steady_clock_now() (chrono.cpp:247)
    at std::__2::chrono::steady_clock::now() (chrono.cpp:260)
...

Looks Asyncify not only increase .wasm file's size a lot(event with -O3), but it can not do as it said, "to yield to js main thread in a sync context", I got have to give up... sigh

kripken commented 2 years ago

Code size is indeed a big downside atm. But wasm stack switching will give the benefits of Asyncify without that overhead. It is close to testable in VMs, see #16779

I'm not sure what that abort in chrome::steady_clock::now() is from. Maybe the line imports.<computed> (agilengine_ui_preview.js:6575:35) in the stack trace might help figure it out though.

chenzx commented 2 years ago

The original emscripten_webgl_commit_frame() call position in legacy C++ code's render thread stack is very deep, so i hesitate for 2 things:

1、is that emscripten_sleep(0); need be called at stack top level? 2、Or is that because i'm using a C++ member function, which in other side Asyncify's sample uses C function?

PS: I've refactored the legacy C++ code's render thread while loop to emscripten_set_main_loop_arg(&frame_render_itreration_callback, -1, -1, this); and got -sOFFSCREENCANVAS_SUPPORT mode to run successfully. So I'm not bothering with Asyncify any more.

chenzx commented 2 years ago

The reason I didn't post the agilengine_ui_preview.wasm C++ code lines is it doesn't matter here... I've no time to build a minimal test case for validate Asyncify, and i thought it's not very mature enough

kripken commented 2 years ago

1、is that emscripten_sleep(0); need be called at stack top level? 2、Or is that because i'm using a C++ member function, which in other side Asyncify's sample uses C function?

I don't think either should be a problem, so not sure what the issue was. But anyhow, good that you got things working without Asyncify, that will have a lot less overhead.

chenzx commented 2 years ago

I thought there should only 1 solution to GL rendering in pthread: that is, -sOFFSCREENCANVAS_SUPPORT with emscripten_set_main_loop_arg trick. Which also makes documentation and maintenance easier.

Other 2 methods are either buggy or has performance issues.