Open ravisumit33 opened 1 week ago
@sbc100 @kripken Any thoughts on this?
To be clear, this is not some kind of regression? i.e. you are not claiming that some previous version of emscripten had a faster version of `MAIN_THREAD_EM_ASM`?
As far as I know there are no delays built into the proxying system. The call to `MAIN_THREAD_EM_ASM` should use a `postMessage` to wake the main thread, which should then use a shared memory futex to wake the secondary thread once it's done.
@tlively are you aware of any reason for such a delay?
@ravisumit33 perhaps you could share an example of a simple program that demonstrates the delay you are talking about?
Are you doing anything on the main UI thread that is likely to be blocking it? i.e. are you doing synchronous proxying to your background thread? i.e. can you give more details on what you mean by "I used proxying to proxy events coming from UI to this detached thread"?
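The wake-and-wait handshake described above (one signal to wake the main thread, then a second signal to release the blocked caller) can be modelled roughly as follows. This is only a sketch in standard C++: `MainLoop`, `run_sync`, and `timed_sync_call_us` are hypothetical names, and a condition variable stands in for both `postMessage` and the shared memory futex. It is not Emscripten's implementation. It may still be useful as a template for timing how long the calling thread stays blocked.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for the main thread's event loop. This is a model
// built from standard C++ primitives, not Emscripten's proxying code.
class MainLoop {
 public:
  // Called from a secondary thread: queue the task, wake the loop, then
  // block until the loop reports that the task has finished.
  void run_sync(std::function<void()> task) {
    Call call{std::move(task), false};
    std::unique_lock<std::mutex> lock(mutex_);
    pending_.push(&call);
    cv_.notify_all();                           // plays the role of postMessage
    cv_.wait(lock, [&] { return call.done; });  // plays the role of the futex wait
  }

  // Runs on the "main" thread until stop() is called and the queue is empty.
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      cv_.wait(lock, [&] { return stopping_ || !pending_.empty(); });
      if (pending_.empty()) return;
      Call* call = pending_.front();
      pending_.pop();
      lock.unlock();
      call->task();       // run the proxied work without holding the lock
      lock.lock();
      call->done = true;  // plays the role of the futex wake
      cv_.notify_all();
    }
  }

  // Asks run() to return once all queued work has been processed.
  void stop() {
    std::lock_guard<std::mutex> lock(mutex_);
    stopping_ = true;
    cv_.notify_all();
  }

 private:
  struct Call {
    std::function<void()> task;
    bool done;
  };
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<Call*> pending_;
  bool stopping_ = false;
};

// Returns how long the calling thread was blocked in run_sync, in microseconds.
long long timed_sync_call_us(MainLoop& loop, std::function<void()> task) {
  auto start = std::chrono::steady_clock::now();
  loop.run_sync(std::move(task));
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start).count();
}
```

To try it, run `loop.run()` on one `std::thread`, call `timed_sync_call_us` from another, then call `loop.stop()` and join. In this model the blocked time is roughly the task's own run time plus two wake-ups, so a wait of hundreds of milliseconds with an idle main thread would be unexpected.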
Sorry for not providing complete details about the issue. I am doing an async proxy to the detached thread. My main application thread isn't the main UI thread. I instantiate wasm in a web-worker.
I will try to reproduce this in a simple program. Just to be clear, the delay isn't in proxying from the main application thread to the detached thread. The delay comes in receiving the response from the background (detached) thread, which sends the response back synchronously (`MAIN_THREAD_EM_ASM`).
So you have the following JS contexts:
0: The main browser UI thread
1: The worker that starts your wasm program
2: The `main` function inside a pthread (due to `PROXY_TO_PTHREAD`)
Is that correct?
> I instantiate wasm in a web-worker

I think this aspect could be a clue, since it's not the most common setup. Can you explain a little more about this setup? I assume you create this worker using the normal `new Worker` API and communicate with it solely through `postMessage` to/from the main UI browser thread? (i.e. the main UI browser thread doesn't do any shared memory stuff?)
Yes, the list of JS contexts is correct. I create the worker that instantiates wasm using the `new Worker` API, as you mentioned, and communicate with it through `postMessage` from the main UI browser thread. The main UI browser thread doesn't do any shared memory stuff.
I have highlighted the delay with a red rectangle below. As can be seen, the background thread (the lower one) is just waiting until the main application thread (the upper one) has received the response. Also, the main application thread is idle during the delay.
Instead of using `MAIN_THREAD_EM_ASM` to communicate the results back, can you use `emscripten_proxy_callback`, `emscripten_proxy_callback_with_ctx`, `emscripten_proxy_promise`, or `emscripten_proxy_promise_with_ctx`? I don't know where the pause could be coming from, but these would be more direct methods of reporting the results.
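As a rough illustration of the first of those options, the sketch below proxies an event to the background thread with `emscripten_proxy_callback` and receives the completion on the calling thread, so the background thread never blocks on a reply. `Job`, `handle_event`, `on_done`, `on_cancel`, and `dispatch_event` are hypothetical names, using the system queue is an assumption (a dedicated queue may suit a real application better), and the non-Emscripten branch exists only so the sketch can be compiled and tried natively.

```cpp
#include <pthread.h>

#ifdef __EMSCRIPTEN__
#include <emscripten/proxying.h>
#endif

// Hypothetical job record; the field names are illustrative only.
struct Job {
  int input = 0;
  int output = 0;
  bool delivered = false;
  bool cancelled = false;
};

// Runs on the target (background) thread.
static void handle_event(void* arg) {
  Job* job = static_cast<Job*>(arg);
  job->output = job->input * 2;  // placeholder for the real work
}

// Runs back on the thread that called dispatch_event, after handle_event returns.
static void on_done(void* arg) {
  static_cast<Job*>(arg)->delivered = true;
}

// Runs on the calling thread if the target thread exits before doing the work.
static void on_cancel(void* arg) {
  static_cast<Job*>(arg)->cancelled = true;
}

// Returns true if the job was queued. The job must outlive the callbacks.
bool dispatch_event(pthread_t background_thread, Job* job) {
#ifdef __EMSCRIPTEN__
  // Assumption: the system proxying queue is acceptable for this use.
  return emscripten_proxy_callback(emscripten_proxy_get_system_queue(),
                                   background_thread, handle_event, on_done,
                                   on_cancel, job) != 0;
#else
  // Native fallback for experimentation only: run the work and the
  // completion callback inline on the calling thread.
  (void)background_thread;
  (void)on_cancel;
  handle_event(job);
  on_done(job);
  return true;
#endif
}
```

Under Emscripten, `on_done` should only run once the calling thread gets back to processing its queue, so the caller has to return to its event loop rather than block.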
An example program that demonstrates the issue would certainly be helpful.
I think we should try to get to the bottom of this since `MAIN_THREAD_EM_ASM` shouldn't have this kind of delay. I agree a simple repro case would be great here.
By the way, I see that you have `.worker.js` in your filename. Does that mean you are using a version of emscripten before #21701 landed (this change removed the worker.js output file)? i.e. older than 3.1.58?
edit: I see you are using 3.1.56, would upgrading to the latest version be difficult?
Please include the following in your bug report:
Version of emscripten/emsdk:
I am trying to port my application from a single-threaded to a multi-threaded environment. I cannot bound the maximum number of threads my application needs at a time, so I settled on `PROXY_TO_PTHREAD`. In single-threaded mode, my application used to work like below:
The `main` function does some initialization.
After the `main` function exits, we keep the runtime alive.

To port this architecture into a multi-threaded environment, I used `PROXY_TO_PTHREAD` to create a proxied main thread and kept that thread alive for further processing. I used proxying to proxy events coming from the UI to this detached thread. Once done, this thread called `MAIN_THREAD_EM_ASM` to send the response back to the main application thread. Also, this is the only `MAIN_THREAD_EM_ASM` call that the detached thread makes. The rest is C++ execution without waiting on anything else.

Functionality-wise, this model worked well. But when doing performance analysis I found a degradation of around 200-400 ms. Upon profiling, I could see that the detached thread completed its work in time but then waited around 200-400 ms for the `MAIN_THREAD_EM_ASM` to complete, i.e. for the main application thread to receive the response. Also, the main application thread was completely idle around this time. This can be seen in the screenshot below.

Is this performance degradation expected? Is there any other way I could model my app to avoid it? How can I minimise the time taken by the detached thread to send back the response?