Closed jnickg closed 1 year ago
I would have guessed #18479 might be related. But based on the versions you mention, perhaps not.
Maybe @sbc100 has an idea?
Otherwise, bisecting might find a quick answer here, as this sounds like a regression.
I bisected the releases versions to figure out the details I originally posted, but should be able to bisect commits as well. Unfortunately due to a separate issue (still ruling out errors on my end) it will have to be manual, so it might take a day or two. I'll follow up ASAP!
How are you creating the your pthreads? Are you passing a pthread_attr_t? If so, how are you initializing it?
The idea with https://github.com/emscripten-core/emscripten/blob/516060b/system/lib/pthread/emscripten_thread_init.c#L19 is that it sets __default_stacksize
exactly once at startup when the main thread starts. It should be set to the value of DEFAULT_PTHREAD_STACK_SIZE..
If you open up your generated JS file, can you take a look at ___emscripten_init_main_thread_js
and see what is being passed as the final argument to __emscripten_thread_init
Oh I see that problem is that you are never running the code on the main browser thread.. I will fix https://github.com/emscripten-core/emscripten/blob/516060b/system/lib/pthread/emscripten_thread_init.c#L19 accordingly
How are you creating the your pthreads? Are you passing a pthread_attr_t? If so, how are you initializing it?
We create threads in a couple ways. One, using std::thread
which was no interface for explicitly setting stack size. In those cases, we rely entirely on Emscripten plumbing to make sure those threads are started with the default stack size. In other cases an in-house library we depend on initializes threads using the pthread interface, and our thread-start code looks like this:
void MyThreadWrapper::Start(uint32_t stackSize) {
const uint32 kMinStackSize = 512 * 1024;
int result;
pthread_attr_t attrs;
result = pthread_attr_init (&attrs);
if (result != 0) {
throw std::runtime_error("failed to init pthread attributes");
}
if (stackSize != 0) {
pthread_attr_setstacksize (&attrs,
MAX(stackSize,
kMinStackSize));
}
result = pthread_create (&mMyThreadHandle, &attrs, MyThreadWorkerLoop, (void*)this);
pthread_attr_destroy (&attrs);
if (result != 0) {
throw std::runtime_error("failed to create thread!");
}
mMyThreadIsStarted = true;
}
In the failing threads, we pass stackSize=0
into this function, meaning we do not explicitly set attributes, and rely on system defaults. There may be other subsystems using this code whose and it is a large code base with many internal customers, so I want to avoid changing this unless absolutely necessary. It seems like, if defaults are honored, we shouldn't need to change this. Let me know if that doesn't sound right 👍
I found the problem, and the fix is in #19577.
is there any chance you could patch in #19577 to verify that it fixed both your issues?
Will do, I will get back to you later today with results.
We're using EMSDK in our build system, via some tools that basically call emsdk install $EMSDK_VERSION
and then emsdk activate $EMSDK_VERSION
, then calling source path/to/emsdk/submodule/emsdk_env.sh
before building. Can't seem to get this working in a way that uses the branch in #19577...
Here's what I tried:
sdk-upstream-main-64bit
as called out here, so that our pre-build steps effectively call emsdk install/activate sdk-upstream-main-64bit
cd path/to/emsdk/submodule/emscripten/main
git checkout -t origin/fix_pthread_default_stack_size
emsdk install
and emsdk activate
.This seemed to revert back to main, so I tried the same steps but creating a temporary main
branch that pointed to origin/fix_pthread_default_stack_size
. Did not fix the issue, but I am not confident the patch was actually applied.
Tried this process again, this time creating a commit that added a #error
inside emscripten_thread_init.c to see if it was ever compiled. It is not.
Probably user error, but I am confident that our side of things works since this is what I used to bisect release commits in chromium/emscripten-releases
. So the problem presumably exists somewhere between setting the right value for EMSDK_VERSION
and making sure the emscripten/main
submodule is actually recompiled on the correct branch.
Any pointers on what I might be doing wrong?
Also, based on the bisect results above, do you still expect that #19577 will fix both issues? Just wanted to ask explicitly
Yes I think #19577 should solve you issues.
Can you trying running emcc --clear-cache
before testing?
Hmm, looks like this got closed as verified somehow. I guess I can open a new issue we we can reopen if this doesn't fix this.
Can you trying running
emcc --clear-cache
before testing?
Can you clarify when this command should be run? Before I call emsdk activate
? Right before I build? Some other time?
We can re-open this issue if needed
Any time after you active the SDK should be fine.
OK, I must be doing something very wrong because I still can't hack in a #error
to verify that the change to _emscripten_thread_init
is being used. But that now that you've merged, I can build against plain-old sdk-upstream-main-64bit
and everything works, whereas it did not before (presumably because the fix wasn't present). So it looks like things are fixed on my end. Does that mean it will go into 3.1.42
? Or will it be a later version?
yes that fix will be in the next release (3.1.42
). You can also try it out right away without waiting by install the tip-of-tree sdk (./emsdk install tot
).
Saw that not long after my last comment, and tested/verified with it. Thanks for all the help with this!!! 🎊
Trying to update a large project to the latest EMSDK version, and I discovered some strange behavior. As late as EMSDK
3.1.34
, the threads instantiated by my wasm module (which is spun up in a web worker) roughly honored flags likeSTACK_SIZE
PKATOTAL_STACK
, andDEFAULT_PTHREAD_STACK_SIZE
. For example, if I setDEFAULT_PTHREAD_STACK_SIZE=2MB
I would see viaPTHREADS_DEBUG
instrumentation that the stack range shown byestablishStackSpace
messages was the correct amount. Sometimes values deviated, I'm guessing due to heuristics or something, but generally they were in the order of magnitude I requested, so it appeared the system was attempting to honor the given flags.Starting in
3.1.35
, threads made through the rawpthread
interface stopped honoring this. Perhaps related to this line of code? I am guessingis_main
evaluates to false on a web worker. Instead the stack size was 64 KiB (perhaps related to this change?) However, threads started up via thestd::thread
interface still had the correct stack sizes.Then, starting in
3.1.39
, for reasons I am still unsure of, threads started viastd::thread
(which BTW has no interface for changing stack size programmatically) ALSO started ignoring stack-size flags. The stack was instead 64 KiB.Full link command (some of them are artifacts from a CMake dependency with public link flags, included below for completeness):
Extra detail: In case it's relevant somehow, we are building through CMake. We were updating from
3.1.8
, and ran into issues with the latest version. To reproduce this issue, I built/ran our module with different EMSDK versions, changing flags as little as possible. We were able to update to3.1.34
and get a stable build with the flags above, only having to remove deprecated flags (which I can dig up if you need them), and changeTOTAL_STACK
toSTACK_SIZE
.You can probably guess, but we use large stack values because our software makes ample use of stack space. Which means that when it's ignored and set to 64 KiB, we quickly crash. Where exactly we crash varies since this is related to threading, but it was usually when locking a mutex, creating another thread, or calling
pthread_getspecific()
/pthread_setspecific()