Open samsinsane opened 5 years ago
I've just updated to v1.38.30 and this is still occurring, is there anything I can do to help with this?
We can definitely use help here. There basically isn't a full-time maintainer of the Fetch API.
For this issue, I think rewriting the fetch worker to be just a normal pthread would be a good way to do, @juj proposed that.
Do you have a test case for the failing behavior?
I kind of have a test case (I just threw it together):
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <emscripten/fetch.h>
void* threadFunc(void *userData)
{
for (int i = 0; i < 1000; i++)
{
emscripten_fetch_attr_t attr;
emscripten_fetch_attr_init(&attr);
strcpy(attr.requestMethod, "GET");
attr.attributes = EMSCRIPTEN_FETCH_LOAD_TO_MEMORY | EMSCRIPTEN_FETCH_SYNCHRONOUS;
emscripten_fetch_t *fetch = emscripten_fetch(&attr, "file.dat"); // Blocks here until the operation is complete.
if (fetch->status == 200)
{
printf("Finished downloading %llu bytes from URL %s.\n", fetch->numBytes, fetch->url);
// The data is now available at fetch->data[0] through fetch->data[fetch->numBytes-1];
}
else
{
printf("Downloading %s failed, HTTP failure status code: %d.\n", fetch->url, fetch->status);
}
emscripten_fetch_close(fetch);
}
return NULL;
}
int main(int argc, char *argv[])
{
pthread_t threads[8];
for (int i = 0; i < 8; i++)
{
printf("Starting up thread #%d\n", i + 1);
pthread_create(&threads[i], NULL, threadFunc, NULL);
}
return 0;
}
I'm really not sure what the VS plugin does, but I had to modify shared.py
as described in #8454 and then compile with:
emcc -lpthread -O0 -g -s FETCH=1 -s USE_PTHREADS=1 -s WASM=0 -s TOTAL_MEMORY=256MB -s ABORTING_MALLOC=0 test.cpp -o test.html
It might take awhile, but it will eventually trigger the error, Uncaught ReferenceError: ___pthread_mutex_timedlock is not defined
. As I mentioned above, it's not as simple as just adding the missing function because there are several and it will eventually require __pthread_ptr
. I think there are still other paths that trigger errors too, but I don't really recall as it was a few months ago that I went through all the functions.
@samsinsane Short term solution would be to add the missing functions to make_fetch_worker()
and to the internalized functions in llvm_opt()
I encountered this myself while testing a change
Btw @kripken @samsinsane I don't mind doing the grunt work of implementing the pthread code to replace make_fetch_worker()
, if you could just point me in the right direction I can churn out the correct code relatively quickly as I have free cycles this week.
Short term solution would be to add the missing functions to
make_fetch_worker()
and to the internalized functions inllvm_opt()
I encountered this myself while testing a change
How did you handle __pthread_ptr
?
Btw @kripken @samsinsane I don't mind doing the grunt work of implementing the pthread code to replace make_fetch_worker(),
That would be great! @juj knows best, but in general, I think the idea is just to remove all the make_fetch_worker
code, and instead in system/lib/fetch/emscripten_fetch.cpp
to just spin up a thread for the stuff the old fetch worker used to do, using normal pthreads calls.
Sadly my other diff seems like it will take up more time than i thought so I would not be able to do this this week.
Hey @samsinsane, I was able to get your test case working by adding '_pthread_mutextimedlock', 'pthread_mutex_trylock' to make_fetch_worker()
but I think you could have easily done that. I'm not encountering the __pthread_ptr
issue, I'm assuming there is something about that that prevents the quick fix from working in that case. If that's the case, I think the only real action item is to implement the worker thread in C.
I was able to get your test case working by adding '_pthread_mutextimedlock', 'pthread_mutex_trylock' to make_fetch_worker() but I think you could have easily done that.
Yes, I mentioned this in my first post but I'm not sure I was very clear about that.
I'm not encountering the __pthread_ptr issue, I'm assuming there is something about that that prevents the quick fix from working in that case.
Yes, the quick fix doesn't actually fix the project at work because there are specific branches internally to those missing functions that end up calling pthread_self()
which just returns __pthread_ptr
. I don't really know what that object contains but the contents of it are used in the mutex functions - or it could be part of another synchronization primitive, but I'm pretty sure it was the mutex.
I'm sure you can appreciate that I can't just provide the source code to the work project, and that trying to simplify a non-trivial system doesn't always provide the same results. The work project triggers the error in less than a second without the 10 or so missing functions being added, and with those functions added it can run for several minutes before triggering the error. It's really just a race condition and I can't give an exact sample for that.
If that's the case, I think the only real action item is to implement the worker thread in C.
Agreed, unfortunately I don't really understand the system enough to know how that would work. I attempted to write my own cut down XHR wrapper before posting this issue but sadly, there's a very clear gap in my understanding.
If that's the case, I think the only real action item is to implement the worker thread in C.
Agreed, unfortunately I don't really understand the system enough to know how that would work. I attempted to write my own cut down XHR wrapper before posting this issue but sadly, there's a very clear gap in my understanding.
Just to set expectations, I probably won't be able to get to this any time soon anymore
Possibly related to #8242? (Or the fix might be the same)
@VirtualTim Yeah, making it a real pthread is the fix - or not calling any C functions especially malloc
.
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.
I don't believe this is fixed.
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.
I don't think anything has changed here.
I've been attempting to port some of our projects to support Emscripten (v1.38.16), and I've started to encounter
Uncaught ReferenceError
s when both the worker thread and another thread are allocating at the same time. If a thread is already allocating when the Fetch worker attempts to allocate, it will fail the second conditional here: https://github.com/kripken/emscripten/blob/4bba062e95fb319ae9ba89797c88a0eef3dea1be/system/lib/libc/musl/src/thread/pthread_mutex_lock.c#L7-L9The Fetch worker will then attempt to call
__pthread_mutex_timedlock()
which will then result in this errorUncaught ReferenceError: ___pthread_mutex_timedlock is not defined
. I found thetools/shared.py
script which generates thefetch-worker.js
file and began adding all of the missing functions (as the reference errors came up) and ended up with:Unfortunately,
_pthread_self()
requires__pthread_ptr
which I believe requires the usage ofpthread_create()
? It seems like the solution is to make the worker thread more like a pthread as suggested in #7024?