emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.63k stars 3.28k forks source link

Fetch API with pthreads has missing symbols #7490

Open samsinsane opened 5 years ago

samsinsane commented 5 years ago

I've been attempting to port some of our projects to support Emscripten (v1.38.16), and I've started to encounter Uncaught ReferenceErrors when both the worker thread and another thread are allocating at the same time. If a thread is already allocating when the Fetch worker attempts to allocate, it will fail the second conditional here: https://github.com/kripken/emscripten/blob/4bba062e95fb319ae9ba89797c88a0eef3dea1be/system/lib/libc/musl/src/thread/pthread_mutex_lock.c#L7-L9

The Fetch worker will then attempt to call __pthread_mutex_timedlock() which will then result in this error Uncaught ReferenceError: ___pthread_mutex_timedlock is not defined. I found the tools/shared.py script which generates the fetch-worker.js file and began adding all of the missing functions (as the reference errors came up) and ended up with:

  funcs_to_import = ['alignUp', 'getTotalMemory', 'stringToUTF8', 'intArrayFromString', 'lengthBytesUTF8', 'stringToUTF8Array', '_emscripten_is_main_runtime_thread', '_emscripten_futex_wait', '_emscripten_futex_wake', '_pthread_self']
  asm_funcs_to_import = ['_malloc', '_free', '_sbrk', '___pthread_mutex_lock', '___pthread_mutex_unlock', '___pthread_mutex_timedlock', '___pthread_mutex_trylock', '___timedwait', '___pthread_setcancelstate']

Unfortunately, _pthread_self() requires __pthread_ptr which I believe requires the usage of pthread_create()? It seems like the solution is to make the worker thread more like a pthread as suggested in #7024?

samsinsane commented 5 years ago

I've just updated to v1.38.30 and this is still occurring, is there anything I can do to help with this?

kripken commented 5 years ago

We can definitely use help here. There basically isn't a full-time maintainer of the Fetch API.

For this issue, I think rewriting the fetch worker to be just a normal pthread would be a good way to do, @juj proposed that.

juj commented 5 years ago

Do you have a test case for the failing behavior?

samsinsane commented 5 years ago

I kind of have a test case (I just threw it together):

#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <emscripten/fetch.h>

void* threadFunc(void *userData)
{
    for (int i = 0; i < 1000; i++)
    {
        emscripten_fetch_attr_t attr;
        emscripten_fetch_attr_init(&attr);
        strcpy(attr.requestMethod, "GET");
        attr.attributes = EMSCRIPTEN_FETCH_LOAD_TO_MEMORY | EMSCRIPTEN_FETCH_SYNCHRONOUS;
        emscripten_fetch_t *fetch = emscripten_fetch(&attr, "file.dat"); // Blocks here until the operation is complete.
        if (fetch->status == 200)
        {
            printf("Finished downloading %llu bytes from URL %s.\n", fetch->numBytes, fetch->url);
            // The data is now available at fetch->data[0] through fetch->data[fetch->numBytes-1];
        }
        else
        {
            printf("Downloading %s failed, HTTP failure status code: %d.\n", fetch->url, fetch->status);
        }
        emscripten_fetch_close(fetch);
    }

    return NULL;
}

int main(int argc, char *argv[])
{
    pthread_t threads[8];
    for (int i = 0; i < 8; i++)
    {
        printf("Starting up thread #%d\n", i + 1);
        pthread_create(&threads[i], NULL, threadFunc, NULL);
    }

    return 0;
}

I'm really not sure what the VS plugin does, but I had to modify shared.py as described in #8454 and then compile with:

emcc -lpthread -O0 -g -s FETCH=1 -s USE_PTHREADS=1 -s WASM=0 -s TOTAL_MEMORY=256MB -s ABORTING_MALLOC=0 test.cpp -o test.html

It might take awhile, but it will eventually trigger the error, Uncaught ReferenceError: ___pthread_mutex_timedlock is not defined. As I mentioned above, it's not as simple as just adding the missing function because there are several and it will eventually require __pthread_ptr. I think there are still other paths that trigger errors too, but I don't really recall as it was a few months ago that I went through all the functions.

rianhunter commented 5 years ago

@samsinsane Short term solution would be to add the missing functions to make_fetch_worker() and to the internalized functions in llvm_opt() I encountered this myself while testing a change

rianhunter commented 5 years ago

Btw @kripken @samsinsane I don't mind doing the grunt work of implementing the pthread code to replace make_fetch_worker(), if you could just point me in the right direction I can churn out the correct code relatively quickly as I have free cycles this week.

samsinsane commented 5 years ago

Short term solution would be to add the missing functions to make_fetch_worker() and to the internalized functions in llvm_opt() I encountered this myself while testing a change

How did you handle __pthread_ptr?

kripken commented 5 years ago

Btw @kripken @samsinsane I don't mind doing the grunt work of implementing the pthread code to replace make_fetch_worker(),

That would be great! @juj knows best, but in general, I think the idea is just to remove all the make_fetch_worker code, and instead in system/lib/fetch/emscripten_fetch.cpp to just spin up a thread for the stuff the old fetch worker used to do, using normal pthreads calls.

rianhunter commented 5 years ago

Sadly my other diff seems like it will take up more time than i thought so I would not be able to do this this week.

rianhunter commented 5 years ago

Hey @samsinsane, I was able to get your test case working by adding '_pthread_mutextimedlock', 'pthread_mutex_trylock' to make_fetch_worker() but I think you could have easily done that. I'm not encountering the __pthread_ptr issue, I'm assuming there is something about that that prevents the quick fix from working in that case. If that's the case, I think the only real action item is to implement the worker thread in C.

samsinsane commented 5 years ago

I was able to get your test case working by adding '_pthread_mutextimedlock', 'pthread_mutex_trylock' to make_fetch_worker() but I think you could have easily done that.

Yes, I mentioned this in my first post but I'm not sure I was very clear about that.

I'm not encountering the __pthread_ptr issue, I'm assuming there is something about that that prevents the quick fix from working in that case.

Yes, the quick fix doesn't actually fix the project at work because there are specific branches internally to those missing functions that end up calling pthread_self() which just returns __pthread_ptr. I don't really know what that object contains but the contents of it are used in the mutex functions - or it could be part of another synchronization primitive, but I'm pretty sure it was the mutex.

I'm sure you can appreciate that I can't just provide the source code to the work project, and that trying to simplify a non-trivial system doesn't always provide the same results. The work project triggers the error in less than a second without the 10 or so missing functions being added, and with those functions added it can run for several minutes before triggering the error. It's really just a race condition and I can't give an exact sample for that.

If that's the case, I think the only real action item is to implement the worker thread in C.

Agreed, unfortunately I don't really understand the system enough to know how that would work. I attempted to write my own cut down XHR wrapper before posting this issue but sadly, there's a very clear gap in my understanding.

rianhunter commented 5 years ago

If that's the case, I think the only real action item is to implement the worker thread in C.

Agreed, unfortunately I don't really understand the system enough to know how that would work. I attempted to write my own cut down XHR wrapper before posting this issue but sadly, there's a very clear gap in my understanding.

Just to set expectations, I probably won't be able to get to this any time soon anymore

VirtualTim commented 5 years ago

Possibly related to #8242? (Or the fix might be the same)

samsinsane commented 5 years ago

@VirtualTim Yeah, making it a real pthread is the fix - or not calling any C functions especially malloc.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

samsinsane commented 4 years ago

I don't believe this is fixed.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

sbc100 commented 3 years ago

I don't think anything has changed here.