emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.83k stars 3.31k forks source link

argv handling broken in 3.1.20 #17720

Closed tiran closed 2 years ago

tiran commented 2 years ago

Version of emscripten/emsdk:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.20-git (655ad88e65298014a2a37aae88a5b2f9ab3e36f7)
clang version 16.0.0 (https://github.com/llvm/llvm-project 75767a0f9a926641edbef08e31ec2148ff45da67)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /home/heimes/dev/wasm/emsdk/upstream/bin

The passing of argv is broken again on Emscripten tot-upstream. The issue is similar to https://github.com/emscripten-core/emscripten/issues/17338 . The bug was likely introduced in the last 24 to 48 hours. I'm running CPython builds and tests with Emscripten latest and tot-upstream in my project https://github.com/tiran/cpython-wasm-test . The previous nightly run was fine, today's nightly run failed for builds with pthreads.

tiran commented 2 years ago

git bisect suggests that c3b37c69589bb6b8bb79d41a4e7e27fe8e83d2ec / #17702 broke argv passing for me.

sbc100 commented 2 years ago

Seems likely yes. We try to detect if main is declared with arguments (argc, argv) or not, and somehow I guess we are getting it wrong in your case.

Can you help is debug what that might be? This works for me with simple hello world examples. If you link your program with EMCC_DEBUG=1 it should show 'mainReadsParams': True or 'mainReadsParams': False in the under Metadata in the debug output.

I'm guessing the detection is not working and 'mainReadsParams': False is being reported? But for me I see True, even whne using -sPROXY_TO_PTHREAD -sUSE_PTHREADS. If you could attach the output of the linker that help a lot (you can find that in /tmp/emscripten_temp/emcc-0-base.wasm after linking with EMCC_DEBUG=1)

tiran commented 2 years ago

Sure!

FWIW an MEMORY64 build is also failing the same way.

tiran commented 2 years ago

cpython-log.zip

emscripten:DEBUG: Metadata: {'asmConsts': {},
 'declares': ['exit',
              'abort',
              'getentropy',
              'getaddrinfo',
              'gethostbyname',
              'gethostbyaddr',
              'getprotobyname',
              'getnameinfo',
              '__assert_fail',
              'system',
              'getloadavg',
              'strftime',
              'emscripten_runtime_keepalive_check',
              'environ_sizes_get',
              'environ_get',
              '__syscall_fcntl64',
              '__syscall_ioctl',
              'fd_close',
              'fd_read',
              'fd_write',
              'proc_exit',
              '__syscall_faccessat',
              '__syscall_chdir',
              '__syscall_chmod',
              '__syscall_fchownat',
              '__syscall_openat',
              '__syscall_fstat64',
              '__syscall_dup3',
              '_dlopen_js',
              '_dlsym_js',
              '_dlinit',
              'emscripten_get_now',
              'emscripten_memcpy_big',
              '_tzset_js',
              '_mktime_js',
              '_localtime_js',
              '_gmtime_js',
              '_emscripten_date_now',
              '_emscripten_get_now_is_monotonic',
              'emscripten_get_now_res',
              'emscripten_unwind_to_js_event_loop',
              '__syscall_fchdir',
              '__syscall_fchmod',
              '__syscall_fchmodat',
              '__syscall_newfstatat',
              '__syscall_fchown32',
              '__syscall_fdatasync',
              '__syscall_stat64',
              '__syscall_lstat64',
              'fd_sync',
              '__syscall_ftruncate64',
              '__syscall_getcwd',
              'emscripten_console_error',
              'fd_fdstat_get',
              'emscripten_receive_on_main_thread_js',
              '_emscripten_set_offscreencanvas_size',
              '__emscripten_init_main_thread_js',
              'fd_seek',
              '__syscall_mkdirat',
              '_munmap_js',
              '_msync_js',
              '_mmap_js',
              '__syscall_pipe',
              '__syscall_poll',
              '__syscall_fadvise64',
              '__syscall_fallocate',
              'fd_pread',
              '_emscripten_notify_task_queue',
              'emscripten_check_blocking_allowed',
              '_emscripten_default_pthread_stack_size',
              '__pthread_create_js',
              '__emscripten_thread_cleanup',
              'fd_pwrite',
              '__call_sighandler',
              '__syscall_getdents64',
              '__syscall_readlinkat',
              '__syscall_renameat',
              '__syscall_rmdir',
              '__syscall__newselect',
              '__syscall_statfs64',
              '__syscall_fstatfs64',
              '__syscall_symlink',
              'emscripten_num_logical_cores',
              'emscripten_get_heap_max',
              '__syscall_truncate64',
              '__syscall_unlinkat',
              '__syscall_utimensat',
              'emscripten_resize_heap',
              '__syscall_accept4',
              '__syscall_bind',
              '__syscall_connect',
              '__syscall_getpeername',
              '__syscall_getsockname',
              '__syscall_getsockopt',
              '__syscall_listen',
              '__syscall_recvfrom',
              '__syscall_recvmsg',
              '__syscall_sendmsg',
              '__syscall_sendto',
              '__syscall_socket'],
 'emJsFuncs': {'_Py_CheckEmscriptenSignals_Helper': '(void)<::>{ if '
                                                    '(!Module.Py_EmscriptenSignalBuffer) '
                                                    '{ return 0; } try { let '
                                                    'result = '
                                                    'Module.Py_EmscriptenSignalBuffer[0]; '
                                                    'Module.Py_EmscriptenSignalBuffer[0] '
                                                    '= 0; return result; } '
                                                    'catch(e) { return 0; } }',

               '_Py_emscripten_runtime': '(void)<::>{ var info; if (typeof '
                                         "navigator == 'object') { info = "
                                         'navigator.userAgent; } else if '
                                         "(typeof process == 'object') { info "
                                         '= "Node.js '
                                         '".concat(process.version); } else { '
                                         'info = "UNKNOWN"; } var len = '
                                         'lengthBytesUTF8(info) + 1; var res = '
                                         '_malloc(len); if (res) '
                                         'stringToUTF8(info, res, len); return '
                                         'res; }'},
 'exports': ['__wasm_call_ctors',
             '__main_argc_argv',
             '__errno_location',
             'fflush',
             'malloc',
             'free',
             'pthread_self',
             'ntohs',
             'htons',
             'htonl',
             '_emscripten_tls_init',
             'emscripten_builtin_memalign',
             '_emscripten_proxy_main',
             'emscripten_stack_get_base',
             'emscripten_stack_get_end',
             '__funcs_on_exit',
             '__dl_seterr',
             '_emscripten_thread_init',
             '_emscripten_thread_crashed',
             'emscripten_main_thread_process_queued_calls',
             'emscripten_main_browser_thread_id',
             'emscripten_run_in_main_runtime_thread_js',
             'emscripten_dispatch_to_thread_',
             '_emscripten_proxy_execute_task_queue',
             '_emscripten_thread_free_data',
             '_emscripten_thread_exit',
             'emscripten_stack_init',
             'emscripten_stack_set_limits',
             'emscripten_stack_get_free',
             'stackSave',
             'stackRestore',
             'stackAlloc'],
 'features': ['--enable-threads',
              '--enable-bulk-memory',
              '--enable-mutable-globals',
              '--enable-sign-ext'],
 'globalImports': [],
 'invokeFuncs': [],
 'mainReadsParams': False,
 'namedGlobals': {'Py_EMSCRIPTEN_SIGNAL_HANDLING': '3525696',
                  '__start_em_js': '3519384',
                  '__stop_em_js': '3519901'}}
tiran commented 2 years ago

Python's main function is a trivial function that calls another function:

int
main(int argc, char **argv)
{
    return Py_BytesMain(argc, argv);
}

I guess the new heuristic does not take into account that main does not contain any code except for a function call. Here is a simpler reproducer that does not need pthreads. It needs two separate object files to trigger the problem, though.

// main.c
extern int realmain(int, char**);

int main(int argc, char **argv) {
    return realmain(argc, argv);
}
// realmain.c
#include <stdio.h>

int
realmain(int argc, char **argv) {
    printf("argc: %i\n", argc);
    return 0;
}
emcc -O2 -c main.c -o main.o
emcc -O2 -c realmain.c -o realmain.o
emcc -O2 -sEXIT_RUNTIME main.o realmain.o -o test.js
node test.js

The reproducer prints argc: 0 with tot-upstream and argc: 1 with 3.1.19.

sbc100 commented 2 years ago

Thanks I see the bug now!