emscripten-core / emsdk

Emscripten SDK
http://emscripten.org
Other
2.96k stars 676 forks source link

(Regression from 3.1.42 to 3.1.47) Undefined symbol emscripten_memcpy_big #1292

Closed bbrk24 closed 10 months ago

bbrk24 commented 10 months ago

This workflow failed, but this one passed. The only difference is the emsdk version (3.1.47 in the failing one, 3.1.42 in the passing one). The error in the former reads:

error: undefined symbol: emscripten_memcpy_big (referenced by root reference (e.g. compiled C/C++ code)) warning: To disable errors for undefined symbols use -sERROR_ON_UNDEFINED_SYMBOLS=0 warning: _emscripten_memcpy_big may need to be added to EXPORTED_FUNCTIONS if it arrives from a system library Error: Aborting compilation due to previous errors

sbc100 commented 10 months ago

IIRC that symbol was removed in #20144. Did you rebuild all of your object files and libraries?

sbc100 commented 10 months ago

From the workflow you shared I can't seem to see the emcc command that is failing. Can you share the full failing command? (It might be good idea to update the workflow to echo to commands, since it makes it easier to debug).

bbrk24 commented 10 months ago

This is the YAML file that describes the workflow. It invokes this shell script with no arguments, which calls emcc twice in parallel:

            emcc ../src/*.cpp "${common_emcc_args[@]}" -O3 -o worker.js &
            emcc ../src/*.cpp "${common_emcc_args[@]}" -Oz -o ldworker.js
            wait

common_emcc_args is an array with the following flags:

-WCL4 -Wnon-gcc -Wimplicit-fallthrough
-fno-rtti -fno-exceptions
-sEXPORTED_RUNTIME_METHODS=ccall -sASYNCIFY -sASYNCIFY_IMPORTS='send_debug_info'
--pre-js pre.js -sINCOMING_MODULE_JS_API='preInit,onRuntimeInitialized,noExitRuntime' --js-library library.js
-flto -DNDEBUG --closure 1 --closure-args='--emit_use_strict'
sbc100 commented 10 months ago

I can't see how any reference to emscripten_memcpy_big could remain in 3.1.47.

Are there is references to this symbol in your library.js?

Can you try building with EMCC_DEBUG=1 set and then attach the result? (e.g. prefix the failing emcc command with EMCC_DEBUG=1)

bbrk24 commented 10 months ago

library.js is itself the result of compilation, but here's what it looks like:

  mergeInto(LibraryManager.library, {
    send_thread_count: function(thread_count) {
      return postMessage([3, thread_count]);
    },
    send_debug_info: (thread_number, stack, stack_depth, y, x, instruction) => {
      return Asyncify.handleAsync(() => {
        postMessage([
          3,
          [
            thread_number,
            {
              'x': x,
              'y': y,
              'stack': Array.prototype.map.call(HEAP32.slice(stack >> 2,
            (stack >> 2) + stack_depth),
            function(x) {
                return x << 8 >> 8;
              })
            }
          ]
        ]);
        return new Promise((r) => {
          return this['__trilangle_resolve'] = r;
        });
      });
    }
  });

I'll try the EMCC_DEBUG thing in a moment.

bbrk24 commented 10 months ago

I created a new branch emsdk-3.1.47 to test this. You can see the run with EMCC_DEBUG=1 set (for both invocations of emcc) at https://github.com/bbrk24/Trilangle/actions/runs/6661028194/job/18103200555 (though at the time of writing it hasn't finished).

sbc100 commented 10 months ago

So we can see emscripten_memcpy_big being imported by the wasm module so it looks like a reference to it really is coming from native code, but I'm fairly sure its not coming from the emscripten stdlibs.

Can you try again without the EMCC_DEBUG=1 but with -Wl,--trace-symbol=emscripten_memcpy_big added the compiler flags?

You might also want to add set -x to the top of your shell script so you can see the commands as they are run.

bbrk24 commented 10 months ago

Done: https://github.com/bbrk24/Trilangle/actions/runs/6661196326/job/18103671807

Edit: It is coming from the stdlib!

/opt/hostedtoolcache/emscripten/latest/x64/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/lto/libc.a(emscripten_memcpy.o): reference to emscripten_memcpy_big
sbc100 commented 10 months ago

So that stdlib must be stale somehow. Perhaps it was built with an older version of emsdk?

It looks like emsdk does not include prebuilt versions of the lto libraries and instead they get built automatically at link time when they are first needed.

Can you reproduce this locally on your machine?

sbc100 commented 10 months ago

Do you know how /opt/hostedtoolcache/emscripten/latest/x64 is pupulated? Is it just a normal emsdk install?

bbrk24 commented 10 months ago

No, I have not been able to reproduce this locally. The lto libraries take a while to build so I cache them between runs (with actions/cache@v3). I guess the cache pulled in an older version of the library with a newer emsdk?

sbc100 commented 10 months ago

I'm guessing is the "cache" step in your workflow.. looks like maybe its restoring that lto directory from a previous build, which would explain the issue:

Run actions/cache@v3
  with:
    path: /opt/hostedtoolcache/emscripten/latest/x64/upstream/emscripten/cache/sysroot/lib/wasm32-emscripten/lto
    key: libs-cdbdee655701e94c2a9bbc9ed2a62d4b03a0c53eb29d4ce481b97465c72db0ae
    restore-keys: libs-

    enableCrossOsArchive: false
    fail-on-cache-miss: false
    lookup-only: false
  env:
    EMSDK: /opt/hostedtoolcache/emscripten/latest/x64
    EM_CONFIG: /opt/hostedtoolcache/emscripten/latest/x64/.emscripten
    EMSDK_NODE: /opt/hostedtoolcache/emscripten/latest/x64/node/16.20.0_64bit/bin/node
sbc100 commented 10 months ago

I guess you need to somehow include the sdk version in the cache key

bbrk24 commented 10 months ago

I'll note that the key is:

key: libs-${{ hashFiles('~/work/Trilangle/Trilangle/emsdk/upstream/emscripten/system/lib/**.c') }}

So any change to any C file (though I guess not the headers) in the lib folder would result in a different cache key.

bbrk24 commented 10 months ago

The problem likely lies in the restore keys. My logic behind that was that even if the library had changed it wouldn't have to recompile from scratch, but I guess emscripten blindly trusted the existing LTO folder.

My next question is, how did I not run into that same problem when upgrading from 3.1.42 to 3.1.47 locally? I didn't touch the LTO folder.

sbc100 commented 10 months ago

When you upgrade emsdk it completely blows away the existing cache. (In fact it blows away the entire "upstream/" directly which contains "upstream/emscripten/cache")

sbc100 commented 10 months ago

Rather than hashing any files I would just compare the emsdk version. Its not safe to share prebuilt libraries between sdk versions.

Seems like this issue is solved!

bbrk24 commented 10 months ago

Getting the EMSDK version into the key may be easier said than done, but that seems like the answer.