emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler

LLVM ERROR: Cannot select... #5070

Closed phraemer closed 7 years ago

phraemer commented 7 years ago

With the incoming branch I'm getting a pretty useless error which I can't even pinpoint what code it's coming from, even with llvm-dis

So I tried switching to the upstream LLVM by building it myself and changing LLVM_ROOT in ~/.emscripten to point to it.

My project is CMake based. I added -s BINARYEN=1 to CMAKE_CXX_FLAGS and -s BINARYEN=1 -s \"BINARYEN_METHOD='interpret-binary'\" to CMAKE_EXE_LINKER_FLAGS

The .bc file seems to be generated fine, but llc then fails with the error LLVM ERROR: Function addresses with offsets not supported.

The call to llc generated by emscripten looks like llc myapp.bc -march=wasm32 -filetype=asm -asm-verbose=false -o path_to_temp.wb.s -thread-model=single -combiner-global-alias-analysis=false -enable-emscripten-sjlj

Through some experimentation I found that dropping -thread-model=single gave me this error message...

LLVM ERROR: Cannot select: t42: i32,ch = AtomicLoad<Volatile LD1[bitcast (i32* @_ZGVZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep to i8*)](align=4)> t39, t48
  t48: i32 = WebAssemblyISD::Wrapper TargetGlobalAddress:i32<i32* @_ZGVZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep> 0
    t47: i32 = TargetGlobalAddress<i32* @_ZGVZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep> 0
In function: _ZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEv

I'm using some Boost but all files are compiled as part of the project and linked in statically.

I don't understand what that bitcast is but is it the cause of the issue?

kripken commented 7 years ago

Sounds like this is a current limitation in the LLVM wasm backend. I am guessing that "Function addresses with offsets not supported" indicates it can't handle offsetting a function address in the data section.

(The issues with atomics are expected: there is no atomics support yet, which is why the thread model is set to single, to get rid of them. Changing that can lead to errors like the one you saw.)
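On the bitcast in that error: the `_ZGV` prefix is the Itanium C++ ABI mangling for a guard variable, which protects the one-time initialization of a function-local static. With thread-safe statics enabled, the compiler typically checks the first byte of the guard with an atomic load, which is likely why an `AtomicLoad` of the guard cast to `i8*` shows up once the thread model is no longer single. Here is a minimal sketch of the source pattern; `Widget`, `get_static_widget`, `g_init_count` and `count_initializations` are hypothetical names, not the Boost code:

```cpp
#include <cassert>

// Counts constructor runs; hypothetical, for illustration only.
static int g_init_count = 0;

// Hypothetical type whose constructor has a side effect, so the static
// below needs dynamic initialization at first use.
struct Widget {
    int value;
    explicit Widget(int v) : value(v) { ++g_init_count; }
};

// A function-local static. The compiler emits a guard variable for `w`
// and checks it on every call to decide whether the constructor still
// has to run.
Widget& get_static_widget() {
    static Widget w(42);
    return w;
}

// Calls the accessor twice and returns how many times the constructor ran.
int count_initializations() {
    Widget& a = get_static_widget();
    Widget& b = get_static_widget();
    assert(&a == &b);  // both calls return the same object
    return g_init_count;
}
```

Compiling a function like this to LLVM IR and running llvm-dis on the result should show the guard check, though the exact IR depends on the compiler version and flags.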

cc @sunfishcode, @dschuff for the LLVM wasm backend. Not sure if they prefer filing bugs here or there, btw?

kripken commented 7 years ago

Regarding

With the incoming branch I'm getting a pretty useless error which I can't even pinpoint what code it's coming from, even with llvm-dis

it would be good to figure that out, as while the LLVM wasm backend is still in development, there are no major bugs known about the default code generation path. Do you have a testcase showing the issue?

dschuff commented 7 years ago

If you're using the wasm backend with emscripten, it probably makes sense to file bugs here, since any bugs could be related to LLVM itself or to LLVM's integration with emscripten (which is where most of the current known bugs and limitations are anyway). @phraemer chatted briefly with us on IRC, but to recap: we'd be very interested in seeing how function addresses with offsets are getting generated. A bitcode reproducer would likely be helpful; a C++ reproducer would be even better. Also, the error from changing the thread model away from single is expected (we use the single thread model because we can't support atomic operations yet) and is unrelated to the original issue.

phraemer commented 7 years ago

I'm running bugpoint right now (since yesterday!). Will try hard to get this identified.

phraemer commented 7 years ago

it would be good to figure that out, as while the LLVM wasm backend is still in development, there are no major bugs known about the default code generation path. Do you have a testcase showing the issue?

I'm working on isolating this case. But just to give you an idea, here's the error I get (non-wasm emscripten build):

  %85 = extractvalue { i33, i1 } %83, 0
unhandled instruction

Full context...

DEBUG:root:LLVM => JS
DEBUG:root:emscript: llvm backend: /Users/james/Documents/work/wasm/emsdk/clang/fastcomp/build_incoming_64/bin/llc /var/folders/yl/vyrpz7vj1zvbw7dmv_cqbksw0000gn/T/tmprVHv5u/myapp.bc -march=js -filetype=asm -o /var/folders/yl/vyrpz7vj1zvbw7dmv_cqbksw0000gn/T/emscripten_temp/tmpG9fBXn.4.js -emscripten-global-base=8 -emscripten-stack-size=5242880 -O2
  %85 = extractvalue { i33, i1 } %83, 0
unhandled instruction
UNREACHABLE executed at /Users/james/Documents/work/wasm/emsdk/clang/fastcomp/src/lib/Target/JSBackend/NaCl/PromoteIntegers.cpp:631!
0  llc                      0x000000010df4ab88 llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 40
1  llc                      0x000000010df49d36 llvm::sys::RunSignalHandlers() + 86
2  llc                      0x000000010df4b229 SignalHandler(int) + 361
3  libsystem_platform.dylib 0x00007fffa26debba _sigtramp + 26
4  llc                      0x000000010e4116d2 llvm::NaNL + 79195
5  libsystem_c.dylib        0x00007fffa2565420 abort + 129
6  llc                      0x000000010deeb1f7 llvm::llvm_unreachable_internal(char const*, char const*, unsigned int) + 471
7  llc                      0x000000010e019ccc (anonymous namespace)::PromoteIntegers::runOnModule(llvm::Module&) + 11660
8  llc                      0x000000010db56ffe llvm::legacy::PassManagerImpl::run(llvm::Module&) + 766
9  llc                      0x000000010d35e452 compileModule(char**, llvm::LLVMContext&) + 11634
10 llc                      0x000000010d35b58b main + 331
11 libdyld.dylib            0x00007fffa24d1255 start + 1
Stack dump:
0.  Program arguments: /Users/james/Documents/work/wasm/emsdk/clang/fastcomp/build_incoming_64/bin/llc /var/folders/yl/vyrpz7vj1zvbw7dmv_cqbksw0000gn/T/tmprVHv5u/myapp.bc -march=js -filetype=asm -o /var/folders/yl/vyrpz7vj1zvbw7dmv_cqbksw0000gn/T/emscripten_temp/tmpG9fBXn.4.js -emscripten-global-base=8 -emscripten-stack-size=5242880 -O2
1.  Running pass 'Promote integer types which are illegal in PNaCl' on module '/var/folders/yl/vyrpz7vj1zvbw7dmv_cqbksw0000gn/T/tmprVHv5u/myapp.bc'.
DEBUG:root:  emscript: llvm backend took 8.12070608139 seconds
Traceback (most recent call last):
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/em++", line 16, in <module>
    emcc.run()
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/emcc.py", line 1673, in run
    final = shared.Building.emscripten(final, append_ext=False, extra_args=extra_args)
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/tools/shared.py", line 1963, in emscripten
    call_emscripten(cmdline)
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/emscripten.py", line 1852, in _main
    temp_files.run_and_clean(lambda: main(
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/tools/tempfiles.py", line 78, in run_and_clean
    return func()
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/emscripten.py", line 1857, in <lambda>
    DEBUG=DEBUG,
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/emscripten.py", line 1758, in main
    temp_files=temp_files, DEBUG=DEBUG)
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/emscripten.py", line 91, in emscript
    funcs, metadata, mem_init = get_and_parse_backend(infile, settings, temp_files, DEBUG)
  File "/Users/james/Documents/work/wasm/emsdk/emscripten/incoming/emscripten.py", line 160, in get_and_parse_backend
    backend_output = open(temp_js).read()
IOError: [Errno 2] No such file or directory: '/var/folders/yl/vyrpz7vj1zvbw7dmv_cqbksw0000gn/T/emscripten_temp/tmpG9fBXn.4.js'
make[2]: *** [myapp.js] Error 1
make[1]: *** [CMakeFiles/myapp.dir/all] Error 2
make: *** [all] Error 2

kripken commented 7 years ago

Ok, that extractvalue error looks like a limitation of the NaCl legalization passes that the asm.js backend uses. It would be interesting to know what kind of source code leads to clang emitting that, as I've never seen it before. But in any case, to fix this, we'd need to improve the ExpandStructRegs and/or PromoteIntegers passes there (ExpandStructRegs should be lowering that extractvalue into something simpler, so the first question is why it isn't).

phraemer commented 7 years ago

@kripken @dschuff I can send you a copy of the bc file. Can you DM me your email address? I'm on Twitter as https://twitter.com/jamestheswift

kripken commented 7 years ago

I'd need the source code to better understand what's going on here, I'm afraid - the question is why clang is generating that IR in the first place. Perhaps you can reduce it to a small source sample you can share?

phraemer commented 7 years ago

I finally figured out how to source reduce the problem. It's in code using safe_math.h from the Chromium project.

Am I wrong, or should adding -s LINKABLE=1 prevent -globaldce from being added to the call to opt?

I created a repo here https://github.com/phraemer/emscripten_bug

Run ./compile.sh to produce fun.o, then run ./opt_llc.sh fun.o to reproduce the error.

Hope this helps!

kripken commented 7 years ago

globaldce is still safe to run; what changes is that we don't internalize first. That means it can only clean up symbols that are already internal (static globals in C).

Thanks for the testcase. OK, it looks like the issue is in __builtin_add_overflow, which clang apparently lowers directly into an extractvalue of an i33. Since asm.js and wasm don't have support for that anyhow, you might as well disable USE_OVERFLOW_BUILTINS, which avoids the crash. There should be no downside to doing that.
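To illustrate what code looks like when it does not go through the builtin, here is a minimal sketch of an overflow-checked addition written with plain range checks. It is not Chromium's implementation; `checked_add` is a hypothetical helper, and the assumption is only that such code never calls `__builtin_add_overflow`:

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: returns true and stores a + b in *result when the
// sum fits in int; returns false and leaves *result untouched on overflow.
// The range checks run before the addition, so the addition itself never
// overflows and no integer type wider than int is needed.
bool checked_add(int a, int b, int* result) {
    assert(result != nullptr);
    if (b > 0 && a > std::numeric_limits<int>::max() - b) return false;
    if (b < 0 && a < std::numeric_limits<int>::min() - b) return false;
    *result = a + b;
    return true;
}
```

A caller would use it as `int sum; if (checked_add(x, y, &sum)) { ... }`, handling the overflow case in the else branch.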

phraemer commented 7 years ago

That indeed avoids the issue. Thank you all for your help and the work you are doing.

It's greatly appreciated!

phraemer commented 7 years ago

@kripken not sure if I should open a separate issue for this, but when compiling for WASM with upstream LLVM, as mentioned earlier, I managed to reduce another case.

The following code produces the error Function addresses with offsets not supported:

#include <boost/uuid/uuid_generators.hpp>

void bug() {
    boost::uuids::random_generator();
}

kripken commented 7 years ago

That looks like a bug in the wasm backend, so it's best to report it on the LLVM bug tracker, which has a "WebAssembly" component.

lnxjnky commented 4 years ago

@kripken I am using sdk-fastcomp-tag-1.38.30-64bit with WASM=0 and am hitting a "stack smashing" error with a similar crash on a Chromium project. I can't find how USE_OVERFLOW_BUILTINS should be used. Is the above solution applicable to sdk-fastcomp-tag-1.38.30-64bit as well?

%40 = extractvalue { i33, i1 } %38, 0
  %40 = extractvalue { i33, i1 } %38, 0
*** stack smashing detected ***: <unknown> terminated
#0 0x00005581652cf90a llvm::sys::PrintStackTrace(llvm::raw_ostream&) (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x108190a)
#1 0x00005581652cd826 llvm::sys::RunSignalHandlers() (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x107f826)
#2 0x00005581652cd96c SignalHandler(int) (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x107f96c)
#3 0x00007f442f3758a0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x128a0)
#4 0x00007f442e026f47 gsignal /build/glibc-2ORdQG/glibc-2.27/signal/../sysdeps/unix/sysv/linux/raise.c:51:0
#5 0x00007f442e0288b1 abort /build/glibc-2ORdQG/glibc-2.27/stdlib/abort.c:81:0
#6 0x00007f442e071907 __libc_message /build/glibc-2ORdQG/glibc-2.27/libio/../sysdeps/posix/libc_fatal.c:181:0
#7 0x00007f442e11ce81 __GI___fortify_fail_abort /build/glibc-2ORdQG/glibc-2.27/debug/fortify_fail.c:33:0
#8 0x00007f442e11ce42 (/lib/x86_64-linux-gnu/libc.so.6+0x134e42)
#9 0x00005581654d4438 (anonymous namespace)::PromoteIntegers::runOnModule(llvm::Module&) (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x1286438)
#10 0x0000558164d63c04 llvm::legacy::PassManagerImpl::run(llvm::Module&) (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0xb15c04)
#11 0x000055816452043b compileModule(char**, llvm::LLVMContext&) (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x2d243b)
#12 0x00005581644d99e5 main (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x28b9e5)
#13 0x00007f442e009b97 __libc_start_main /build/glibc-2ORdQG/glibc-2.27/csu/../csu/libc-start.c:344:0
#14 0x000055816451432a _start (emsdk/fastcomp-clang/tag-e1.38.30/build_tag-e1.38.30_64/bin/llc+0x2c632a)

kripken commented 4 years ago

Probably the same solution should work, yes - find out where in the source code the i33 comes from, and avoid it being generated at all.

Inspecting the LLVM IR and finding the i33 can show which function it's in, etc.
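As a rough sketch of that inspection, the helper below scans textual LLVM IR (for example the output of llvm-dis) and reports which function each mention of a given integer type falls in. `find_int_type` is a hypothetical name, and the parsing is deliberately simple: it assumes each function starts on a `define` line, ends at a line containing only `}`, and has an unquoted symbol name.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (function name, IR line) pairs for every
// line inside a function that mentions `type_name`, e.g. "i33".
std::vector<std::pair<std::string, std::string>>
find_int_type(const std::string& ir_text, const std::string& type_name) {
    assert(!type_name.empty());
    // Captures the symbol name from a "define ... @name(" line.
    const std::regex define_re(R"(^define\b.*?@([^\s(]+)\()");
    // Matches the type as a whole token, so "i33" does not match "i330".
    const std::regex type_re("\\b" + type_name + "\\b");

    std::vector<std::pair<std::string, std::string>> hits;
    std::string current_function;
    std::istringstream lines(ir_text);
    std::string line;
    while (std::getline(lines, line)) {
        std::smatch m;
        if (std::regex_search(line, m, define_re)) {
            current_function = m[1].str();  // entering a function
        } else if (line == "}") {
            current_function.clear();       // leaving the function body
        }
        if (!current_function.empty() && std::regex_search(line, type_re)) {
            hits.emplace_back(current_function, line);
        }
    }
    return hits;
}
```

The returned names are mangled; a demangler such as c++filt can map them back to source-level functions.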

In general, though, this is only a problem with the old fastcomp backend. I'd recommend upgrading to a recent release which uses upstream and can handle i33s.