WebAssembly / threads

Threads and Atomics in WebAssembly
https://webassembly.github.io/threads/

LLVM/clang OpenMP - anyone? #186

Open ecs-deutschland-gmbh opened 2 years ago

ecs-deutschland-gmbh commented 2 years ago

Hi,

Has anyone looked into this yet? Leaving aside the question of compiling libomp to wasm, even a minimal example currently crashes in the backend, e.g.:

#pragma optimize off
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include "anet_intrinsic.h"
//#include <omp.h>

int main() {
    #pragma omp parallel num_threads(4)
    {
        #pragma omp critical
        abs(-5555);
    }
}
>>>
C:\LLVM\BUILDLLDB\Debug\bin\clang++ -fopenmp -DDEBUG -mbulk-memory -mmutable-globals -O0 -I.. -I..\.. -I. -Wl,--max-memory=327680 -Wl,--shared-memory -matomics -target wasm32-wasi --sysroot /project/wasi-libc/sysroot-intrin-bulk-memory -Wl,--export-all  -Wl,-L.  -Wl,--entry=__main_void -Wl,--allow-undefined -o omp_dummy.cc.wasm omp_dummy.cc
Common symbols are not yet implemented for Wasm
UNREACHABLE executed at C:\LLVM\llvm\lib\MC\MCWasmStreamer.cpp:164!
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: C:\\LLVM\\BUILDLLDB\\Debug\\bin\\clang++.exe -cc1 -triple wasm32-unknown-wasi -emit-obj -mrelax-all --mrelax-relocations -disable-free -clear-ast-before-backend -main-file-name omp_dummy.cc -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -target-feature +bulk-memory -target-feature +mutable-globals -target-feature +atomics -fvisibility hidden -debugger-tuning=gdb -fcoverage-compilation-dir=C:\\project\\ALP\\WASM\\exc\\exc_newproposal -resource-dir C:\\LLVM\\BUILDLLDB\\Debug\\lib\\clang\\14.0.0 -D DEBUG -I .. -I ..\\.. -I . -isysroot /project/wasi-libc/sysroot-intrin-bulk-memory -internal-isystem /project/wasi-libc/sysroot-intrin-bulk-memory/include/wasm32-wasi/c++/v1 -internal-isystem /project/wasi-libc/sysroot-intrin-bulk-memory/include/c++/v1 -internal-isystem C:\\LLVM\\BUILDLLDB\\Debug\\lib\\clang\\14.0.0\\include -internal-isystem /project/wasi-libc/sysroot-intrin-bulk-memory/include/wasm32-wasi -internal-isystem /project/wasi-libc/sysroot-intrin-bulk-memory/include -O0 -fdeprecated-macro -fdebug-compilation-dir=C:\\project\\ALP\\WASM\\exc\\exc_newproposal -ferror-limit 19 -fmessage-length=120 -fopenmp -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -o C:\\Users\\I\\AppData\\Local\\Temp\\omp_dummy-17eebb.o -x c++ omp_dummy.cc
1.      <eof> parser at end of file
2.      Code generation
 #0 0x00007ff7b0397c2c HandleAbort C:\LLVM\llvm\lib\Support\Windows\Signals.inc:408:0
 #1 0x00007ffc24a18e05 (C:\Windows\SYSTEM32\ucrtbased.dll+0xa8e05)
 #2 0x00007ffc24a1ab29 (C:\Windows\SYSTEM32\ucrtbased.dll+0xaab29)
 #3 0x00007ff7b036d21f llvm::llvm_unreachable_internal(char const *, char const *, unsigned int) C:\LLVM\llvm\lib\Support\ErrorHandling.cpp:215:0
 #4 0x00007ff7afca1ed2 llvm::MCWasmStreamer::emitCommonSymbol(class llvm::MCSymbol *, unsigned __int64, unsigned int) C:\LLVM\llvm\lib\MC\MCWasmStreamer.cpp:165:0
 #5 0x00007ff7b207ad3c llvm::AsmPrinter::emitGlobalVariable(class llvm::GlobalVariable const *) C:\LLVM\llvm\lib\CodeGen\AsmPrinter\AsmPrinter.cpp:580:0
 #6 0x00007ff7ada22e9e llvm::WebAssemblyAsmPrinter::emitGlobalVariable(class llvm::GlobalVariable const *) C:\LLVM\llvm\lib\Target\WebAssembly\WebAssemblyAsmPrinter.cpp:177:0
 #7 0x00007ff7b20745b7 llvm::AsmPrinter::doFinalization(class llvm::Module &) C:\LLVM\llvm\lib\CodeGen\AsmPrinter\AsmPrinter.cpp:1737:0
 #8 0x00007ff7af174176 llvm::FPPassManager::doFinalization(class llvm::Module &) C:\LLVM\llvm\lib\IR\LegacyPassManager.cpp:1503:0
 #9 0x00007ff7af175852 `anonymous namespace'::MPPassManager::runOnModule C:\LLVM\llvm\lib\IR\LegacyPassManager.cpp:1590:0
#10 0x00007ff7af176375 llvm::legacy::PassManagerImpl::run(class llvm::Module &) C:\LLVM\llvm\lib\IR\LegacyPassManager.cpp:542:0
#11 0x00007ff7af16e292 llvm::legacy::PassManager::run(class llvm::Module &) C:\LLVM\llvm\lib\IR\LegacyPassManager.cpp:1682:0
#12 0x00007ff7b0cf0d01 `anonymous namespace'::EmitAssemblyHelper::RunCodegenPipeline C:\LLVM\clang\lib\CodeGen\BackendUtil.cpp:1520:0
#13 0x00007ff7b0cf1f38 `anonymous namespace'::EmitAssemblyHelper::EmitAssembly C:\LLVM\clang\lib\CodeGen\BackendUtil.cpp:1550:0
#14 0x00007ff7b0cece03 clang::EmitBackendOutput(class clang::DiagnosticsEngine &, class clang::HeaderSearchOptions const &, class clang::CodeGenOptions const &, class clang::TargetOptions const &, class clang::LangOptions const &, class llvm::StringRef, class llvm::Module *, enum clang::BackendAction, class std::unique_ptr<class llvm::raw_pwrite_stream, struct std::default_delete<class llvm::raw_pwrite_stream>>) C:\LLVM\clang\lib\CodeGen\BackendUtil.cpp:1714:0
#15 0x00007ff7b7aeead1 clang::BackendConsumer::HandleTranslationUnit(class clang::ASTContext &) C:\LLVM\clang\lib\CodeGen\CodeGenAction.cpp:374:0
#16 0x00007ff7b4b5f538 clang::ParseAST(class clang::Sema &, bool, bool) C:\LLVM\clang\lib\Parse\ParseAST.cpp:178:0
#17 0x00007ff7b1976a07 clang::ASTFrontendAction::ExecuteAction(void) C:\LLVM\clang\lib\Frontend\FrontendAction.cpp:1076:0
#18 0x00007ff7b7ae24e7 clang::CodeGenAction::ExecuteAction(void) C:\LLVM\clang\lib\CodeGen\CodeGenAction.cpp:1108:0
#19 0x00007ff7b19763be clang::FrontendAction::Execute(void) C:\LLVM\clang\lib\Frontend\FrontendAction.cpp:971:0
#20 0x00007ff7b18f5106 clang::CompilerInstance::ExecuteAction(class clang::FrontendAction &) C:\LLVM\clang\lib\Frontend\CompilerInstance.cpp:1030:0
#21 0x00007ff7b1b62648 clang::ExecuteCompilerInvocation(class clang::CompilerInstance *) C:\LLVM\clang\lib\FrontendTool\ExecuteCompilerInvocation.cpp:261:0
#22 0x00007ff7ab8becf4 cc1_main(class llvm::ArrayRef<char const *>, char const *, void *) C:\LLVM\clang\tools\driver\cc1_main.cpp:246:0
#23 0x00007ff7ab8aa410 ExecuteCC1Tool C:\LLVM\clang\tools\driver\driver.cpp:317:0

Can anyone estimate what would have to be implemented to make this work?

Thanks, S.

sbc100 commented 2 years ago

This is really an llvm/clang issue, so perhaps better to file it there.

The issue is that we don't currently support "common" symbols (Common symbols are not yet implemented for Wasm), and I guess OpenMP is using them? I don't think we have any plans to add support for common symbols.

sunfishcode commented 2 years ago

For the immediate error here, "Common symbols are not yet implemented for Wasm", ideally we should figure out why clang is using common symbols for OpenMP and whether that can be changed.

But beyond that, for OpenMP to be actually usable, it will need support for threads too. Since you're using a WASI target: threads for non-Web wasm environments, where Workers aren't available, are not yet designed. The following issues have some discussion:

WebAssembly/threads#8 WebAssembly/threads#95 WebAssembly/threads#138

ecs-deutschland-gmbh commented 2 years ago

Thanks. I did indeed open issue 52714 @llvm-project.

I think that, as a first step, once something using #pragma omp compiles, an embedder could supply its own implementation of the libomp imports (the kmp_* functions). That would also cover creating and managing the worker threads.
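To make the idea of embedder-provided runtime functions concrete, here is a minimal sketch of a serial shim. All names (`shim_fork_call`, `shim_critical`, `shim_end_critical`, `OutlinedFn`, `CriticalLock`) and their simplified signatures are hypothetical; libomp's real entry points (for example `__kmpc_fork_call`, `__kmpc_critical`, `__kmpc_end_critical`) take additional arguments and a compiler-chosen argument list. The sketch assumes the embedder is allowed to ignore the requested thread count and run the region on a team of one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified shape of a compiler-outlined parallel region.
using OutlinedFn = void (*)(std::int32_t *global_tid,
                            std::int32_t *bound_tid,
                            void *shared);

// Hypothetical lock object standing in for the per-name critical lock.
struct CriticalLock {
    std::int32_t depth = 0;  // nonzero while the section is occupied
};

// Serial "fork": run the outlined body once on the calling thread.
// The requested thread count is only a request, so a team of one is used.
void shim_fork_call(std::int32_t /*requested_threads*/, OutlinedFn body,
                    void *shared) {
    std::int32_t global_tid = 0;
    std::int32_t bound_tid = 0;
    body(&global_tid, &bound_tid, shared);
}

// With a single thread, entering a critical section only needs to check
// that the section is not entered recursively.
void shim_critical(CriticalLock &lock) {
    assert(lock.depth == 0);
    ++lock.depth;
}

void shim_end_critical(CriticalLock &lock) {
    assert(lock.depth == 1);
    --lock.depth;
}

// Hand-written equivalent of the outlined region from the reproducer.
struct Shared {
    CriticalLock lock;
    int calls = 0;
};

void outlined_region(std::int32_t *, std::int32_t *, void *shared) {
    auto *s = static_cast<Shared *>(shared);
    shim_critical(s->lock);
    s->calls += 1;  // stands in for the abs(-5555) call
    shim_end_critical(s->lock);
}

// Drives the shim the way compiler-generated code would.
int run_serial_demo() {
    Shared s;
    shim_fork_call(4, outlined_region, &s);
    return s.calls;
}
```

Calling `run_serial_demo()` executes the region body once. A shim of this kind would only get the program to link and run; real parallelism still needs worker threads from the embedder.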

Everything else related to atomics should be fine.

However, a big obstacle COULD be if the omp compilation output tries to do something to the stack directly, which of course could be the case for the "more normal" targets, x86 etc.

Regards, S.

penzn commented 2 years ago

I've done something like that before, though it is at a very rough proof-of-concept stage and is blocked on runtime support.

@ecs-commonA you are absolutely right: if the 'kmp' symbols are defined, an OpenMP program should build. The tricky part is defining them to do something meaningful. Normally they are defined by the OpenMP runtime (-DLLVM_ENABLE_RUNTIMES=openmp in the LLVM build) and represent an API for forking off (outlined) worker functions. For running in the browser, it should be possible to export a wrapper around WebWorkers that spawns OpenMP 'threads' from the fork function; this approach is already used to compile some threaded code. For standalone environments it isn't obvious what to do yet.
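The fork-off-an-outlined-function pattern described above can be sketched on a host with `std::thread` standing in for whatever the embedder uses to create workers (WebWorkers in the browser case). The names `fork_join`, `RegionFn`, `Team` and `region_body` are hypothetical and are not part of libomp; the real fork entry point has a different, more general signature.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical, simplified shape of an outlined parallel region.
using RegionFn = void (*)(std::int32_t tid, void *shared);

// Fork a team of num_threads members, run the body on each, then join.
// The calling thread participates as team member 0.
void fork_join(std::int32_t num_threads, RegionFn body, void *shared) {
    std::vector<std::thread> workers;
    for (std::int32_t tid = 1; tid < num_threads; ++tid) {
        workers.emplace_back(body, tid, shared);
    }
    body(0, shared);
    for (auto &w : workers) {
        w.join();  // implicit barrier at the end of the parallel region
    }
}

// State shared by the team; the mutex plays the role of the critical lock.
struct Team {
    std::mutex critical;
    int entered = 0;
};

// Hand-written equivalent of a region whose body is one critical section.
void region_body(std::int32_t /*tid*/, void *shared) {
    auto *team = static_cast<Team *>(shared);
    std::lock_guard<std::mutex> guard(team->critical);
    ++team->entered;
}

// Runs the region on four team members and reports how many entered.
int run_fork_join_demo() {
    Team team;
    fork_join(4, region_body, &team);
    return team.entered;
}
```

Here `run_fork_join_demo()` counts one entry into the critical section per team member. In a browser embedding, thread creation and the final join are exactly the parts that would have to be rebuilt on top of Workers.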

> However, a big obstacle COULD be if the omp compilation output tries to do something to the stack directly, which of course could be the case for the "more normal" targets, x86 etc.

I am not sure that would be a problem. OpenMP data is passed via heap pointers, and ident structures that guide execution are also heap allocated, I think. I think use of common symbols for critical sections is a bigger issue.

Some further pointers: it would be necessary to get a handle on the OpenMP data structures. They are produced by Clang, so it is important to have compatible structures in Wasm. If you are interested in WebWorkers, possible gotchas include the inability to suspend the main thread and the lack of an explicit join.
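One possible way around the missing join and the non-blocking main thread is a shared countdown that workers decrement and the main thread polls. The sketch below is a host-side illustration only: `JoinCounter` and `run_polling_join_demo` are hypothetical names, `std::thread` stands in for workers, and a browser main thread would return to its event loop between polls rather than call `yield`.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Countdown shared between the workers and the thread that waits for them.
struct JoinCounter {
    std::atomic<int> remaining;

    explicit JoinCounter(int workers) : remaining(workers) {}

    // Called by each worker as its last action.
    void arrive() { remaining.fetch_sub(1, std::memory_order_release); }

    // Non-blocking check, safe to call from a thread that must not suspend.
    bool done() const {
        return remaining.load(std::memory_order_acquire) == 0;
    }
};

int run_polling_join_demo() {
    const int workers = 3;
    JoinCounter join(workers);
    std::atomic<int> work_done{0};

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&join, &work_done] {
            work_done.fetch_add(1, std::memory_order_relaxed);
            join.arrive();
        });
    }

    // The "main thread" never blocks: it polls and yields between checks.
    while (!join.done()) {
        std::this_thread::yield();
    }
    int result = work_done.load(std::memory_order_relaxed);

    // Only needed to clean up the std::thread objects in this host sketch.
    for (auto &t : pool) {
        t.join();
    }
    return result;
}
```

The release/acquire pairing on the counter makes each worker's updates visible once `done()` returns true, so the result is read before the cleanup `join` calls.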

Are you interested in implementing an OpenMP prototype using WebWorkers (the Web/JS use case)? I can try to help. Standalone might be possible in the future (see the issues linked above).