Open derek-gerstmann opened 1 year ago
...I don't remember :-)
Try changing it and see what breaks?
Okay, will do ... lemme see if things go boom.
Just enabling the flag doesn't seem to be enough to enable relocation ... the LLVM module gets created with the correct halide_use_pic flag, and PIC level flags, but I still get errors emitted from wasm-ld:
wasm-ld: error: relocation R_WASM_MEMORY_ADDR_SLEB cannot be used against symbol
.Lstr; recompile with -fPIC
There may be some other LLVM feature flags that may need to get set to make this work. I'll keep investigating.
This seems relevant: https://bugs.llvm.org/show_bug.cgi?id=42714
Which would imply that the triple flag needs to have the os
set to emscripten
to allow dynamic linking.
I believe that bug was fixed in https://reviews.llvm.org/D153293. I just closes the corresponding github issue: https://github.com/llvm/llvm-project/issues/42059
Note that if you want you could unconditionally enable PIC code generation. It does at a little bloat to the object files, due to the extra indirection, but when linked into a static binary 100% of this can be removed by wasm-opt
in release builds.
Note that if you want you could unconditionally enable PIC code generation. It does at a little bloat to the object files, due to the extra indirection, but when linked into a static binary 100% of this can be removed by
wasm-opt
in release builds.
That sounds like the simplest option, TBH -- is there a reason this isn't done in general? (Is wasm-opt
not usually in use, or is the bloat more than "a little"?)
(Thanks for commenting, BTW, didn't realize you were watching these issues/PRs :-)
I was pointed here by an emscripten who uses halide
Maybe you can just remove the bool use_pic() const override;
override and take the PIC setting from the command line?
Here's an idea: could we just always set PIC everywhere, and then have a Halide target flag (e.g., -no_pic
) to disable it?
PIC setting from the command line?
It not quite that simple, alas, since Halide codegen isn't always (or even usually) directly invoked from the command line, but rather, indirectly from build rules.
Here's an idea: could we just always set PIC everywhere, and then have a Halide target flag (e.g., -no_pic) to disable it?
That would probably be a much more tractable approach. It looks like we (currently) default to using PIC for all LLVM-based backends, with no way to override them; WebAssembly is the only backend that changes this (to false, also with no way to override it.)
So, strawman:
no_pic
Target Feature that flips this setting (for all all LLVM backends, because, why not?)Thanks very much for the reply! It appears just setting use_pic() isn’t enough … we still get linking errors when trying to use the static libraries or object files create by Halide in a shared object created by emcc.
In the link you provided it appears that the target triple includes wasm32-wasi … is wasi required for dynamic linking?
we still get linking errors
OK, we should clearly hold off until we know the entire scope of the issue. What are the linking errors?
I believe that bug was fixed in https://reviews.llvm.org/D153293. I just closes the corresponding github issue: llvm/llvm-project#42059
A bit of a threadjack here, but: it's been a minute since I've done more than minor fixes to our wasm backend... which was written when WASI was still mostly just a proposal. Looking at the link, it seems likely that upgrading our codegen to be WASI-compliant (if needed) would be highly desirable (to allow for easier use in a runtime env that doesn't include JS support)... and I don't know anything at all about the Component Model proposal.
Unfortunately, our team is fairly understaffed right now, and there's ~no way that I'm likely to get any bandwidth to look into this anytime soon (unless my manager(s) make it a priority for me), but the Halide team would absolutely welcome any community contribution that could be made here.
we still get linking errors
OK, we should clearly hold off until we know the entire scope of the issue. What are the linking errors?
wasm-ld: error: relocation R_WASM_MEMORY_ADDR_SLEB cannot be used against symbol.Lstr; recompile with -fPIC
@steven-johnson. Note that both wasi and the component model as still very early in the standard process. Its likely worth supporting the different target triples but I think that is all you would need to do. I would imagine you could even target wasm32-unknown-unknown
and you object files should work with both wasi-sdk and emscripten (I think there was one ABI difference remaining, but there are efforts to remove that too).
I attempted to override the llvm::triple inside of LLVM_Runtime_Linker.cpp:
I tried wasm32-unknown-unkown
and wasm32-unknown-wasi
, but this didn't seem to help. Is there a different format necessary for wasm static libraries or object files that's needed?
FWIW, I just stumbled upon this (from 2019): https://iandouglasscott.com/2019/07/18/experimenting-with-webassembly-dynamic-linking-with-clang/
I attempted to override the llvm::triple inside of LLVM_Runtime_Linker.cpp:
I tried
wasm32-unknown-unkown
andwasm32-unknown-wasi
, but this didn't seem to help. Is there a different format necessary for wasm static libraries or object files that's needed?
Not a different format, no. But the relocation types used are different. In PIC code generatation you should shouldn't get any R_WASM_MEMORY_ADDR_SLEB
relocations being generated. Hence the error message telling you to recompile with -fPIC
. So I guess that halide compiler is doing something a little different to the clang here. I don't think it will be hard for someone with some knowledge of hadlide to track down and fix.
Okay, good to know. It looks like the generated object files can be inspected with wabt. From the HelloWasm app:
> wasm-objdump -x ./bin/wasm/reaction_diffusion_init.o
reaction_diffusion_init.o: file format wasm 0x1
Section Details:
Type[6]:
- type[0] (i32) -> i32
- type[1] (i32, i32) -> i32
- type[2] (i32, i32, i32, i32) -> i32
- type[3] (i32, i32, i32, i32, i32) -> i32
- type[4] (i32, i32, i64, i64) -> i32
- type[5] () -> i32
Import[11]:
- memory[0] pages: initial=1 <- env.__linear_memory
- global[0] i32 mutable=1 <- env.__stack_pointer
- func[0] sig=1 <env.halide_error_buffer_argument_is_null> <- env.halide_error_buffer_argument_is_null
- func[1] sig=2 <env.halide_error_bad_type> <- env.halide_error_bad_type
- func[2] sig=2 <env.halide_error_bad_dimensions> <- env.halide_error_bad_dimensions
- func[3] sig=2 <env.halide_error_buffer_extents_negative> <- env.halide_error_buffer_extents_negative
- func[4] sig=3 <env.halide_error_constraint_violated> <- env.halide_error_constraint_violated
- func[5] sig=4 <env.halide_error_buffer_allocation_too_large> <- env.halide_error_buffer_allocation_too_large
- func[6] sig=4 <env.halide_error_buffer_extents_too_large> <- env.halide_error_buffer_extents_too_large
- func[7] sig=1 <env.halide_error_device_dirty_with_no_device_support> <- env.halide_error_device_dirty_with_no_device_support
- func[8] sig=1 <env.halide_error_host_is_null> <- env.halide_error_host_is_null
Function[3]:
- func[9] sig=0 <reaction_diffusion_init>
- func[10] sig=0 <reaction_diffusion_init_argv>
- func[11] sig=5 <reaction_diffusion_init_metadata>
DataCount:
- data count: 9
Code[3]:
- func[9] size=1697 <reaction_diffusion_init>
- func[10] size=13 <reaction_diffusion_init_argv>
- func[11] size=8 <reaction_diffusion_init_metadata>
Data[9]:
- segment[0] <.rodata..L__unnamed_1> memory=0 size=24 - init i32=0
- 0000000: 0000 0000 0000 0000 0000 0000 0000 0000 ................
- 0000010: 0000 0000 0000 0000 ........
- segment[1] <.rodata..Lstr> memory=0 size=7 - init i32=32
- 0000020: 6f75 7470 7574 00 output.
- segment[2] <.rodata..L__unnamed_2> memory=0 size=36 - init i32=48
- 0000030: 2000 0000 0200 0000 0300 0000 0220 0100 ............ ..
- 0000040: 0000 0000 0000 0000 0000 0000 0000 0000 ................
- 0000050: 0000 0000 ....
- segment[3] <.rodata..Lstr.4> memory=0 size=26 - init i32=96
- 0000060: 7761 736d 2d33 322d 7761 736d 7274 2d6e wasm-32-wasmrt-n
- 0000070: 6f5f 7275 6e74 696d 6500 o_runtime.
- segment[4] <.rodata..Lstr.5> memory=0 size=24 - init i32=128
- 0000080: 7265 6163 7469 6f6e 5f64 6966 6675 7369 reaction_diffusi
- 0000090: 6f6e 5f69 6e69 7400 on_init.
- segment[5] <.rodata..Lreaction_diffusion_init_metadata_storage> memory=0 size=20 - init i32=160
- 00000a0: 0100 0000 0100 0000 3000 0000 6000 0000 ........0...`...
- 00000b0: 8000 0000 ....
- segment[6] <.rodata..Lstr.6> memory=0 size=21 - init i32=192
- 00000c0: 4f75 7470 7574 2062 7566 6665 7220 6f75 Output buffer ou
- 00000d0: 7470 7574 00 tput.
- segment[7] <.rodata..Lstr.7> memory=0 size=16 - init i32=224
- 00000e0: 6f75 7470 7574 2e73 7472 6964 652e 3000 output.stride.0.
- segment[8] <.rodata..Lstr.8> memory=0 size=2 - init i32=256
- 0000100: 3100 1.
Custom:
- name: "linking"
- symbol table [count=22]
- 0: F <reaction_diffusion_init> func=9 [ binding=global vis=default ]
- 1: G <env.__stack_pointer> global=0 [ undefined binding=global vis=default ]
- 2: D <.Lstr> segment=1 offset=0 size=7 [ binding=local vis=default ]
- 3: F <env.halide_error_buffer_argument_is_null> func=0 [ undefined binding=global vis=default ]
- 4: D <.Lstr.6> segment=6 offset=0 size=21 [ binding=local vis=default ]
- 5: F <env.halide_error_bad_type> func=1 [ undefined binding=global vis=default ]
- 6: F <env.halide_error_bad_dimensions> func=2 [ undefined binding=global vis=default ]
- 7: F <env.halide_error_buffer_extents_negative> func=3 [ undefined binding=global vis=default ]
- 8: D <.Lstr.7> segment=7 offset=0 size=16 [ binding=local vis=default ]
- 9: D <.Lstr.8> segment=8 offset=0 size=2 [ binding=local vis=default ]
- 10: F <env.halide_error_constraint_violated> func=4 [ undefined binding=global vis=default ]
- 11: F <env.halide_error_buffer_allocation_too_large> func=5 [ undefined binding=global vis=default ]
- 12: F <env.halide_error_buffer_extents_too_large> func=6 [ undefined binding=global vis=default ]
- 13: F <env.halide_error_device_dirty_with_no_device_support> func=7 [ undefined binding=global vis=default ]
- 14: F <env.halide_error_host_is_null> func=8 [ undefined binding=global vis=default ]
- 15: F <reaction_diffusion_init_argv> func=10 [ binding=global vis=default ]
- 16: F <reaction_diffusion_init_metadata> func=11 [ binding=global vis=default ]
- 17: D <.Lreaction_diffusion_init_metadata_storage> segment=5 offset=0 size=20 [ binding=local vis=default ]
- 18: D <.L__unnamed_1> segment=0 offset=0 size=24 [ binding=local vis=default ]
- 19: D <.L__unnamed_2> segment=2 offset=0 size=36 [ binding=local vis=default ]
- 20: D <.Lstr.4> segment=3 offset=0 size=26 [ binding=local vis=default ]
- 21: D <.Lstr.5> segment=4 offset=0 size=24 [ binding=local vis=default ]
- segment info [count=9]
- 0: .rodata..L__unnamed_1 p2align=4 [ ]
- 1: .rodata..Lstr p2align=5 [ ]
- 2: .rodata..L__unnamed_2 p2align=4 [ ]
- 3: .rodata..Lstr.4 p2align=5 [ ]
- 4: .rodata..Lstr.5 p2align=5 [ ]
- 5: .rodata..Lreaction_diffusion_init_metadata_storage p2align=4 [ ]
- 6: .rodata..Lstr.6 p2align=5 [ ]
- 7: .rodata..Lstr.7 p2align=5 [ ]
- 8: .rodata..Lstr.8 p2align=5 [ ]
Custom:
- name: "reloc.CODE"
- relocations for section: 4 (Code) [34]
- R_WASM_GLOBAL_INDEX_LEB offset=0x00000f(file=0x0001ff) symbol=1 <env.__stack_pointer>
- R_WASM_GLOBAL_INDEX_LEB offset=0x00001a(file=0x00020a) symbol=1 <env.__stack_pointer>
- R_WASM_MEMORY_ADDR_SLEB offset=0x00002a(file=0x00021a) symbol=2 <.Lstr>
- R_WASM_FUNCTION_INDEX_LEB offset=0x000030(file=0x000220) symbol=3 <env.halide_error_buffer_argument_is_null>
- R_WASM_MEMORY_ADDR_SLEB offset=0x000298(file=0x000488) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0002a4(file=0x000494) symbol=5 <env.halide_error_bad_type>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0002b1(file=0x0004a1) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0002bb(file=0x0004ab) symbol=6 <env.halide_error_bad_dimensions>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0002c8(file=0x0004b8) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0002d2(file=0x0004c2) symbol=7 <env.halide_error_buffer_extents_negative>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0002df(file=0x0004cf) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0002e9(file=0x0004d9) symbol=7 <env.halide_error_buffer_extents_negative>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0002f6(file=0x0004e6) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x000300(file=0x0004f0) symbol=7 <env.halide_error_buffer_extents_negative>
- R_WASM_MEMORY_ADDR_SLEB offset=0x00030d(file=0x0004fd) symbol=8 <.Lstr.7>
- R_WASM_MEMORY_ADDR_SLEB offset=0x000315(file=0x000505) symbol=9 <.Lstr.8>
- R_WASM_FUNCTION_INDEX_LEB offset=0x00031d(file=0x00050d) symbol=10 <env.halide_error_constraint_violated>
- R_WASM_MEMORY_ADDR_SLEB offset=0x00032a(file=0x00051a) symbol=2 <.Lstr>
- R_WASM_FUNCTION_INDEX_LEB offset=0x000338(file=0x000528) symbol=11 <env.halide_error_buffer_allocation_too_large>
- R_WASM_MEMORY_ADDR_SLEB offset=0x000345(file=0x000535) symbol=2 <.Lstr>
- R_WASM_FUNCTION_INDEX_LEB offset=0x000353(file=0x000543) symbol=11 <env.halide_error_buffer_allocation_too_large>
- R_WASM_MEMORY_ADDR_SLEB offset=0x000360(file=0x000550) symbol=2 <.Lstr>
- R_WASM_FUNCTION_INDEX_LEB offset=0x00036e(file=0x00055e) symbol=12 <env.halide_error_buffer_extents_too_large>
- R_WASM_MEMORY_ADDR_SLEB offset=0x00037b(file=0x00056b) symbol=2 <.Lstr>
- R_WASM_FUNCTION_INDEX_LEB offset=0x000389(file=0x000579) symbol=11 <env.halide_error_buffer_allocation_too_large>
- R_WASM_MEMORY_ADDR_SLEB offset=0x000396(file=0x000586) symbol=2 <.Lstr>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0003a4(file=0x000594) symbol=12 <env.halide_error_buffer_extents_too_large>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0003b1(file=0x0005a1) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0003b7(file=0x0005a7) symbol=13 <env.halide_error_device_dirty_with_no_device_support>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0003c4(file=0x0005b4) symbol=4 <.Lstr.6>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0003ca(file=0x0005ba) symbol=14 <env.halide_error_host_is_null>
- R_WASM_GLOBAL_INDEX_LEB offset=0x00069c(file=0x00088c) symbol=1 <env.__stack_pointer>
- R_WASM_FUNCTION_INDEX_LEB offset=0x0006ac(file=0x00089c) symbol=0 <reaction_diffusion_init>
- R_WASM_MEMORY_ADDR_SLEB offset=0x0006b5(file=0x0008a5) symbol=17 <.Lreaction_diffusion_init_metadata_storage>
Custom:
- name: "reloc.DATA"
- relocations for section: 5 (Data) [5]
- R_WASM_MEMORY_ADDR_I32 offset=0x00002f(file=0x0008e0) symbol=2 <.Lstr>
- R_WASM_MEMORY_ADDR_I32 offset=0x00004f(file=0x000900) symbol=18 <.L__unnamed_1>
- R_WASM_MEMORY_ADDR_I32 offset=0x00009f(file=0x000950) symbol=19 <.L__unnamed_2>
- R_WASM_MEMORY_ADDR_I32 offset=0x0000a3(file=0x000954) symbol=20 <.Lstr.4>
- R_WASM_MEMORY_ADDR_I32 offset=0x0000a7(file=0x000958) symbol=21 <.Lstr.5>
Custom:
- name: "producers"
As @sbc100 indicated, the above object file contains R_WASM_MEMORY_ADDR_SLEB entries which shouldn't exist if PIC was enabled. I've scrutinized the WebAssembly clang driver and noticed that "mutable-globals" appear to be required for the linker:
So I added this to Halide's CodeGen_WebAssembly::mattrs() so that emits "+mutable-globals" and verified this shows up in the generated object files:
Custom:
- name: "target_features"
- [+] mutable-globals
But the linker still complains and we still have R_WASM_MEMORY_ADDR_SLEB symbols being generated.
I'll keep digging ....
I haven't found anything concrete ... my guess is we may be missing one or more LLVM passes that may transform the module into a PIC compatible layout for wasm and/or are missing attributes on global values or functions (ie dso_local
) that would allow them to get stored correctly.
We don't use the standard llvm PassBuilder::buildModuleOptimizationPipeline()
optimizations so things like RelLookupTableConverterPass
don't get run.
Anyone have any other ideas? @steven-johnson @abadams @sbc100
The relevant code is in https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp.
You can see that LowerGlobalAddress
produces different code if isPositionIndependent
is true. You might want to check that isPositionIndependent
is returning true here?
@derek-gerstmann, also are you using llvm tip-of-tree? Because https://github.com/llvm/llvm-project/issues/42059 only recently landed.
Ah, thanks! So far I've been testing with LLVM releases 15-17. I'll pull llvm from main and rebuild.
Okay, fix confirmed. Testing against LLVM main (18.x), with the following changes to Halide enabled me to produce wasm object files that could be relocated with emcc:
@@ -362,7 +370,7 @@ bool CodeGen_WebAssembly::use_soft_float_abi() const {
}
bool CodeGen_WebAssembly::use_pic() const {
- return false;
+ return true;
}
+++ b/src/CodeGen_WebAssembly.cpp
@@ -344,6 +344,14 @@ string CodeGen_WebAssembly::mattrs() const {
sep = ",";
}
+ // PIC implies +mutable-globals because the PIC ABI used by the linker
+ // depends on importing and exporting mutable globals. Also -pthread implies
+ // mutable-globals too, so quitely enable it if either of these are specified.
+ if(use_pic() || target.has_feature(Target::WasmThreads)) {
+ s << sep << "+mutable-globals";
+ sep = ",";
+ }
+
Optional, but likely needed (based on comments from LLVM devs):
@@ -1151,6 +1156,16 @@ void CodeGen_LLVM::optimize_module() {
using OptimizationLevel = llvm::OptimizationLevel;
OptimizationLevel level = OptimizationLevel::O3;
+#if LLVM_VERSION >= 180
+ if(tm->isPositionIndependent()) {
+ pb.registerOptimizerLastEPCallback(
+ [&](ModulePassManager &mpm, OptimizationLevel level) {
+ mpm.addPass(RelLookupTableConverterPass());
+ });
+ }
+#endif
+
Symbols in the wasm object files now appear as R_WASM_MEMORY_ADDR_REL_SLEB
rather than absolute offsets via R_WASM_MEMORY_ADDR_SLEB
.
I'll open a PR, and we can iron out the details of the proposed strawman. Thanks everyone! @slomp @sbc100 @steven-johnson
It appears that the WebAssembly CodeGen explicitly disables PIC, which means that objects and modules being generated for wasm cannot be linked into a shared object and dynamically loaded (eg
emcc -s SIDE_MODULE=2
).Is this necessary?