emscripten-core / emscripten

Emscripten: An LLVM-to-WebAssembly Compiler
Other
25.81k stars 3.31k forks source link

Invalid DWARF wasm location with split-dwarf #13240

Closed pfaffe closed 3 years ago

pfaffe commented 3 years ago

When compiling with split-dwarf, Emscripten emits broken locations for wasm globals.

Reproducer:

int main(int argc, char** argv) {
  return argc;
}

Compile this snippet with emcc -O1 -g -c and with emcc -O1 -g -c -gsplit-dwarf. In the first case, llvm-dwarfdump shows the correct DW_AT_frame_base (DW_OP_WASM_location 0x3 0x0, DW_OP_stack_value).

In the second case, the output is: DW_AT_frame_base (DW_OP_WASM_location 0x3 0x0, <decoding error> 00 9f)

In that case, the location is encoded as 0xed 0x03 0x0 0x9f even though the second argument to WASM_location should be a DW_FORM_data4, i.e., a uint32, but it's encoded as a ULEB here. In the first case, emscripten gets it right.

kripken commented 3 years ago

Does it work with -s WASM_BIGINT? That would disable binaryen processing of the wasm, which I think we will require for split dwarf for correctness - we've decided that it is better to focus on avoiding the binaryen processing in that case, and all the work for that should be done, except that the user must not use features that force binaryen to be used (like -O2+ or not using bigints).

We should probably make -gsplit-dwarf error if binaryen runs, to avoid any confusion?

pfaffe commented 3 years ago

Does binaryen run with -c?

kripken commented 3 years ago

No. Binaryen only runs (optionally) during the link stage.

pfaffe commented 3 years ago

The problem appears before binaryen then. My suspicion is that clang goes through a different codegen path for dwo files and @aardappel's reloc change is missing there.

kripken commented 3 years ago

Sorry, I missed the -c before... yes, looks like a clang issue then.

aardappel commented 3 years ago

My reloc change for reference: https://reviews.llvm.org/D77353

Writing this data happens during DwarfCompileUnit::updateSubprogramScopeDIE, which should be before the Wasm object writing where the split DWARF distinction arises? @dschuff any idea how this can interfere?

pfaffe commented 3 years ago

Should there be a check here whether the index is 0x3: https://github.com/llvm/llvm-project/blob/55f2eeebc96e7522e49e19074cbfbe4e7f074b5b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp#L2493

aardappel commented 3 years ago

Maybe an assert yes, as that code is only intended for values other than 3. Is split-dwarf causing these exps to be re-encoded, such that they go thru this generic path as opposed to the special purpose code in DwarfCompileUnit::updateSubprogramScopeDIE ?

aardappel commented 3 years ago

Does not repro using LLVM directly as such:

clang -O1 -g -c -gsplit-dwarf mini.c -S -emit-llvm --target=wasm32-unknown-unknown llc -filetype=obj mini.ll --mtriple=wasm32 llvm-dwarfdump -v mini.o

The only way the .ll file appears to honor the -gsplit-dwarf parameter is with a splitDebugFilename: "mini.dwo" attribute, I assume that is sufficient. Output has DW_AT_frame_base [DW_FORM_exprloc] (DW_OP_WASM_location 0x3 0x0, DW_OP_stack_value) as expected.

aardappel commented 3 years ago

This is using LLVM from main, @pfaffe is your LLVM a very recent build?

pfaffe commented 3 years ago

I just reproduced this with emcc tot:

0x0000000b: DW_TAG_compile_unit                                                                                                                       [15/175]
              DW_AT_producer    ("clang version 13.0.0 (/b/s/w/ir/cache/git/chromium.googlesource.com-external-github.com-llvm-llvm--project 6ff09ce061df499b5
18150f33006d1b0dcfdabd7)")                                                                                                                                    
              DW_AT_language    (DW_LANG_C99)                                                                                                                 
              DW_AT_name        ("mini.c")                                                                                                                    
              DW_AT_GNU_dwo_name        ("mini.dwo")                                                                                                          
              DW_AT_GNU_dwo_id  (0x378a70967e97e11f)                                                                                                          

0x00000019:   DW_TAG_subprogram                                                                                                                               
                DW_AT_low_pc    (indexed (00000000) address = <unresolved>)                                                                                   
                DW_AT_high_pc   (0x00000004)                                                                                                                  
                DW_AT_frame_base        (DW_OP_WASM_location 0x3 0x0, <decoding error> 00 9f)                                                                 
                DW_AT_GNU_all_call_sites        (true)                                                                                                        
                DW_AT_name      ("main")                                                                                                                      
                DW_AT_decl_file (0x01)                                                                                                                        
                DW_AT_decl_line (1)                                                                                                                           
                DW_AT_prototyped        (true)
                DW_AT_type      (0x00000041 "int")
pfaffe commented 3 years ago

Verified the fix with 2.0.15, thanks!