ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

problematic debug info scope structure #2418

Closed LemonBoy closed 5 years ago

LemonBoy commented 5 years ago

The usual data structure for holding information about the content of a scope is something that's

Instead, the cpp compiler (and I guess also the zig version) uses a bunch of different scope types that may or may not have the features mentioned above: only ScopeDecls is able to hold a list of declarations, while every variable declaration opens a new scope of its own.

The following snippet tries to explain a bit what happens:

    {
        var x = i32(12); // parent_scope->id = ScopeIdBlock
        var y = i32(12); // parent_scope->id = ScopeIdVarDecl
        var z = i32(12); // parent_scope->id = ScopeIdVarDecl
    }

Even though this structure works most of the time, it ends up producing wrong debug information: the Matryoshka of variables is translated into a set of DW_TAG_formal_parameter and DW_TAG_variable entries preceded by a DW_TAG_lexical_block.

First of all, debuggers expect a series of DW_TAG_formal_parameter entries to follow a function definition, be it DW_TAG_subprogram or another tag. If you're wondering why gdb (or lldb) doesn't print any parameter value besides the first one, now you have your explanation.

The mis-generated DW_TAG_variable entries are slightly less harmful, as they just make the debugger do more work, where the magnitude of "more" is roughly linear with respect to the number of locals in the scope. Nonetheless, the compiler is telling blatant lies about variable scoping, and that may cause other problems in the long run.
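To make the nesting concrete, here is a minimal Python model of the situation described above. It is only a sketch: the `Scope` class, the `"block"` / `"var_decl"` kinds and the function names are hypothetical stand-ins for the compiler's scope types, not its actual code. It assumes one DW_TAG_lexical_block is emitted per scope, which is the pattern the zig-generated dump shows.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Scope:
    # Hypothetical stand-in for the compiler's scope types.
    kind: str                        # "block" or "var_decl"
    parent: Optional["Scope"] = None
    var_name: Optional[str] = None   # set only for "var_decl" scopes


def declare(names: List[str]) -> List[Tuple[str, Scope]]:
    """Return (name, scope the declaration lives in) pairs for one block."""
    current = Scope("block")
    decls = []
    for name in names:
        decls.append((name, current))
        # Each declaration opens a new scope for the code that follows it.
        current = Scope("var_decl", parent=current, var_name=name)
    return decls


def depth(scope: Scope) -> int:
    """Number of parent links between `scope` and the outermost scope."""
    d = 0
    while scope.parent is not None:
        scope = scope.parent
        d += 1
    return d


def naive_die_tree(names: List[str]) -> List[str]:
    """Emit one lexical block per scope, so nesting follows the scope chain."""
    lines = []
    for name, scope in declare(names):
        indent = "  " * depth(scope)
        lines.append(f"{indent}DW_TAG_lexical_block")
        lines.append(f"{indent}  DW_TAG_variable {name}")
    return lines


print("\n".join(naive_die_tree(["x", "y", "z"])))
# DW_TAG_lexical_block
#   DW_TAG_variable x
#   DW_TAG_lexical_block
#     DW_TAG_variable y
#     DW_TAG_lexical_block
#       DW_TAG_variable z
```

In this model the n-th local sits n - 1 levels deeper than the first one, which is one way to read the "roughly linear" extra work mentioned above.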

For the sake of comparison, here's what clang produces for a snippet similar to the one shown above:

clang-generated debug info

```
<2><3f>: Abbrev Number: 3 (DW_TAG_lexical_block)
   <40>   DW_AT_low_pc      : 0x401114
   <48>   DW_AT_high_pc     : 0x15
<3><4c>: Abbrev Number: 4 (DW_TAG_variable)
   <4d>   DW_AT_location    : 2 byte block: 91 7c (DW_OP_fbreg: -4)
   <50>   DW_AT_name        : (indirect string, offset: 0x7b): x
   <54>   DW_AT_decl_file   : 1
   <55>   DW_AT_decl_line   : 3
   <56>   DW_AT_type        : <0x78>
<3><5a>: Abbrev Number: 4 (DW_TAG_variable)
   <5b>   DW_AT_location    : 2 byte block: 91 78 (DW_OP_fbreg: -8)
   <5e>   DW_AT_name        : (indirect string, offset: 0x81): y
   <62>   DW_AT_decl_file   : 1
   <63>   DW_AT_decl_line   : 4
   <64>   DW_AT_type        : <0x78>
<3><68>: Abbrev Number: 4 (DW_TAG_variable)
   <69>   DW_AT_location    : 2 byte block: 91 74 (DW_OP_fbreg: -12)
   <6c>   DW_AT_name        : (indirect string, offset: 0x83): z
   <70>   DW_AT_decl_file   : 1
   <71>   DW_AT_decl_line   : 5
   <72>   DW_AT_type        : <0x78>
<3><76>: Abbrev Number: 0
```

zig-generated debug info

```
<4>: Abbrev Number: 14 (DW_TAG_lexical_block)
   DW_AT_low_pc      : 0x2271a5
   DW_AT_high_pc     : 0x15
<5>: Abbrev Number: 16 (DW_TAG_variable)
   DW_AT_location    : 2 byte block: 91 70 (DW_OP_fbreg: -16)
   DW_AT_name        : (indirect string, offset: 0x5ab6): x
   DW_AT_decl_file   : 25
   DW_AT_decl_line   : 3
   DW_AT_type        : <0x1a4a>
<5>: Abbrev Number: 14 (DW_TAG_lexical_block)
   DW_AT_low_pc      : 0x2271ac
   DW_AT_high_pc     : 0xe
<6>: Abbrev Number: 16 (DW_TAG_variable)
   DW_AT_location    : 2 byte block: 91 6c (DW_OP_fbreg: -20)
   DW_AT_name        : (indirect string, offset: 0x16d3): y
   DW_AT_decl_file   : 25
   DW_AT_decl_line   : 4
   DW_AT_type        : <0x1a4a>
<6>: Abbrev Number: 14 (DW_TAG_lexical_block)
   DW_AT_low_pc      : 0x2271b3
   DW_AT_high_pc     : 0x7
<7>: Abbrev Number: 16 (DW_TAG_variable)
   DW_AT_location    : 2 byte block: 91 68 (DW_OP_fbreg: -24)
   DW_AT_name        : (indirect string, offset: 0x721c): z
   DW_AT_decl_file   : 25
   DW_AT_decl_line   : 5
   DW_AT_type        : <0x1a4a>
<7>: Abbrev Number: 0
<6>: Abbrev Number: 0
<5>: Abbrev Number: 0
```

andrewrk commented 5 years ago

Thanks - this is an important find. I actually didn't know why the debug info was messed up for those variables.

The scope structure does make sense in terms of semantic analysis; each variable opens up a new scope because the point right after a declaration is where the variable comes into scope. However, it looks like some changes are needed in how that gets emitted as debug info.
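One way to picture keeping the per-declaration scopes for semantic analysis while flattening them for debug info is sketched below in Python. This is an assumption about a possible approach, not the fix that was actually applied: `Scope`, `lookup` and `debug_scope` are hypothetical names, and the only idea illustrated is that name lookup walks the full chain, whereas debug-info emission skips variable-declaration scopes and attaches each local to the nearest enclosing block.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scope:
    # Hypothetical stand-in for the compiler's scope types.
    kind: str                        # "block" or "var_decl"
    parent: Optional["Scope"] = None
    var_name: Optional[str] = None   # set only for "var_decl" scopes


def lookup(scope: Optional[Scope], name: str) -> bool:
    """Semantic analysis: a name is visible if an enclosing var_decl scope declares it."""
    while scope is not None:
        if scope.kind == "var_decl" and scope.var_name == name:
            return True
        scope = scope.parent
    return False


def debug_scope(scope: Scope) -> Scope:
    """Debug info: skip var_decl scopes and report the nearest enclosing block."""
    while scope.kind == "var_decl" and scope.parent is not None:
        scope = scope.parent
    return scope


block = Scope("block")
after_x = Scope("var_decl", parent=block, var_name="x")
after_y = Scope("var_decl", parent=after_x, var_name="y")

# Code following `var x` sees x but not yet y, as semantic analysis requires.
print(lookup(after_x, "x"), lookup(after_x, "y"))  # True False
# For debug info, a variable declared in after_y still belongs to the block.
print(debug_scope(after_y) is block)  # True
```

With a mapping like `debug_scope`, x, y and z would all be emitted as siblings under a single DW_TAG_lexical_block, which is the shape of the clang output above.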