ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

in debug builds, global variables which are undefined should be initialized with 0xaa #1987

Open andrewrk opened 5 years ago

andrewrk commented 5 years ago

There may need to be an exception when the global is larger than some threshold, such as 4 KiB, to avoid bloating the binary size.

One way to do this is to put such variables in a special section, and then before main() memset the entire section to 0xaa. This would possibly require modifications to the default linker script.

<andrewrk> does LLD have a default linker script? how is that determined?
<andrewrk> is there a hard coded default linker script per OS?
<serge_sans_paille> andrewrk: if I recall correctly the talk at fosdem this year, there's a hard coded default linker script, but psmith knows better
<psmith> andrewrk: It is hard-coded, at least for ELF.
<psmith> COFF doesn't have linker-scripts at all AFAIK.
<andrewrk> psmith, great, so I could just copy it to my frontend, and make modifications, and then just make sure to synchronize it on every new LLVM release
<psmith> In theory yes. You'd need to reverse engineer it a bit from the code, but it should be fairly straightforward.
<psmith> andrewrk: I'd start with Writer.cpp and look for !Script->HasSectionsCommand 
<andrewrk> psmith, thanks for the tip!
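
The two ideas above (a size cutoff for storing the fill pattern in the binary, and a startup fill of a dedicated section) can be modeled with a rough Zig sketch. It is not how the compiler does or will do it: embed_fill_limit, shouldEmbedFill, and poisonRegion are hypothetical names, and a plain byte slice stands in for the section bounds that a linker script or startup code would have to provide. It assumes a recent Zig release (0.11 or later) for the try-based test calls.

```zig
const std = @import("std");

// Hypothetical cutoff, taken from the 4 KiB figure mentioned above.
const embed_fill_limit: usize = 4 * 1024;

// Returns true when a global of type T is small enough that storing the
// 0xaa pattern directly in the binary image would be acceptable.
fn shouldEmbedFill(comptime T: type) bool {
    return @sizeOf(T) <= embed_fill_limit;
}

// Models the pre-main step: overwrite every byte of the region with 0xaa.
// In a real implementation the region would be the special section;
// here it is just a slice supplied by the caller (an assumption).
fn poisonRegion(region: []u8) void {
    for (region) |*byte| byte.* = 0xaa;
}

test "size cutoff separates small and large globals" {
    try std.testing.expect(shouldEmbedFill(u64));
    try std.testing.expect(!shouldEmbedFill([8192]u8));
}

test "startup fill leaves 0xaa in every byte" {
    var words = [_]u32{ 0, 0, 0, 0 };
    const slice: []u32 = &words;
    poisonRegion(std.mem.sliceAsBytes(slice));
    try std.testing.expectEqual(@as(u32, 0xaaaaaaaa), words[3]);
}
```

Saved to a file, the sketch can be exercised with zig test. Under this model, globals below the cutoff would carry the pattern in the image, while larger ones would cost no file size and be filled once at startup.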

Putting uninitialized globals in a custom linker section would also enable us to make a valgrind client request before main, after the memset to 0xaa. This would give Zig binaries a feature that C binaries don't have: valgrind would be able to detect reads of uninitialized globals.

This would not be available if the user did export fn main and linked libc. But it would be available if they linked libc and used pub fn main.

mb64 commented 4 years ago

I may have run into this bug, and got some surprising behavior.

const std = @import("std");

const Point = struct {
    x: u32,
    y: u32,
};

fn print_point(point: Point) void {
    std.debug.warn("Point (0x{x}, 0x{x})\n", .{ point.x, point.y });
}

pub fn main() void {
    var my_point: Point = Point{ .x = undefined, .y = 5 };
    print_point(my_point);

    my_point = Point{ .x = undefined, .y = 5 };
    print_point(my_point);
}

Here's the output on my machine:

$ zig run test.zig
Point (0x0, 0x5)
Point (0xaaaaaaaa, 0x5)

I asked on Discord, and it was suggested that this is related to this issue: in the variable declaration, my_point is copied from a global, which would explain the 0x0, but in the assignment it is built from immediates, which would explain the 0xaaaaaaaa.

I'm not sure what the difference is, but I'm adding the example in case it's useful.