ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
34.65k stars 2.53k forks source link

Memory corruption in different versions of code that are semantically identical, but written slightly differently #20902

Closed betonowy closed 2 months ago

betonowy commented 2 months ago

Zig Version

0.14.0-dev.764+eb1a199df

Steps to Reproduce and Observed Behavior

Extracted minimum reproducible example in the file attached below. 223 LOCs, I couldn't cut it down more, sorry. bug.zig.txt

While I checked this and reproduced it with bleeding edge master (0.14.0-dev.764+eb1a199df) that I pulled today, I only ended up here, because it behaved the same on 0.13.0-dev.351+64ef45eb0 which I used before and update didn't fix it. (I cleared caches just in case).

From the observed behavior I assume that there is no bug in my code, because all versions should work identically as far as I can see.

In miscompiled versions you can observe corruption by putting a breakpoint (I used LLDB) where the comment suggests, stepping over, and observing how self.parent.capacity variable gets corrupted and program will try to go out of bounds of self.keys[div] because of it and will trigger a runtime check. FYI KeyMask is just a u64.

You can put breakpoint in kind of the same spot in next() and nextBugged2(). The first one will not corrupt memory, the other one will.

Here is a function in the iterator that works correctly in this version (i.e. is not miscompiled). I run it with zig test --test-filter "This one works with 'next()'" bug.zig

            // This version of 'next()' function works correctly
            pub fn next(self: *@This()) ?*T {
                while (self.cursor < self.parent.capacity) {
                    defer self.cursor += 1;

                    const div = self.cursor / @bitSizeOf(KeyMask);
                    const mod: u6 = @intCast(self.cursor % @bitSizeOf(KeyMask));

                    if ((self.keys[div] & (@as(KeyMask, 1) << mod)) == 0) continue;

                    const p = &self.data[self.cursor];
                    return p;
                }

                return null;
            }

This version is miscompilled and will corrupt memory. It seems bizarre. This code is not even supposed to touch this memory. I run it with zig test --test-filter "This one bugs with 'nextBugged1()'" bug.zig

            // This version of 'next()' function corrupts self.parent
            // In this version there is no temporary variable 'p'
            pub fn nextBugged1(self: *@This()) ?*T {
                while (self.cursor < self.parent.capacity) {
                    defer self.cursor += 1;

                    const div = self.cursor / @bitSizeOf(KeyMask);
                    const mod: u6 = @intCast(self.cursor % @bitSizeOf(KeyMask));

                    if ((self.keys[div] & (@as(KeyMask, 1) << mod)) == 0) continue;

                    return &self.data[self.cursor]; // Corrupts after this line (put breakpoint here)
                }

                return null;
            }

This version is miscompilled in seemingly exact same way (i.e. results in the same kind of memory corruption), even though the change in the zig code, compared to the working version, was in a different spot. I run it with zig test --test-filter "This one bugs with 'nextBugged2()'" bug.zig

            // This version of 'next()' function corrupts self.parent
            // In this version I changed when 'mod' is @intCast'ed
            pub fn nextBugged2(self: *@This()) ?*T {
                while (self.cursor < self.parent.capacity) {
                    defer self.cursor += 1;

                    const div = self.cursor / @bitSizeOf(KeyMask);
                    const mod = self.cursor % @bitSizeOf(KeyMask);

                    if ((self.keys[div] & (@as(KeyMask, 1) << @intCast(mod))) == 0) continue;

                    const p = &self.data[self.cursor]; // Corrupts after this line (put breakpoint here)
                    return p;
                }

                return null;
            }

Expected Behavior

Code is not miscompiled in any of the cases above

squeek502 commented 2 months ago

I believe this is a bug in your code, right here:

       pub fn iterator(self: @This()) Iterator {
            return .{
                .parent = &self,

You are getting a pointer to a parameter, which Zig decides how its passed internally. In this case, you are taking/storing a pointer to stack memory, and that memory goes out-of-scope when the function returns.

To verify this, if you added this assert to your test, it would fail (even though it should succeed):

    var it = ia.iterator();
    std.debug.assert(it.parent == &ia);

An easy fix would be to change iterator to be:

        pub fn iterator(self: *const @This()) Iterator {
            return .{
                .parent = self,

In the future, please direct questions like this to one of the community spaces first

betonowy commented 2 months ago

Thank you for the quick answer. This was indeed an extremely nooby mistake from my side. Sorry, for polluting the space. Though, I do think this qualifies for some kind of error message, or a warning.

squeek502 commented 2 months ago

I believe the relevant issues are: