ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License

SIGABRT in std.process.Child.run outside Debug mode #17704

Open LordMZTE opened 1 year ago

LordMZTE commented 1 year ago

Zig Version

0.12.0-dev.1245+a07f288eb

Steps to Reproduce and Observed Behavior

  1. Create a file bug.zig with the following content:

    const std = @import("std");

    pub fn main() !void {
        var result = try std.process.Child.run(.{
            .allocator = std.heap.page_allocator,
            .argv = &.{ "echo", "hello" }, // The command here seems irrelevant.
        });

        defer std.heap.page_allocator.free(result.stdout);
        defer std.heap.page_allocator.free(result.stderr);

        std.debug.print("{s}\n", .{result.stdout});
    }

  2. Run `zig run bug.zig`. This will build in debug mode and print `hello` as expected.
  3. Run `zig run bug.zig -OReleaseSafe`. This will result in the following runtime error:

    thread 28090 panic: reached unreachable code
    /home/lordmzte/.local/share/zupper/installs/master/lib/std/start.zig:0:0: 0x20d274 in posixCallMainAndExit (test)
    /home/lordmzte/.local/share/zupper/installs/master/lib/std/start.zig:251:5: 0x20c711 in _start (test)
        asm volatile (switch (native_arch) {
        ^
    ???:?:?: 0x0 in ??? (???)
    fish: Job 1, 'zig run bug.zig -OReleaseSafe' terminated by signal SIGABRT (Abort)


GDB reveals this stack trace:

#0  os.sigprocmask (flags=2, set=0x7fffffffce50, oldset=0x0)
    at /home/lordmzte/.local/share/zupper/installs/master/lib/std/os.zig:5705
#1  os.raise (sig=) at /home/lordmzte/.local/share/zupper/installs/master/lib/std/os.zig:631
#2  0x0000000000236adf in os.abort () at /home/lordmzte/.local/share/zupper/installs/master/lib/std/os.zig:573
#3  0x0000000000236926 in debug.panicImpl (trace=0x0, first_trace_addr=..., msg=...)
    at /home/lordmzte/.local/share/zupper/installs/master/lib/std/debug.zig:440
#4  0x0000000000235b6a in builtin.default_panic (msg=..., error_return_trace=0x0, ret_addr=...)
    at /home/lordmzte/.local/share/zupper/installs/master/lib/std/builtin.zig:813
#5  0x000000000020d275 in start.posixCallMainAndExit ()
    at /home/lordmzte/.local/share/zupper/installs/master/lib/std/child_process.zig:502
#6  0x000000000020c6f2 in _start () at /home/lordmzte/.local/share/zupper/installs/master/lib/std/start.zig:251



Expected Behavior

The program runs the same in `Debug` and `ReleaseSafe` modes.

rootbeer commented 1 year ago

Reproduced. If I change the allocator to a GeneralPurposeAllocator the problem goes away:

const std = @import("std");

pub fn main() !void {
    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    var result = try std.process.Child.run(.{
        .allocator = gpa.allocator(),
        .argv = &.{ "echo", "hello" }, // The command here seems irrelevant.
    });

    // Free with the same allocator that Child.run allocated from.
    defer gpa.allocator().free(result.stdout);
    defer gpa.allocator().free(result.stderr);

    std.debug.print("{s}\n", .{result.stdout});
}

With the broken version, removing the 'defers' and print doesn't help. This still fails:

const std = @import("std");

pub fn main() !void {
    _ = try std.process.Child.run(.{
        .allocator = std.heap.page_allocator,
        .argv = &.{ "echo", "hello" }, // The command here seems irrelevant.
    });
}

And strace -f shows the child process runs successfully. The crash seems to happen after the parent resumes.

rootbeer commented 1 year ago

I can't reproduce this anymore. I've been poking at this occasionally in the last several days to see if I can narrow the problem down, but after updating my tree and recompiling Zig, this no longer fails for me. I'm running 0.12.0-dev.1357+10d03acdb.

I'm not interested enough in this problem to binary search for what fixed it, though. So, if someone wants an excuse to learn how to git bisect, this might be an interesting case ...

LordMZTE commented 1 year ago

Can confirm, this seems fixed. Will close the issue for now.

SHIPWECK commented 3 weeks ago

Currently running 0.14.0-dev.719+f21928657 and still having this issue, but only with -OReleaseFast and -OReleaseSmall, not -OReleaseSafe. Similarly to the original problem, the program runs fine with GeneralPurposeAllocator.

const std = @import("std");

pub fn main() !void {
    const alloc = std.heap.page_allocator;

    const run_result = try std.process.Child.run(.{
        .allocator = alloc,
        .argv = &.{ "zig", "version" },
    });

    defer {
        alloc.free(run_result.stderr);
        alloc.free(run_result.stdout);
    }

    std.debug.print("{s}", .{run_result.stdout});
}

The output is also extremely minimal, only showing this:

PS /> zig run bug.zig -OReleaseFast
error: 
PS /> zig run bug.zig -OReleaseSmall
error: 
PS />

Runs totally fine in Debug and ReleaseSafe.
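
One way to get more detail than the bare "error:" line might be to catch the error from Child.run and print its name explicitly instead of letting main return it. This is only a sketch, assuming the failure surfaces as a returned Zig error rather than a crash. `runCaptured` is a hypothetical helper introduced here for illustration, not part of std:

```zig
const std = @import("std");

/// Hypothetical helper (not part of std): runs argv, returns the captured
/// stdout on success (caller frees it), or prints the error name and
/// returns null on failure.
fn runCaptured(alloc: std.mem.Allocator, argv: []const []const u8) ?[]u8 {
    const result = std.process.Child.run(.{
        .allocator = alloc,
        .argv = argv,
    }) catch |err| {
        // Print the error name ourselves instead of relying on the
        // default handling of an error returned from main.
        std.debug.print("Child.run failed: {s}\n", .{@errorName(err)});
        return null;
    };
    // stderr is not needed here, so release it right away.
    alloc.free(result.stderr);
    return result.stdout;
}

pub fn main() void {
    const alloc = std.heap.page_allocator;
    const out = runCaptured(alloc, &.{ "zig", "version" }) orelse return;
    defer alloc.free(out);
    std.debug.print("{s}", .{out});
}
```

Building this with -OReleaseFast and -OReleaseSmall would show whether an error name such as OutOfMemory is reported on the affected setup.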

Edit: Using zig test, I was able to get some more information on the crash, though I can make no sense of it. Here is the code along with the output:

bug.zig:

const std = @import("std");

test "Child.run" {
    const alloc = std.heap.page_allocator;

    const run_result = try std.process.Child.run(.{
        .allocator = alloc,
        .argv = &.{ "zig", "version" },
    });

    defer {
        alloc.free(run_result.stderr);
        alloc.free(run_result.stdout);
    }

    try std.testing.expectEqualStrings(
        "0.14.0-dev.719+f21928657\n",
        run_result.stdout,
    );
}

output:

PS H:\Users\____\zig> zig test bug.zig -OReleaseSmall
All 1 tests passed.
PS H:\Users\____\zig> zig test bug.zig -OReleaseSafe 
All 1 tests passed.
PS H:\Users\____\zig> zig test bug.zig -ODebug       
All 1 tests passed.
PS H:\Users\____\zig> zig test bug.zig -OReleaseFast 
1/1 bug.test.Child.run...FAIL ()
0 passed; 0 skipped; 1 failed.
error: the following test command failed with exit code 1:
C:\Users\____\AppData\Local\zig\o\f08007e9b8d201450c8ddf23221d5620\test.exe --seed=0x3c9fd1d8

Note that the test passes for all compilation options when using std.testing.allocator.
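
For comparison, the std.testing.allocator variant could look roughly like the sketch below. The test name is illustrative, and the check is loosened to "stdout is non-empty" as an assumption so that it does not depend on the exact version string:

```zig
const std = @import("std");

test "Child.run with std.testing.allocator" {
    // Same call as above, but backed by the testing allocator
    // instead of std.heap.page_allocator.
    const alloc = std.testing.allocator;

    const run_result = try std.process.Child.run(.{
        .allocator = alloc,
        .argv = &.{ "zig", "version" },
    });

    defer {
        alloc.free(run_result.stderr);
        alloc.free(run_result.stdout);
    }

    // Only require some output, not a specific version string.
    try std.testing.expect(run_result.stdout.len > 0);
}
```

Because std.testing.allocator also reports leaks, this variant additionally checks that both buffers are freed.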

LordMZTE commented 3 weeks ago

Interestingly, this fails with OutOfMemory on 0.13.0 in ReleaseFast, while it works in Debug:

$ zig run test.zig -OReleaseFast
error: OutOfMemory

I'll reopen this.

rootbeer commented 3 weeks ago

https://github.com/ziglang/zig/issues/21756 looks similar (and like it might have some leads)

squeek502 commented 3 weeks ago

Yep, looks like the same bug. I believe https://github.com/ziglang/zig/pull/21760 will fix this.