ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.72k stars 2.47k forks source link

Segfault Before Main with Self-hosted Compiler #13190

Open ominitay opened 1 year ago

ominitay commented 1 year ago

Zig Version

0.10.0-dev.4418+99c3578f6

Steps to Reproduce

Attempt to build and run the attached Zig program with stage2 in either Debug or ReleaseSafe mode.

Expected Behavior

The program runs and outputs an ascii representation of the mandelbrot set.

Actual Behavior

The program segfaults before reaching main.

ominitay commented 1 year ago

This does not occur in the other two release modes, or in any release mode using the stage1 compiler.

ominitay commented 1 year ago

The file is available at this link, since GitHub doesn't allow me to attach Zig files. http://0x0.st/owzu.zig The source file isn't very readable, and is about 3kloc long, as it was generated by a brainfuck compiler.

mikdusan commented 1 year ago

This issue appears to manifest with stage2 but not with stage1. To be clear I am not sure this is a bug, it could just be that stage2 inlines better all the comptime known stuff.

Basically, stage2 generates a large 24 MiB frame which exceeds most default stack size limits. This compares to stage1 frame less than 64 KiB:

stage2 (25,381,504 bytes)

z1.main:
.Lfunc_begin15:
    .loc    22 3 0
    .cfi_startproc
    push    rbp
    .cfi_def_cfa_offset 16
    .cfi_offset rbp, -16
    mov rbp, rsp
    .cfi_def_cfa_register rbp
    mov eax, 25381504
    call    __zig_probe_stack

stage1 (57,680 bytes)

main:
.Lfunc_begin424:
    .file   51 "/home/mike/project/zig/work/main" "z1.zig"
    .loc    51 3 0
    .cfi_startproc
    push    rbp
    .cfi_def_cfa_offset 16
    .cfi_offset rbp, -16
    mov rbp, rsp
    .cfi_def_cfa_register rbp
    mov eax, 57680
    call    __zig_probe_stack

To work around this, you can force the cell array size to be "runtime" by making it a fn arg. Here's a quick patch:

--- z1.zig  2022-10-16 10:16:24.720551872 -0400
+++ z3.zig  2022-10-16 11:13:29.293411952 -0400
@@ -2,6 +2,10 @@

 pub fn main() void {
     var cells = std.mem.zeroes([30000]u8);
+    doit(&cells);
+}
+
+fn doit(cells: []u8) void {
     var index: u32 = 0;
     const stdout = std.io.getStdOut().writer();
mikdusan commented 1 year ago

for posterity OP's .zig source can be found here: owzu.zip

ominitay commented 1 year ago

This issue appears to manifest with stage2 but not with stage1. To be clear I am not sure this is a bug, it could just be that stage2 inlines better all the comptime known stuff.

Even if the problem were caused by the compiler excessively inlining, that would still be a bug imo. It's diverging behaviour between release modes, and causing valid Zig code to exhibit UB (stack overflow).

To work around this, you can force the cell array size to be "runtime" by making it a fn arg. Here's a quick patch:

Thanks for that, I should be able to use this workaround to get things working for the time being!

Vexu commented 1 year ago

I suspect this to be yet another case of #12215

matu3ba commented 1 year ago

At the very least, the compiler should

warn or what should the compiler do?

The problem Note, that the stack size can be dynamically adjusted, so to get things right an "expected stack size" would need to be given by the user inside build.zig. For example, threads can set own stack size: https://stackoverflow.com/questions/1678803/how-to-determine-stack-size-of-a-program-in-linux via pthread_attr_setstacksize() and change the Kernel defaults https://linux.die.net/man/2/setrlimit. Other OSes should have similar functionality. Handling all cases would require 1. no runtime sandbox to get Kernel access to current used option, 2. no stuff like pthread_attr_setstacksize or somehow computing the limit of what can be used at any possible runtime.

This is still a minimum needed runtime stack size which may be grossly wrong, the maximum runtime stack size can only be over-approximated.

Summary With that said, I think its not worth adding a check as local optimum for only this case unless we can rule out more classes of broken programs with user input on the program behavior.

Proposal

I think its more viable to provide a way to print out compile-time known minimal needed stack size, as the used frame size might be just below the threshold and we dont know how much runtime stack size the user wants/needs.