Closed iacore closed 7 months ago
Sorry. That's not where I found the bug.
Here's where I found the bug:
git clone https://codeberg.org/iacore/kanren.zig/ --branch zigcoro
cd kanren.zig
zig build # build test binary
zig-out/bin/test # SIGSEGV here
I don't know how to reduce this test case. The crash happened when restoring registers.
$ lldb zig-out/bin/test
(lldb) target create "zig-out/bin/test"
Current executable set to '/home/user/computing/lib/zig/kanren/zig-out/bin/test' (x86_64).
(lldb) r
Process 21654 launched: '/home/user/computing/lib/zig/kanren/zig-out/bin/test' (x86_64)
Test [6/6] test.sudoku_4x4... Process 21654 stopped
* thread #1, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0x7ffff7dc3cc8)
frame #0: 0x000000000022953b test`main.unify(subst=<unavailable>, _l=<unavailable>, _r=<unavailable>) at main.zi
g:106
103 };
104
105 // always allocate new subst (if not null)
-> 106 pub fn unify(subst: SubstitutionMap, _l: Term, _r: Term) !?SubstitutionMap {
107 const l: Term = subst.lookup(_l);
108 const r: Term = subst.lookup(_r);
109 // std.log.warn("unify: {} {}", .{ l.*, r.* });
(lldb) di -f -m
** 106 pub fn unify(subst: SubstitutionMap, _l: Term, _r: Term) !?SubstitutionMap {
test`:
0x229530 <+0>: pushq %rbp
0x229531 <+1>: movq %rsp, %rbp
0x229534 <+4>: subq $0x830, %rsp ; imm = 0x830
-> 0x22953b <+11>: movq %r8, -0x7d8(%rbp)
0x229542 <+18>: movq %rcx, -0x800(%rbp)
0x229549 <+25>: movq %rdx, %rax
0x22954c <+28>: movq -0x800(%rbp), %rdx
0x229553 <+35>: movq %rax, -0x7e0(%rbp)
0x22955a <+42>: movq %rsi, -0x7f8(%rbp)
0x229561 <+49>: movq %rdi, -0x7f0(%rbp)
0x229568 <+56>: movq %rdi, -0x7e8(%rbp)
104
105 // always allocate new subst (if not null)
Ah ok. First thing to try is increasing the coroutine’s stack size to some high number just to see if this is an uncaught stack overflow. Have you tried that?
it is indeed stack overflow
Ok cool. Unfortunately this is not catchable in the general case. Sometimes we get lucky and it is detectable and you get a stack overflow error but other times you just get a crash and need to know to adjust the stack size.
It's not possible to catch normal stack overflow in Zig?
Normally it is, but these userspace coroutines use their own stacks and the compiler isn’t able to reason about them the same way. This of course will all be fine once Zig ships native async and zigcoro will (for the most part) not be necessary.
On Sun, Nov 19, 2023 at 4:40 AM iacore @.***> wrote:
It's not possible to catch normal stack overflow in Zig?
— Reply to this email directly, view it on GitHub https://github.com/rsepassi/zigcoro/issues/15#issuecomment-1817843324, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAIQMW2I5H66ROJGMUBYMCDYFH44LAVCNFSM6AAAAAA7G56KW6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJXHA2DGMZSGQ . You are receiving this because you modified the open/close state.Message ID: @.***>
I'm curious. How does Zig catch stack overflow?
The compiler can compute the stack size needed per function/branch and knows how big the stack is. I think. Maybe ask more in the zig discord.
Interesting, so it looks like this is actually because of a type-incompatibility between
*anyopaque
andnull
that doesn't seem to be caught by the compiler.This more minimal thing fails:
While this works:
The only difference between the 2 being:
*anyopaque
vs?*anyopaque
.I'll take a look later on how to get this to error at compile time.
Let me know if you find that I've missed or misunderstood something.