gimli-rs / unwind-rs

Apache License 2.0

Demo currently doesn't work. #22

Open roblabla opened 5 years ago

roblabla commented 5 years ago

When running the demo example on my machine (64-bit linux) using a new-ish nightly (rustc 1.33.0-nightly (b2b7a063a 2019-01-01)), unwind-rs crashes.

[roblabla@roblab unwind-rs]$ cargo run --example demo
    Finished dev [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/examples/demo`
thread 'main' panicked at 'test panic', unwind/examples/demo.rs:13:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NoUnwindInfoForAddress', src/libcore/result.rs:999:5
thread panicked while processing panic. aborting.
Illegal instruction (core dumped)

Did something break?

``` thread 'main' panicked at 'test panic', unwind/examples/demo.rs:13:5 note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace. [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "" at 0x55eeb9d7b000 with 12 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "linux-vdso.so.1" at 0x7ffeca7bd000 with 4 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/libdl.so.2" at 0x7f0c1e6bb000 with 10 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/librt.so.1" at 0x7f0c1e6b1000 with 10 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/libpthread.so.0" at 0x7f0c1e690000 with 12 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/libc.so.6" at 0x7f0c1e4cc000 with 13 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/lib64/ld-linux-x86-64.so.2" at 0x7f0c1e6e6000 with 10 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] CFI sections: [EhRef { obj_base: 94483808497664, text: AddrRange { start: 94483808497664, end: 94483808814184 }, cfi: AddrRange { start: 94483811286948, end: 94483811366512 }, ehframe_end: 94483811902144 }, EhRef { obj_base: 140732295532544, text: AddrRange { start: 140732295532544, end: 140732295536411 }, cfi: AddrRange { start: 140732295534548, end: 140732295534616 }, ehframe_end: 140732295536411 }, EhRef { obj_base: 139690026708992, text: AddrRange { start: 139690026708992, end: 139690026712592 }, cfi: AddrRange { start: 139690026717348, end: 139690026717544 }, ehframe_end: 139690026725520 }, EhRef { obj_base: 139690026668032, text: AddrRange { start: 139690026668032, end: 139690026676000 }, cfi: AddrRange { start: 139690026693472, end: 139690026694044 }, ehframe_end: 139690026707448 }, EhRef { obj_base: 139690026532864, text: AddrRange { start: 139690026532864, end: 139690026556824 }, cfi: AddrRange { start: 139690026621920, end: 139690026624380 }, ehframe_end: 139690026664392 }, EhRef { obj_base: 139690024681472, text: AddrRange { 
start: 139690024681472, end: 139690024817672 }, cfi: AddrRange { start: 139690026312016, end: 139690026337076 }, ehframe_end: 139690026530368 }, EhRef { obj_base: 139690026885120, text: AddrRange { start: 139690026885120, end: 139690026889328 }, cfi: AddrRange { start: 139690027038432, end: 139690027040212 }, ehframe_end: 139690027061528 }] [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x55eeba037670 sz 82c50 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7ffeca7bd818 sz 703 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e6bd168 sz 1f28 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e6b75a0 sz 3458 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e6a6580 sz 9c48 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e660338 sz 2f308 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e70bdd8 sz 5340 [2019-01-03T23:20:27Z DEBUG unwind] caller is 0x55eeb9fa3519 thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: NoUnwindInfoForAddress', src/libcore/result.rs:999:5 [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "" at 0x55eeb9d7b000 with 12 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "linux-vdso.so.1" at 0x7ffeca7bd000 with 4 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/libdl.so.2" at 0x7f0c1e6bb000 with 10 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/librt.so.1" at 0x7f0c1e6b1000 with 10 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/libpthread.so.0" at 0x7f0c1e690000 with 12 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/usr/lib/libc.so.6" at 0x7f0c1e4cc000 with 13 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] "/lib64/ld-linux-x86-64.so.2" at 0x7f0c1e6e6000 with 10 segments [2019-01-03T23:20:27Z TRACE unwind::find_cfi::imp] CFI sections: [EhRef { obj_base: 94483808497664, text: AddrRange { start: 94483808497664, end: 94483808814184 }, cfi: AddrRange { start: 94483811286948, end: 94483811366512 }, ehframe_end: 94483811902144 }, EhRef 
{ obj_base: 140732295532544, text: AddrRange { start: 140732295532544, end: 140732295536411 }, cfi: AddrRange { start: 140732295534548, end: 140732295534616 }, ehframe_end: 140732295536411 }, EhRef { obj_base: 139690026708992, text: AddrRange { start: 139690026708992, end: 139690026712592 }, cfi: AddrRange { start: 139690026717348, end: 139690026717544 }, ehframe_end: 139690026725520 }, EhRef { obj_base: 139690026668032, text: AddrRange { start: 139690026668032, end: 139690026676000 }, cfi: AddrRange { start: 139690026693472, end: 139690026694044 }, ehframe_end: 139690026707448 }, EhRef { obj_base: 139690026532864, text: AddrRange { start: 139690026532864, end: 139690026556824 }, cfi: AddrRange { start: 139690026621920, end: 139690026624380 }, ehframe_end: 139690026664392 }, EhRef { obj_base: 139690024681472, text: AddrRange { start: 139690024681472, end: 139690024817672 }, cfi: AddrRange { start: 139690026312016, end: 139690026337076 }, ehframe_end: 139690026530368 }, EhRef { obj_base: 139690026885120, text: AddrRange { start: 139690026885120, end: 139690026889328 }, cfi: AddrRange { start: 139690027038432, end: 139690027040212 }, ehframe_end: 139690027061528 }] [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x55eeba037670 sz 82c50 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7ffeca7bd818 sz 703 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e6bd168 sz 1f28 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e6b75a0 sz 3458 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e6a6580 sz 9c48 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e660338 sz 2f308 [2019-01-03T23:20:27Z TRACE unwind] cfi at 0x7f0c1e70bdd8 sz 5340 [2019-01-03T23:20:27Z DEBUG unwind] caller is 0x55eeb9fa3580 thread panicked while processing panic. aborting. Illegal instruction (core dumped) ```
roblabla commented 5 years ago

This happens because the caller address is outside the text region we find. The ld find_cfi method calculates the text region by looking at the first PT_LOAD. Unfortunately, in this binary's case, the first PT_LOAD is not the executable, but a read-only bit. So the text region we find is actually not the .text, but some other segment.
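To make the mismatch concrete, here is a small check using numbers copied from the trace above: the `text` range logged for the main binary and the first `caller` address. The `AddrRange` struct and the `contains` helper are stand-ins written for this sketch, not the crate's actual definitions.

```rust
/// Stand-in for the address range printed in the trace (hypothetical type,
/// not the crate's real definition).
#[derive(Debug, Clone, Copy)]
struct AddrRange {
    start: u64,
    end: u64,
}

impl AddrRange {
    /// True if `addr` lies in the half-open range `[start, end)`.
    fn contains(&self, addr: u64) -> bool {
        self.start <= addr && addr < self.end
    }
}

/// `text` range logged for the main binary (first entry of "CFI sections").
const MAIN_TEXT: AddrRange = AddrRange {
    start: 94483808497664, // 0x55eeb9d7b000, the object base
    end: 94483808814184,
};

/// Address from the "caller is 0x55eeb9fa3519" log line.
const CALLER: u64 = 0x55eeb9fa3519;

/// Whether the logged caller falls inside the logged text range.
fn caller_is_covered() -> bool {
    MAIN_TEXT.contains(CALLER)
}
```

The caller sits above the end of the range, so a lookup restricted to that range has nothing to match against.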

I believe we should use the flags to find the executable section.

```
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000000040 0x0000000000000040 0x0002a0 0x0002a0 R   0x8
  INTERP         0x0002e0 0x00000000000002e0 0x00000000000002e0 0x00001c 0x00001c R   0x1
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x04d510 0x04d510 R   0x1000
  LOAD           0x04e000 0x000000000004e000 0x000000000004e000 0x2150f5 0x2150f5 R E 0x1000
  LOAD           0x264000 0x0000000000264000 0x0000000000264000 0x0aa2e8 0x0aa2e8 R   0x1000
  LOAD           0x30e900 0x000000000030f900 0x000000000030f900 0x030790 0x0309c0 RW  0x1000
  DYNAMIC        0x334778 0x0000000000335778 0x0000000000335778 0x000230 0x000230 RW  0x8
  NOTE           0x0002fc 0x00000000000002fc 0x00000000000002fc 0x000044 0x000044 R   0x4
  TLS            0x30e900 0x000000000030f900 0x000000000030f900 0x000050 0x000120 R   0x20
  GNU_EH_FRAME   0x2aa014 0x00000000002aa014 0x00000000002aa014 0x0136fc 0x0136fc R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x10
  GNU_RELRO      0x30e900 0x000000000030f900 0x000000000030f900 0x030700 0x030700 R   0x1

 Section to Segment mapping:
  Segment Sections...
   00
   01     .interp
   02     .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
   03     .init .plt .plt.got .text .fini
   04     .rodata .debug_gdb_scripts .eh_frame_hdr .eh_frame .gcc_except_table
   05     .tdata .init_array .fini_array .data.rel.ro .dynamic .got .data .bss
   06     .dynamic
   07     .note.ABI-tag .note.gnu.build-id
   08     .tdata .tbss
   09     .eh_frame_hdr
   10
   11     .tdata .init_array .fini_array .data.rel.ro .dynamic .got
```
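A rough sketch of what "use the flags" could look like, assuming the program headers are already available as a slice. The `Phdr` struct, `find_text_range`, and `demo_load_segments` are names made up for this illustration and are not the crate's API; the `PT_LOAD` and `PF_*` values are the standard ELF constants, and the segment values are the four LOAD entries from the readelf output above.

```rust
/// Minimal stand-in for an ELF program header (hypothetical; a real
/// implementation would get these from the dynamic loader).
#[derive(Debug, Clone, Copy)]
struct Phdr {
    p_type: u32,
    p_flags: u32,
    p_vaddr: u64,
    p_memsz: u64,
}

const PT_LOAD: u32 = 1; // loadable segment
const PF_X: u32 = 1; // executable
const PF_W: u32 = 2; // writable
const PF_R: u32 = 4; // readable

/// Return the runtime `[start, end)` range of the first *executable*
/// PT_LOAD segment, given the object's load base.
fn find_text_range(base: u64, phdrs: &[Phdr]) -> Option<(u64, u64)> {
    phdrs
        .iter()
        .find(|ph| ph.p_type == PT_LOAD && (ph.p_flags & PF_X) != 0)
        .map(|ph| {
            let start = base + ph.p_vaddr;
            (start, start + ph.p_memsz)
        })
}

/// The four LOAD entries from the readelf output above.
fn demo_load_segments() -> Vec<Phdr> {
    vec![
        Phdr { p_type: PT_LOAD, p_flags: PF_R, p_vaddr: 0x0, p_memsz: 0x04d510 },
        Phdr { p_type: PT_LOAD, p_flags: PF_R | PF_X, p_vaddr: 0x04e000, p_memsz: 0x2150f5 },
        Phdr { p_type: PT_LOAD, p_flags: PF_R, p_vaddr: 0x264000, p_memsz: 0x0aa2e8 },
        Phdr { p_type: PT_LOAD, p_flags: PF_R | PF_W, p_vaddr: 0x30f900, p_memsz: 0x0309c0 },
    ]
}
```

With these inputs the second LOAD segment is selected, which is the one containing `.text` according to the section-to-segment mapping. Taking the first PT_LOAD instead would yield the read-only segment that starts at offset 0.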
roblabla commented 5 years ago

Fixed the above problem with #23. Unfortunately, it's not enough. I end up executing a ud2, causing an illegal instruction exception. Freaky. Here's the backtrace from gdb:

(gdb) bt
#0  _Unwind_RaiseException (exception=0x555555894db0) at unwind/src/libunwind_shim.rs:111
#1  0x000055555579fa59 in rust_panic () at src/libstd/panicking.rs:527
#2  0x000055555579fa32 in std::panicking::rust_panic_with_hook () at src/libstd/panicking.rs:498
#3  0x000055555575b6e8 in std::panicking::begin_panic (msg=..., file_line_col=0x555555862a10)
    at /rustc/b2b7a063af39455d7362524da3123c34c3f4842e/src/libstd/panicking.rs:412
#4  0x00005555555a246d in demo::bar () at unwind/examples/demo.rs:13
#5  0x00005555555a24b3 in demo::foo () at unwind/examples/demo.rs:18
#6  0x00005555555a24ef in demo::main () at unwind/examples/demo.rs:24

And the asm:

0x55555575be10 <_Unwind_RaiseException>         sub    rsp,0x48
0x55555575be14 <_Unwind_RaiseException+4>       mov    QWORD PTR [rsp+0x8],rdi
0x55555575be19 <_Unwind_RaiseException+9>       mov    rdi,QWORD PTR [rsp+0x8]
0x55555575be1e <_Unwind_RaiseException+14>      mov    QWORD PTR [rdi+0x10],0x0
0x55555575be26 <_Unwind_RaiseException+22>      lea    rdi,[rip+0x1c0a3]        # 0x555555777ed0 <<unwind::DwarfUnwinder as core::default::Default>::default>
0x55555575be2d <_Unwind_RaiseException+29>      lea    rax,[rsp+0x10]
0x55555575be32 <_Unwind_RaiseException+34>      mov    QWORD PTR [rsp],rdi
0x55555575be36 <_Unwind_RaiseException+38>      mov    rdi,rax
0x55555575be39 <_Unwind_RaiseException+41>      mov    rax,QWORD PTR [rsp]
0x55555575be3d <_Unwind_RaiseException+45>      call   rax
0x55555575be3f <_Unwind_RaiseException+47>      jmp    0x55555575be45 <_Unwind_RaiseException+53>
0x55555575be41 <_Unwind_RaiseException+49>      ud2
0x55555575be43 <_Unwind_RaiseException+51>      ud2
0x55555575be45 <_Unwind_RaiseException+53>      lea    rax,[rsp+0x8]
0x55555575be4a <_Unwind_RaiseException+58>      mov    QWORD PTR [rsp+0x30],rax

The binary is at https://dl.roblab.la/demo , and verbose logs in the details below. I'm going to sleep on this for now, hoping for a stroke of genius.

``` Finished dev [unoptimized + debuginfo] target(s) in 0.14s Running `target/debug/examples/demo` thread 'main' panicked at 'test panic', unwind/examples/demo.rs:13:5 note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace. [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 319488, vaddr: 319488, paddr: 319488, filesz: 2181541, memsz: 2181541, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 0, vaddr: 0, paddr: 0, filesz: 3867, memsz: 3867, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 4096, vaddr: 4096, paddr: 4096, filesz: 3609, memsz: 3609, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 8192, vaddr: 8192, paddr: 8192, filesz: 13077, memsz: 13077, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 24576, vaddr: 24576, paddr: 24576, filesz: 58389, memsz: 58389, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 139264, vaddr: 139264, paddr: 139264, filesz: 1353770, memsz: 1353770, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 8192, vaddr: 8192, paddr: 8192, filesz: 124804, memsz: 124804, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] CFI sections: [EhRef { obj_base: 559a3f4bc000, text: AddrRange { start: 559a3f50a000, end: 559a3f71e9a5 }, cfi: AddrRange { start: 559a3f764eb4, end: 559a3f778578 }, ehframe_end: 559a3f7fb2c0 }, EhRef { obj_base: 7fff44fa7000, text: AddrRange { start: 7fff44fa7000, end: 7fff44fa7f1b }, cfi: AddrRange { start: 7fff44fa77d4, end: 7fff44fa7818 }, ehframe_end: 7fff44fa7f1b }, EhRef { obj_base: 7f6d3717e000, text: AddrRange { start: 7f6d3717f000, end: 7f6d3717fe19 }, cfi: AddrRange { start: 7f6d371800a4, end: 7f6d37180168 }, ehframe_end: 
7f6d37182090 }, EhRef { obj_base: 7f6d37174000, text: AddrRange { start: 7f6d37176000, end: 7f6d37179315 }, cfi: AddrRange { start: 7f6d3717a360, end: 7f6d3717a59c }, ehframe_end: 7f6d3717d9f8 }, EhRef { obj_base: 7f6d37153000, text: AddrRange { start: 7f6d37159000, end: 7f6d37167415 }, cfi: AddrRange { start: 7f6d37168be0, end: 7f6d3716957c }, ehframe_end: 7f6d371731c8 }, EhRef { obj_base: 7f6d36f8f000, text: AddrRange { start: 7f6d36fb1000, end: 7f6d370fb82a }, cfi: AddrRange { start: 7f6d3711d150, end: 7f6d37123334 }, ehframe_end: 7f6d37152640 }, EhRef { obj_base: 7f6d371a9000, text: AddrRange { start: 7f6d371ab000, end: 7f6d371c9784 }, cfi: AddrRange { start: 7f6d371ce6e0, end: 7f6d371cedd4 }, ehframe_end: 7f6d371d4118 }] [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x559a3f778578 sz 82d48 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7fff44fa7818 sz 703 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d37180168 sz 1f28 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d3717a5a0 sz 3458 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d37169580 sz 9c48 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d37123338 sz 2f308 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d371cedd8 sz 5340 [2019-01-04T00:29:29Z DEBUG unwind] caller is 0x559a3f6e4529 [2019-01-04T00:29:29Z TRACE unwind] ok: RegisterAndOffset { register: 7, offset: 64 } (0x559a3f6e44f4 - 0x559a3f6e4540) [2019-01-04T00:29:29Z TRACE unwind] cfa is 0x7fff44f92770 [2019-01-04T00:29:29Z TRACE unwind::libunwind_shim] HAS PERSONALITY [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 319488, vaddr: 319488, paddr: 319488, filesz: 2181541, memsz: 2181541, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 0, vaddr: 0, paddr: 0, filesz: 3867, memsz: 3867, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 4096, vaddr: 4096, paddr: 4096, filesz: 3609, memsz: 3609, align: 4096 } 
[2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 8192, vaddr: 8192, paddr: 8192, filesz: 13077, memsz: 13077, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 24576, vaddr: 24576, paddr: 24576, filesz: 58389, memsz: 58389, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 139264, vaddr: 139264, paddr: 139264, filesz: 1353770, memsz: 1353770, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] Phdr64 { type_: 1, flags: 5, offset: 8192, vaddr: 8192, paddr: 8192, filesz: 124804, memsz: 124804, align: 4096 } [2019-01-04T00:29:29Z TRACE unwind::find_cfi::imp] CFI sections: [EhRef { obj_base: 559a3f4bc000, text: AddrRange { start: 559a3f50a000, end: 559a3f71e9a5 }, cfi: AddrRange { start: 559a3f764eb4, end: 559a3f778578 }, ehframe_end: 559a3f7fb2c0 }, EhRef { obj_base: 7fff44fa7000, text: AddrRange { start: 7fff44fa7000, end: 7fff44fa7f1b }, cfi: AddrRange { start: 7fff44fa77d4, end: 7fff44fa7818 }, ehframe_end: 7fff44fa7f1b }, EhRef { obj_base: 7f6d3717e000, text: AddrRange { start: 7f6d3717f000, end: 7f6d3717fe19 }, cfi: AddrRange { start: 7f6d371800a4, end: 7f6d37180168 }, ehframe_end: 7f6d37182090 }, EhRef { obj_base: 7f6d37174000, text: AddrRange { start: 7f6d37176000, end: 7f6d37179315 }, cfi: AddrRange { start: 7f6d3717a360, end: 7f6d3717a59c }, ehframe_end: 7f6d3717d9f8 }, EhRef { obj_base: 7f6d37153000, text: AddrRange { start: 7f6d37159000, end: 7f6d37167415 }, cfi: AddrRange { start: 7f6d37168be0, end: 7f6d3716957c }, ehframe_end: 7f6d371731c8 }, EhRef { obj_base: 7f6d36f8f000, text: AddrRange { start: 7f6d36fb1000, end: 7f6d370fb82a }, cfi: AddrRange { start: 7f6d3711d150, end: 7f6d37123334 }, ehframe_end: 7f6d37152640 }, EhRef { obj_base: 7f6d371a9000, text: AddrRange { start: 7f6d371ab000, end: 7f6d371c9784 }, cfi: AddrRange { start: 7f6d371ce6e0, end: 7f6d371cedd4 }, ehframe_end: 7f6d371d4118 }] 
[2019-01-04T00:29:29Z TRACE unwind] cfi at 0x559a3f778578 sz 82d48 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7fff44fa7818 sz 703 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d37180168 sz 1f28 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d3717a5a0 sz 3458 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d37169580 sz 9c48 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d37123338 sz 2f308 [2019-01-04T00:29:29Z TRACE unwind] cfi at 0x7f6d371cedd8 sz 5340 [2019-01-04T00:29:29Z DEBUG unwind] caller is 0x559a3f6e45f9 [2019-01-04T00:29:29Z TRACE unwind] ok: RegisterAndOffset { register: 7, offset: 64 } (0x559a3f6e45c4 - 0x559a3f6e4610) [2019-01-04T00:29:29Z TRACE unwind] cfa is 0x7fff44f926e0 [2019-01-04T00:29:29Z TRACE unwind] rule 16 Offset(-8) [2019-01-04T00:29:29Z TRACE unwind] registers: XXX XXX XXX 0x0 XXX XXX 0x559a3f760e50 0x7fff44f926e0 XXX XXX XXX XXX 0x0 0x17 0x5 0x1 0x559a3f6c3ab5 [2019-01-04T00:29:29Z DEBUG unwind] caller is 0x559a3f6c3ab4 [2019-01-04T00:29:29Z TRACE unwind] ok: RegisterAndOffset { register: 7, offset: 80 } (0x559a3f6c3a74 - 0x559a3f6c3b0e) [2019-01-04T00:29:29Z TRACE unwind] cfa is 0x7fff44f92730 [2019-01-04T00:29:29Z TRACE unwind] rule 16 Offset(-8) [2019-01-04T00:29:29Z TRACE unwind] registers: XXX XXX XXX 0x0 XXX XXX 0x559a3f760e50 0x7fff44f92730 XXX XXX XXX XXX 0x0 0x17 0x5 0x1 0x559a3f6e4536 [2019-01-04T00:29:29Z DEBUG unwind] caller is 0x559a3f6e4535 [2019-01-04T00:29:29Z TRACE unwind] ok: RegisterAndOffset { register: 7, offset: 64 } (0x559a3f6e44f4 - 0x559a3f6e4540) [2019-01-04T00:29:29Z TRACE unwind] cfa is 0x7fff44f92770 [2019-01-04T00:29:29Z TRACE unwind] rule 16 Offset(-8) [2019-01-04T00:29:29Z TRACE unwind] registers: XXX XXX XXX 0x0 XXX XXX 0x559a3f760e50 0x7fff44f92770 XXX XXX XXX XXX 0x0 0x17 0x5 0x1 0x559a3f6c3e62 [2019-01-04T00:29:29Z DEBUG unwind] caller is 0x559a3f6c3e61 [2019-01-04T00:29:29Z TRACE unwind] ok: RegisterAndOffset { register: 7, offset: 80 } (0x559a3f6c3e14 - 0x559a3f6c3ebb) [2019-01-04T00:29:29Z 
TRACE unwind] cfa is 0x7fff44f927c0 [2019-01-04T00:29:29Z TRACE unwind::libunwind_shim] HAS PERSONALITY Illegal instruction (core dumped) ```
kitlith commented 5 years ago

looks like the line it's dying on is https://github.com/gimli-rs/unwind-rs/blob/master/unwind/src/libunwind_shim.rs#L148 if that helps at all.

roblabla commented 5 years ago

Alright so after tracing execution many times in GDB, it seems like a problem in _Unwind_Resume. The unwind_lander somehow ends up restoring the stack right there: https://github.com/gimli-rs/unwind-rs/blob/master/unwind/src/libunwind_shim.rs#L114.

Which obviously causes all sorts of issues. It's supposed to be skipping over this stack, so I guess we're doing something wrong in the code that's skipping over the frames in the resume.

main-- commented 5 years ago

FWIW I bisected this using rustup:

Good: 1.32.0-nightly (4a45578bc 2018-12-07)
Bad: 1.32.0-nightly (f4a421ee3 2018-12-13)

Sadly, there are no nightly builds from Dec 08 to Dec 12

I had a very quick skim over the 131 commits between those two but nothing obvious stuck out.

philipc commented 5 years ago

The unwinding all looks fine to me, but, from looking at the MIR/LLVM-IR, Rust is generating an abort for the unwind cleanup in _Unwind_RaiseException. Adding #[unwind(allowed)] to _Unwind_RaiseException gets it working. I don't understand this enough yet to know if that's correct, and I haven't found out which Rust commit caused this.

philipc commented 5 years ago

I don't understand how cleanup of DwarfUnwinder is meant to work. It seems to me that we leak a DwarfUnwinder every time _Unwind_Resume is called, because we skip past the stack frame containing it while looking for private_contptr. Of course, we can't clean it up as part of the unwinding, because the unwinding needs it. Maybe we shouldn't be allocating DwarfUnwinder on the stack?

This is related to _Unwind_RaiseException in that _Unwind_RaiseException doesn't manage to achieve any unwinding... it walks through the frames until it finds itself, discovers it needs to clean up DwarfUnwinder, stops unwinding and does so, then calls _Unwind_Resume.

(This is based on my memory from a couple of hours ago, I haven't double checked the details.)

main-- commented 5 years ago

Oh of course! This is https://github.com/rust-lang/rust/pull/55982 ! Adding the allow-unwind attribute to all those extern functions is certainly the correct fix.

From what I remember the current implementation just hopes that the unwinding code never reaches high enough on the stack to actually touch the unwinder object. Of course this is bad and should be replaced with a proper solution eventually.

philipc commented 5 years ago

We only need to add the allow-unwind attribute to _Unwind_RaiseException, since it is the only one that we need to allow unwinding for (and we'll need to be careful we support this correctly, currently we don't). None of the rest should ever need unwinding. If they do, it's a bug and the correct behaviour is to abort.

Another option would be to get the context immediately in _Unwind_RaiseException before calling anything else, and skip over everything up to and including it. This would make its operation more like _Unwind_Resume.

main-- commented 5 years ago

> None of the rest should ever need unwinding. If they do, it's a bug and the correct behaviour is to abort.

Agreed.

> and skip over everything up to and including it

The problem I see with skipping frames is inlining. I would much rather have a single clean cut as we do right now, where everything below that is getting unwound and everything above is not.

philipc commented 5 years ago

> I would much rather have a single clean cut as we do right now, where everything below that is getting unwound and everything above is not.

Right. I was misunderstanding things a bit. But still, I think we could change where that single clean cut is, so that it is below the cleanup that aborts. And we could do that by getting the context immediately in _Unwind_RaiseException (which would now need to be an asm function).

philipc commented 5 years ago

Probably should leave this open, because it's going to break on stable rust soon.

main-- commented 5 years ago

I might be missing something here, but why not just modify unwind_tracer to return an option of registers to jump to? Because I just remembered that the reason DwarfUnwinder::trace takes a closure is because everything outside of the closure is affected by unwinding while everything inside is not. In fact, the stack problem I described earlier should not even exist, I think I just forgot that I fixed this already. The only problematic part I see right now is that stack objects of unwind_tracer would be leaked.
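A toy model of that idea, to show the shape rather than the real code: the tracer runs inside the closure and only returns the registers to jump to, and the caller performs the jump after the unwinder has gone out of scope. Everything here (`Registers`, `TracingUnwinder`, the `unwind_tracer` signature, `raise`) is hypothetical and only simulates the control flow; no actual register restore happens.

```rust
use std::cell::Cell;

/// Hypothetical register set to restore when jumping to a landing pad.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Registers {
    ip: u64,
    sp: u64,
}

/// Stand-in for the real unwinder; it records when it is dropped so the
/// ordering can be observed.
struct TracingUnwinder<'a> {
    dropped: &'a Cell<bool>,
}

impl<'a> Drop for TracingUnwinder<'a> {
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

impl<'a> TracingUnwinder<'a> {
    /// Run `f` with access to the unwinder and hand back its result.
    fn trace<F, R>(&mut self, f: F) -> R
    where
        F: FnOnce(&mut Self) -> R,
    {
        f(self)
    }
}

/// Walk frames inside the closure and return the registers of the landing
/// pad, if one was found. Nothing is jumped to in here.
fn unwind_tracer(unwinder: &mut TracingUnwinder<'_>, landing_pad: Option<u64>) -> Option<Registers> {
    unwinder.trace(|_frames| landing_pad.map(|ip| Registers { ip, sp: 0x7fff_0000 }))
}

/// Caller side: the unwinder is dropped before the (simulated) jump.
fn raise(dropped: &Cell<bool>, landing_pad: Option<u64>) -> Option<(Registers, bool)> {
    let target = {
        let mut unwinder = TracingUnwinder { dropped };
        unwind_tracer(&mut unwinder, landing_pad)
    }; // `unwinder` is dropped at the end of this block
    // A real implementation would restore registers here; this sketch only
    // reports the target and whether the unwinder was already dropped.
    target.map(|regs| (regs, dropped.get()))
}
```

In this model the destructor has always run by the time the jump would happen, so nothing of the unwinder is left on the stack for a landing pad to clean up.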

philipc commented 5 years ago

> why not just modify unwind_tracer to return an option of registers to jump to?

Yep I've been looking into doing that.

> everything outside of the closure is affected by unwinding while everything inside is not

This is true for _Unwind_RaiseException but not for _Unwind_Resume, since resume skips some outside the closure.

> In fact, the stack problem I described earlier should not even exist, I think I just forgot that I fixed this already.

Not sure I follow. The DwarfUnwinder is outside the closure, so it is affected by unwinding. So what I want to do is move DwarfUnwinder inside the closure.

main-- commented 5 years ago

> Not sure I follow. The DwarfUnwinder is outside the closure, so it is affected by unwinding. So what I want to do is move DwarfUnwinder inside the closure.

I don't see how this is possible, considering that the closure is passed to a function on the unwinder object. My point is that because we construct an entirely new unwinder in _Unwind_Resume (as we probably should (?)) it is actually correct to destroy the unwinder object once we have identified a landing pad to jump to. Although - now that I think about it - shouldn't the unwinder's drop function in this case be called by an actual landing pad? Which would then loop infinitely (assuming we stop leaking the unwinder in _Unwind_Resume)? My head hurts.

Basically I'm unsure if the way I implemented _Unwind_Resume is entirely correct. My assumptions right now are that for any destructors to run, the personality function gives us a landing pad that we jump to, which then invokes _Unwind_Resume after doing its cleanup. But I assume we are not supposed to unwind the actual landing pad. Which makes _Unwind_Resume quite unique/awkward in how it operates - basically we have to assume that the stack below _Unwind_Resume is garbage and continue at a known-good point - so exactly what we are doing right now.

But if any of these assumptions are incorrect perhaps we should go for a totally different solution. If possible, I would like to keep _Unwind_RaiseException and _Unwind_Resume as similar as possible, it would be great if we could just get rid of the skip-hack for _Unwind_Resume.

At this point I feel like at least for Rust we can just Box::leak the DwarfUnwinder to avoid dealing with destructors in this code entirely. Last time I checked Rust had a straight abort for OOM and we also get the minor benefit of reusing the unwinder in _Unwind_Resume instead of creating a new one.
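For illustration, a minimal sketch of the `Box::leak` approach with a mock type standing in for `DwarfUnwinder`. `MockUnwinder`, `leaked_unwinder`, `raise_once`, and the `DROPS` counter are invented for this example; only `Box::leak` itself is the real standard-library call.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many times the mock unwinder's destructor has run.
static DROPS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for `DwarfUnwinder` (hypothetical; the real type holds CFI data).
#[derive(Default)]
struct MockUnwinder {
    traces: usize,
}

impl Drop for MockUnwinder {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Allocate the unwinder on the heap and deliberately never free it.
fn leaked_unwinder() -> &'static mut MockUnwinder {
    Box::leak(Box::new(MockUnwinder::default()))
}

/// Simulate one raise step using a leaked unwinder.
fn raise_once() -> usize {
    let unwinder = leaked_unwinder();
    unwinder.traces += 1;
    unwinder.traces
    // `unwinder` is only a reference, so no destructor runs when it goes
    // out of scope and the frame needs no cleanup for it.
}
```

The cost is one leaked allocation per call in this form; reusing a single leaked instance, as suggested above for _Unwind_Resume, would need the pointer stored somewhere reachable, which this sketch does not attempt.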

philipc commented 5 years ago

> Although - now that I think about it - shouldn't the unwinder's drop function in this case be called by an actual landing pad? Which would then loop infinitely (assuming we stop leaking the unwinder in _Unwind_Resume)?

Yes it is, and this is what causes the problem in this issue. Without #[unwind(allowed)], that landing pad will abort. And with #[unwind(allowed)], this means _Unwind_RaiseException accomplishes nothing other than setting private_contptr and calling _Unwind_Resume via the landing pad. If _Unwind_Resume didn't skip frames, then yes this would loop infinitely.