lnicola closed 7 years ago
After booting, do you have any core dump in your journal?
Yes, sorry I missed it. It's crashing with SIGSEGV somewhere around a NULL pointer: https://i.imgur.com/9XPQ0qS.jpg. Not quite sure why the faulting address is -4, while the instruction seems to look 8 bytes before rdi. Could rdi be 4?
4172a0: 53 push %rbx
4172a1: 48 83 ec 10 sub $0x10,%rsp
4172a5: 48 8b 05 1c ca 28 00 mov 0x28ca1c(%rip),%rax # 0x6a3cc8
4172ac: 48 85 c0 test %rax,%rax
4172af: 0f 85 8b 00 00 00 jne 0x417340
4172b5: 48 85 ff test %rdi,%rdi
4172b8: 74 7e je 0x417338
4172ba: 48 8b 47 f8 mov -0x8(%rdi),%rax # <------ HERE
4172be: 48 8d 77 f0 lea -0x10(%rdi),%rsi
4172c2: a8 02 test $0x2,%al
4172c4: 75 32 jne 0x4172f8
4172c6: 64 48 83 3c 25 c8 ff cmpq $0x0,%fs:0xffffffffffffffc8
4172cd: ff ff 00
4172d0: 74 7e je 0x417350
4172d2: a8 04 test $0x4,%al
4172d4: 48 8d 3d 05 a5 28 00 lea 0x28a505(%rip),%rdi # 0x6a17e0
4172db: 74 0c je 0x4172e9
4172dd: 48 89 f0 mov %rsi,%rax
4172e0: 48 25 00 00 00 fc and $0xfffffffffc000000,%rax
4172e6: 48 8b 38 mov (%rax),%rdi
4172e9: 48 83 c4 10 add $0x10,%rsp
4172ed: 31 d2 xor %edx,%edx
4172ef: 5b pop %rbx
4172f0: e9 bb c4 ff ff jmpq 0x4137b0
4172f5: 0f 1f 00 nopl (%rax)
4172f8: 8b 15 76 a4 28 00 mov 0x28a476(%rip),%edx # 0x6a1774
4172fe: 85 d2 test %edx,%edx
417300: 75 26 jne 0x417328
417302: 48 3b 05 47 a4 28 00 cmp 0x28a447(%rip),%rax # 0x6a1750
417309: 76 1d jbe 0x417328
41730b: 48 3d 00 00 00 02 cmp $0x2000000,%rax
417311: 77 15 ja 0x417328
417313: 48 83 e0 f8 and $0xfffffffffffffff8,%rax
417317: 48 89 05 32 a4 28 00 mov %rax,0x28a432(%rip) # 0x6a1750
41731e: 48 01 c0 add %rax,%rax
417321: 48 89 05 18 a4 28 00 mov %rax,0x28a418(%rip) # 0x6a1740
417328: 48 83 c4 10 add $0x10,%rsp
41732c: 48 89 f7 mov %rsi,%rdi
41732f: 5b pop %rbx
417330: e9 bb ae ff ff jmpq 0x4121f0
417335: 0f 1f 00 nopl (%rax)
417338: 48 83 c4 10 add $0x10,%rsp
41733c: 5b pop %rbx
41733d: c3 retq
The crashing function is free, presumably called with a NULL pointer.
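The arithmetic behind the "could rdi be 4?" question can be checked with a small sketch. It assumes the faulting address really was -4 (as a 64-bit value) and that the fault came from the load at 4172ba, which reads from rdi - 8; the helper name `base_from_fault` is made up for illustration.

```python
MASK64 = (1 << 64) - 1  # registers and addresses wrap at 64 bits


def base_from_fault(fault_addr: int, displacement: int) -> int:
    """Recover the base register from a faulting address.

    For an operand like -0x8(%rdi), the accessed address is
    rdi + displacement, so rdi = fault_addr - displacement (mod 2**64).
    """
    return (fault_addr - displacement) & MASK64


# The faulting address -4 as the kernel sees it: 0xfffffffffffffffc.
fault = -4 & MASK64

# 4172ba: mov -0x8(%rdi),%rax  -> displacement is -8
rdi = base_from_fault(fault, -0x8)
print(hex(rdi))  # prints 0x4
```

Under those assumptions rdi is 4, not 0. That is consistent with the listing: the test/je pair at 4172b5 returns early for a NULL argument, so a fault at 4172ba suggests a small non-NULL bogus pointer rather than NULL itself.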
This is what I have in the kernel command line: root=zfs:bike/zroot. I could try debugging this, but I'll have to change the code a bit or cherry-pick https://github.com/dasJ/sd-zfs/commit/5013a286e8c1ea80fff322717438ee8af1da3fc4.
Wow, I never expected that level of detail. I think I might have a fix for that, but I need to test it further.
@lnicola Can you test 1.0.2? It's pushed to the AUR
Yes, it's working now.
Probably the same issue reported on AUR.
I have two pools, one for the root fs and one for storage. When I drop to the shell on boot, if I run zpool import -a, the storage pool gets imported fine, while for the root pool I get an error saying that the host id has changed. I can import it with -f, but it fails again on the next boot.
Also, after the force import, I can boot with the standard initcpio with no issues (that is, it doesn't complain about the host id like I was afraid it might).
I think I've had issues in the past where I had one host id in the running system and another one in initcpio. Possibly related to https://github.com/archzfs/archzfs/commit/0760a006abf9c52fbfd0ea4d07189eb54efc5f43.
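One way to check for a host id mismatch like the one described above is to compare the host id file from the running system with the copy inside the initramfs image. The sketch below only shows the decoding step. It assumes the file holds 4 raw bytes in the machine's native byte order, which is how glibc's gethostid reads /etc/hostid; the function names and the sample bytes are made up for illustration.

```python
import sys


def decode_hostid(raw: bytes, byteorder: str = sys.byteorder) -> str:
    """Decode the 4 raw bytes of a hostid file into 8 hex digits."""
    if len(raw) != 4:
        raise ValueError(f"expected 4 bytes, got {len(raw)}")
    return format(int.from_bytes(raw, byteorder), "08x")


def hostids_match(running: bytes, initramfs: bytes) -> bool:
    """True if both copies of the hostid file decode to the same id."""
    return decode_hostid(running) == decode_hostid(initramfs)


# Made-up sample contents, standing in for the two copies of the file.
running_copy = bytes([0x0D, 0x0C, 0x0B, 0x0A])
initramfs_copy = bytes([0x00, 0x00, 0x00, 0x00])

print(decode_hostid(running_copy, "little"))
print(hostids_match(running_copy, initramfs_copy))
```

In practice the two byte strings would come from reading /etc/hostid on the booted system and the same path extracted from the initramfs image. If they differ, that would be consistent with a pool that needs a forced import on every boot.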