@JuliaVixen According to this error you saw during the zpool import:
Additional devices are known to be part of this pool, though their exact configuration cannot be determined.
the pool must have had another top-level vdev (i.e. a stripe). The import, even with readonly=on, should not have gotten as far as it did; instead, it should have failed with that exact same error. Since you're running Gentoo, my guess is that this might be an issue with the new API.
FYI, the pool configuration displayed is not the complete pool configuration, which is only stored on an object within the pool. Instead, it's displaying only the parts of the pool configuration which were actually discovered. The error quoted above does not refer to the missing mirror leg but, instead, refers to a missing top-level vdev.
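For anyone digging through this later, a hedged sketch of how to compare those two views with zdb: -l dumps only the per-device label, while -C asks for the assembled pool configuration (-e works on an exported pool, -p sets the device search directory); the exact invocation below is an assumption, check zdb(8) for your version:
# Per-device label only (what's dumped later in this thread):
zdb -l /dev/sdm
# Assembled configuration for the exported pool, built from whatever
# devices are discovered under the -p directory:
zdb -e -p /dev -C backup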
I plugged the drive back in to get the vdev labels... It kinda looks like there's only supposed to be two drives, if I understand this correctly.
localhost ~ # zdb -lu /dev/sdm
--------------------------------------------
LABEL 0
--------------------------------------------
version: 13
name: 'backup'
state: 1
txg: 16
pool_guid: 3472166890449163768
hostid: 2558674546
hostname: 'vulpis'
top_guid: 7962344807192492976
guid: 15917724193338767979
vdev_tree:
type: 'mirror'
id: 0
guid: 7962344807192492976
metaslab_array: 23
metaslab_shift: 31
ashift: 9
asize: 400083648512
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 18134448193074441749
path: '/dev/ad16'
whole_disk: 0
children[1]:
type: 'disk'
id: 1
guid: 15917724193338767979
path: '/dev/ada4'
whole_disk: 0
Uberblock[4]
magic = 0000000000bab10c
version = 13
txg = 4
guid_sum = 8593195936635763240
timestamp = 1262001087 UTC = Mon Dec 28 11:51:27 2009
Uberblock[5]
magic = 0000000000bab10c
version = 13
txg = 5
guid_sum = 8593195936635763240
timestamp = 1262001087 UTC = Mon Dec 28 11:51:27 2009
Uberblock[7]
magic = 0000000000bab10c
version = 13
txg = 7
guid_sum = 5830045293826664715
timestamp = 1262001131 UTC = Mon Dec 28 11:52:11 2009
Uberblock[8]
magic = 0000000000bab10c
version = 13
txg = 8
guid_sum = 5830045293826664715
timestamp = 1262001131 UTC = Mon Dec 28 11:52:11 2009
Uberblock[9]
magic = 0000000000bab10c
version = 13
txg = 9
guid_sum = 5830045293826664715
timestamp = 1262001131 UTC = Mon Dec 28 11:52:11 2009
Uberblock[10]
magic = 0000000000bab10c
version = 13
txg = 10
guid_sum = 5830045293826664715
timestamp = 1262001132 UTC = Mon Dec 28 11:52:12 2009
Uberblock[14]
magic = 0000000000bab10c
version = 13
txg = 14
guid_sum = 5830045293826664715
timestamp = 1262001249 UTC = Mon Dec 28 11:54:09 2009
Uberblock[16]
magic = 0000000000bab10c
version = 13
txg = 16
guid_sum = 5830045293826664715
timestamp = 1262001249 UTC = Mon Dec 28 11:54:09 2009
[LABEL 1, LABEL 2, and LABEL 3 are identical to LABEL 0.]
@JuliaVixen Hmm, the label is missing vdev_children, which is only possible for a pool created with a very old version of ZFS. It was added to illumos in illumos/illumos-gate@88ecc94, in the general timeframe of the txgs listed above (3 months earlier). On what OS was this pool created? It doesn't appear that it would contain much of anything: txg 4 is generally the creation txg, and the highest txg shown by your labels is 16. With only 12 txgs ever applied to the pool, it's hard to believe it contains much of interest, or that it was ever imported on a live system for very long.
This pool was created on FreeBSD release, um... I forgot, 7.something I think? (I still have the root fs drive, so I could theoretically check.) In 2009, I probably plugged this drive in (with a second drive too), created a zpool mirror named "backup", backed some stuff up to it, unplugged the drives, and hid them away (apparently in different locations) until I found one of them again in 2016.
I'm pretty sure there's data on it; I just can't get any information about that from zdb or any other tools.
I've probably only imported this once or twice, in 2009, just to copy data to it.
I agree with dweezil: the pool was only imported for around 3m18s, and there should be another missing disk above and beyond the ad16 listed. There may be data on it, but not much at all.
Allowing the import to proceed with a known missing top-level vdev just to promptly crash is a bug though.
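A file-backed repro along those lines might look like the sketch below; the pool and file names are made up and it's untested. One caveat: pools created by a current ZoL record vdev_children in their labels, so this may fail cleanly instead of crashing, since the panics in this thread involve labels that predate that field. Only try it on a disposable machine, because on an affected build the last command is the one that panics:
# Create a pool with two top-level vdevs (a stripe), then lose one of them.
truncate -s 512M /var/tmp/vdev-a /var/tmp/vdev-b
zpool create scratch /var/tmp/vdev-a /var/tmp/vdev-b
zpool export scratch
rm /var/tmp/vdev-b
zpool import -d /var/tmp                          # should show UNAVAIL: missing device
zpool import -d /var/tmp -o readonly=on scratch   # should refuse; the bug is crashing here instead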
Here's another interesting item shown by the vdev label: the guid_sum changed between txg 5 and txg 7, which means the pool configuration was changed early on.
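For reference, this can be checked by hand from the label above, assuming ub_guid_sum is simply the modulo-2^64 sum of the pool guid and every vdev guid in the tree (pool_guid, top_guid, and the two leaf guids); that formula is an assumption on my part, not something confirmed in this thread. Bash integer arithmetic wraps at 64 bits, so a one-liner suffices:
# pool_guid + top_guid + the two leaf guids from the label dumped above;
# bash arithmetic silently wraps modulo 2^64.
echo $(( 3472166890449163768 + 7962344807192492976 + 18134448193074441749 + 15917724193338767979 ))
# prints: 8593195936635763240
Under that assumption the label's guids reproduce the txg 4-5 value but not the 5830045293826664715 seen from txg 7 onward, which fits a top-level vdev being added to the configuration right around then.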
I manually ran the vdev checksum calculation and it definitely doesn't match any of the uberblocks. @JuliaVixen I don't know much about the various versions of ZoL available for Gentoo, but this still sounds like a regression introduced by the stable ABI which, as I understand it, is used on some Gentoo systems. The import process shouldn't get this far and, as @DeHackEd pointed out, that's the real bug here. Can you try with a more stock build of ZoL, assuming you are in fact using a version with the new stable API?
The zfs-9999, spl-9999, zfs-kmod-9999 ebuilds will have the current master without the stable API patches.
[Backup 30T of data...]
localhost ~ # echo "sys-kernel/spl ~amd64" >> /etc/portage/package.accept_keywords
localhost ~ # echo "sys-fs/zfs-kmod ~amd64" >> /etc/portage/package.accept_keywords
localhost ~ # echo "sys-fs/zfs ~amd64" >> /etc/portage/package.accept_keywords
localhost ~ # echo "=sys-kernel/spl-9999 **" >> /etc/portage/package.accept_keywords
localhost ~ # echo "=sys-fs/zfs-kmod-9999 **" >> /etc/portage/package.accept_keywords
localhost ~ # echo "=sys-fs/zfs-9999 **" >> /etc/portage/package.accept_keywords
localhost ~ # emerge =zfs-9999
[Installation of 50 packages later...]
[Reboot just to make sure the old kernel modules didn't stick around...]
[Plug old drive back in...]
localhost ~ # zdb -l /dev/sdg
--------------------------------------------
LABEL 0
--------------------------------------------
version: 13
name: 'backup'
state: 1
txg: 16
pool_guid: 3472166890449163768
hostid: 2558674546
[etc...]
localhost ~ # zpool import
no pools available to import
localhost ~ # zpool import -D
no pools available to import
localhost ~ # zpool import -o readonly=on backup
cannot import 'backup': no such pool available
localhost ~ # mkdir tmptmp
localhost ~ # cp -ai /dev/sdg* tmptmp
localhost ~ # zpool import -d tmptmp
pool: backup
id: 3472166890449163768
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://zfsonlinux.org/msg/ZFS-8000-6X
config:
backup UNAVAIL missing device
mirror-0 DEGRADED
ad16 UNAVAIL
sdg ONLINE
Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.
localhost ~ # zpool import -d tmptmp -o readonly=on backup
[ 962.631909] PANIC: blkptr at ffff880807a7d048 DVA 1 has invalid VDEV 1
[ 962.632014] Showing stack for process 4802
[ 962.632016] CPU: 0 PID: 4802 Comm: zpool Tainted: P O 4.4.6-gentoo #1
[ 962.632016] Hardware name: Supermicro X10SLL-F/X10SLL-SF/X10SLL-F/X10SLL-SF, BIOS 1.0a 06/11/2013
[ 962.632018] 0000000000000000 ffff8808048f3798 ffffffff8130feed 0000000000000003
[ 962.632019] 0000000000000001 ffff8808048f37a8 ffffffffa0a32239 ffff8808048f38d8
[ 962.632021] ffffffffa0a322d1 ffff8808048f3818 ffffffff814cf51c 61207274706b6c62
[ 962.632022] Call Trace:
[ 962.632026] [<ffffffff8130feed>] dump_stack+0x4d/0x64
[ 962.632031] [<ffffffffa0a32239>] spl_dumpstack+0x3d/0x3f [spl]
[ 962.632033] [<ffffffffa0a322d1>] vcmn_err+0x96/0xd3 [spl]
[ 962.632036] [<ffffffff814cf51c>] ? __schedule+0x5fe/0x744
[ 962.632038] [<ffffffff814cf75d>] ? schedule+0x72/0x81
[ 962.632039] [<ffffffff814d2048>] ? schedule_timeout+0x24/0x15b
[ 962.632042] [<ffffffff8105b4a0>] ? __sched_setscheduler+0x56a/0x73f
[ 962.632045] [<ffffffff810e9ac7>] ? cache_alloc_refill+0x69/0x4a3
[ 962.632059] [<ffffffffa0ef0936>] zfs_panic_recover+0x4d/0x4f [zfs]
[ 962.632064] [<ffffffffa0ea18ca>] ? arc_space_return+0x13e9/0x24ba [zfs]
[ 962.632071] [<ffffffffa0f2eb5a>] zfs_blkptr_verify+0x248/0x2a1 [zfs]
[ 962.632079] [<ffffffffa0f2ebe7>] zio_read+0x34/0x147 [zfs]
[ 962.632084] [<ffffffffa0ea18ca>] ? arc_space_return+0x13e9/0x24ba [zfs]
[ 962.632089] [<ffffffffa0ea3731>] arc_read+0x817/0x85d [zfs]
[ 962.632097] [<ffffffffa0eb372f>] dmu_objset_open_impl+0xf0/0x65d [zfs]
[ 962.632098] [<ffffffff810685cc>] ? add_wait_queue+0x44/0x44
[ 962.632109] [<ffffffffa0ecead8>] dsl_pool_init+0x2d/0x50 [zfs]
[ 962.632120] [<ffffffffa0ee823f>] spa_vdev_remove+0xb5d/0x2220 [zfs]
[ 962.632122] [<ffffffffa0a30e9a>] ? taskq_create+0x30e/0x6a5 [spl]
[ 962.632124] [<ffffffff810685cc>] ? add_wait_queue+0x44/0x44
[ 962.632126] [<ffffffffa0a897d2>] ? nvlist_remove_nvpair+0x2a8/0x314 [znvpair]
[ 962.632129] [<ffffffffa0ae8e68>] ? zpool_get_rewind_policy+0x116/0x13c [zcommon]
[ 962.632140] [<ffffffffa0ee93ee>] spa_vdev_remove+0x1d0c/0x2220 [zfs]
[ 962.632141] [<ffffffffa0ae8dda>] ? zpool_get_rewind_policy+0x88/0x13c [zcommon]
[ 962.632152] [<ffffffffa0ee9f04>] spa_import+0x191/0x640 [zfs]
[ 962.632161] [<ffffffffa0f115cc>] zfs_secpolicy_smb_acl+0x1485/0x42ac [zfs]
[ 962.632169] [<ffffffffa0f1609f>] pool_status_check+0x3b4/0x48f [zfs]
[ 962.632171] [<ffffffff810fca17>] do_vfs_ioctl+0x3f5/0x43d
[ 962.632173] [<ffffffff810368f2>] ? __do_page_fault+0x24e/0x367
[ 962.632174] [<ffffffff810fca98>] SyS_ioctl+0x39/0x61
[ 962.632176] [<ffffffff814d2b57>] entry_SYSCALL_64_fastpath+0x12/0x6a
[In other terminal]
localhost ~ # ps auwx | grep zpool
root 4802 0.0 0.0 566452 4852 pts/0 D+ 23:43 0:00 zpool import -d tmptmp -o readonly=on backup
[This is the exact GIT stuff I'm currently running]
localhost ~ # grep . /usr/portage/distfiles/git3-src/*/*HEAD*
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:39cd90ef08bb6817dd57ac08e9de5c87af2681ed https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:72e7de60262b8a1925e0a384a76cc1d745ea310e not-for-merge tag 'spl-0.4.0' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:60844c45530c731a83594162599181ab70ee3b6c not-for-merge tag 'spl-0.4.1' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:28d9aa198a32e3b683e741792436b69ead16de2e not-for-merge tag 'spl-0.4.2' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:17221eb570dea0bd581766a79656ad4c713ec759 not-for-merge tag 'spl-0.4.3' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:46a1d3fe02e7a59242b417e12332b871285ecb2d not-for-merge tag 'spl-0.4.4' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:f264e472ef078b51e4856218f456e0699a4dbd62 not-for-merge tag 'spl-0.4.5' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:6269bb7267809e221e4d63219e0960f8f6d71251 not-for-merge tag 'spl-0.4.6' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:3dbd424037e7dcdf74f7b2caa822b840f94a6cca not-for-merge tag 'spl-0.4.7' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:e86467fcf952b2734ee9939ef48252437a85baea not-for-merge tag 'spl-0.4.8' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:5c04498004a2a00f3ccc2542cc11a3e9902a304d not-for-merge tag 'spl-0.4.9' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:55d0582828402420da093e2a9002c2941f43bc3e not-for-merge tag 'spl-0.5.0' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:5580c16c869ceed5c4c16280b87ccefcb966950e not-for-merge tag 'spl-0.5.1' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:48eea97fed64c0fd6bac0dbf94788d11d69aac47 not-for-merge tag 'spl-0.5.2' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:7b85a4549f9624f69e52085f9ba72ec0845ec4e4 not-for-merge tag 'spl-0.6.0-rc1' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:64152b9093ae98187d97051e47e8de37f6aeac4b not-for-merge tag 'spl-0.6.0-rc10' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:bbcd7e15c58bcc2c3ceeb031a72e03cacfcd27b5 not-for-merge tag 'spl-0.6.0-rc11' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:1f49fc87fbea81d8e390b806f0499d6b633e5e2d not-for-merge tag 'spl-0.6.0-rc12' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:1cad00e56fcb7a6856ca84ffff4c3de17bcac6d4 not-for-merge tag 'spl-0.6.0-rc13' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:085913f746de3ccd1703203b452efc7a2b6b77ad not-for-merge tag 'spl-0.6.0-rc14' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:de69baa886347682efd53ecfae8a3b02fd12b60f not-for-merge tag 'spl-0.6.0-rc2' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:70a9fde629fc58d867f9ffd3abea773b77b4b370 not-for-merge tag 'spl-0.6.0-rc3' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:66d15f50659734e42876347ffc60a38899dd631c not-for-merge tag 'spl-0.6.0-rc4' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:bd8c888627201fbbbe3f5f031b4c199a4f374587 not-for-merge tag 'spl-0.6.0-rc5' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:133c515e9962b00a391d360a19d6c911df793d21 not-for-merge tag 'spl-0.6.0-rc6' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:658fb5fa08f10703b4d1861982dfc0d44da15db9 not-for-merge tag 'spl-0.6.0-rc7' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:55d3957b3fdee19e6e65f851fdd83f8874d856bb not-for-merge tag 'spl-0.6.0-rc8' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:b9910c44f097a5387cad97b003e8b6d7403381c9 not-for-merge tag 'spl-0.6.0-rc9' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:db4237ae0bc3fcb698c6f30962b2a67c03d2e1d7 not-for-merge tag 'spl-0.6.1' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:3395c5fa0533ae7fc6ae89ba314d2685e6feef37 not-for-merge tag 'spl-0.6.2' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:6ac27d4655cabe8b12193a8ae3378efa3c6d0537 not-for-merge tag 'spl-0.6.3' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:5aafddabeae97971e05bd6f592e19c04857bf4f2 not-for-merge tag 'spl-0.6.4' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/FETCH_HEAD:5882cac9e11b686172a1e396888aa867c48eb0fc not-for-merge tag 'spl-0.6.5' of https://github.com/zfsonlinux/spl
/usr/portage/distfiles/git3-src/zfsonlinux_spl.git/HEAD:ref: refs/heads/master
/usr/portage/distfiles/git3-src/zfsonlinux_zfs-images.git/FETCH_HEAD:3331601f6dc50ef2c9779c1656218701b48b276c branch 'master' of https://github.com/zfsonlinux/zfs-images
/usr/portage/distfiles/git3-src/zfsonlinux_zfs-images.git/FETCH_HEAD:3331601f6dc50ef2c9779c1656218701b48b276c https://github.com/zfsonlinux/zfs-images
/usr/portage/distfiles/git3-src/zfsonlinux_zfs-images.git/HEAD:ref: refs/heads/master
/usr/portage/distfiles/git3-src/zfsonlinux_zfs.git/FETCH_HEAD:bc2d809387debb95d82f47185d446f328da4d147 https://github.com/zfsonlinux/zfs
/usr/portage/distfiles/git3-src/zfsonlinux_zfs.git/HEAD:ref: refs/heads/master
localhost ~ # modinfo zfs | head
filename: /lib/modules/4.4.6-gentoo/extra/zfs/zfs.ko
version: 0.6.5-281_gbc2d809
license: CDDL
author: OpenZFS on Linux
description: ZFS
srcversion: 7B91AC3C57C8823BFB45845
depends: spl,znvpair,zunicode,zcommon,zavl
vermagic: 4.4.6-gentoo SMP mod_unload modversions
parm: zvol_inhibit_dev:Do not create zvol device nodes (uint)
parm: zvol_major:Major number for zvol device (uint)
localhost ~ # modinfo spl | head
filename: /lib/modules/4.4.6-gentoo/extra/spl/spl.ko
version: 0.6.5-54_g39cd90e
license: GPL
author: OpenZFS on Linux
description: Solaris Porting Layer
srcversion: F46FA65506ED2842A5834D8
depends: zlib_deflate
vermagic: 4.4.6-gentoo SMP mod_unload modversions
parm: spl_hostid:The system hostid. (ulong)
parm: spl_hostid_path:The system hostid file (/etc/hostid) (charp)
localhost ~ # ls -l /lib/modules/4.4.6-gentoo/extra/zfs/zfs.ko
-rw-r--r-- 1 root root 1704200 May 18 23:24 /lib/modules/4.4.6-gentoo/extra/zfs/zfs.ko
localhost ~ # strings /lib/modules/4.4.6-gentoo/extra/zfs/zfs.ko | grep 9999
/var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/module/zfs/arc.c
/var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/module/zfs/bplist.c
/var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/module/zfs/bpobj.c
/var/tmp/portage/sys-fs/zfs-kmod-9999/work/zfs-kmod-9999/module/zfs/dbuf.c
[etc. etc. It's the current package.]
Oh hey guess what! I just got this crash again with a drive from my old Solaris box. I know for certain that there is no missing VDEV this time; I have all of the drives plugged in.
At the console:
localhost ~ # zpool import -o readonly=on unmirrored
PANIC: blkptr at ffff880054b92048 DVA 1 has invalid VDEV 1
In the log:
[73663.722473] PANIC: blkptr at ffff880054b92048 DVA 1 has invalid VDEV 1
[73663.722576] Showing stack for process 16033
[73663.722578] CPU: 1 PID: 16033 Comm: zpool Tainted: P O 4.4.6-gentoo #1
[73663.722578] Hardware name: Supermicro X10SLL-F/X10SLL-SF/X10SLL-F/X10SLL-SF, BIOS 1.0a 06/11/2013
[73663.722580] 0000000000000000 ffff88043e437798 ffffffff8130feed 0000000000000003
[73663.722581] 0000000000000001 ffff88043e4377a8 ffffffffa0a9a239 ffff88043e4378d8
[73663.722583] ffffffffa0a9a2d1 ffff88043e437818 ffffffff814cf51c 61207274706b6c62
[73663.722584] Call Trace:
[73663.722589] [<ffffffff8130feed>] dump_stack+0x4d/0x64
[73663.722594] [<ffffffffa0a9a239>] spl_dumpstack+0x3d/0x3f [spl]
[73663.722596] [<ffffffffa0a9a2d1>] vcmn_err+0x96/0xd3 [spl]
[73663.722598] [<ffffffff814cf51c>] ? __schedule+0x5fe/0x744
[73663.722600] [<ffffffff814cf75d>] ? schedule+0x72/0x81
[73663.722602] [<ffffffff814d2048>] ? schedule_timeout+0x24/0x15b
[73663.722605] [<ffffffff8105b4a0>] ? __sched_setscheduler+0x56a/0x73f
[73663.722608] [<ffffffff810e9ac7>] ? cache_alloc_refill+0x69/0x4a3
[73663.722623] [<ffffffffa1000936>] zfs_panic_recover+0x4d/0x4f [zfs]
[73663.722628] [<ffffffffa0fb18ca>] ? arc_space_return+0x13e9/0x24ba [zfs]
[73663.722636] [<ffffffffa103eb5a>] zfs_blkptr_verify+0x248/0x2a1 [zfs]
[73663.722643] [<ffffffffa103ebe7>] zio_read+0x34/0x147 [zfs]
[73663.722648] [<ffffffffa0fb18ca>] ? arc_space_return+0x13e9/0x24ba [zfs]
[73663.722653] [<ffffffffa0fb3731>] arc_read+0x817/0x85d [zfs]
[73663.722661] [<ffffffffa0fc372f>] dmu_objset_open_impl+0xf0/0x65d [zfs]
[73663.722663] [<ffffffff810685cc>] ? add_wait_queue+0x44/0x44
[73663.722673] [<ffffffffa0fdead8>] dsl_pool_init+0x2d/0x50 [zfs]
[73663.722685] [<ffffffffa0ff823f>] spa_vdev_remove+0xb5d/0x2220 [zfs]
[73663.722687] [<ffffffffa0a98e9a>] ? taskq_create+0x30e/0x6a5 [spl]
[73663.722688] [<ffffffff810685cc>] ? add_wait_queue+0x44/0x44
[73663.722691] [<ffffffffa0aac7d2>] ? nvlist_remove_nvpair+0x2a8/0x314 [znvpair]
[73663.722693] [<ffffffffa0ac0e68>] ? zpool_get_rewind_policy+0x116/0x13c [zcommon]
[73663.722704] [<ffffffffa0ff93ee>] spa_vdev_remove+0x1d0c/0x2220 [zfs]
[73663.722706] [<ffffffffa0ac0dda>] ? zpool_get_rewind_policy+0x88/0x13c [zcommon]
[73663.722717] [<ffffffffa0ff9f04>] spa_import+0x191/0x640 [zfs]
[73663.722726] [<ffffffffa10215cc>] zfs_secpolicy_smb_acl+0x1485/0x42ac [zfs]
[73663.722734] [<ffffffffa102609f>] pool_status_check+0x3b4/0x48f [zfs]
[73663.722736] [<ffffffff810fca17>] do_vfs_ioctl+0x3f5/0x43d
[73663.722737] [<ffffffff810fca98>] SyS_ioctl+0x39/0x61
[73663.722739] [<ffffffff814d2b57>] entry_SYSCALL_64_fastpath+0x12/0x6a
Yep, it's blocked...
root 16033 0.0 0.0 47064 4592 tty4 D+ 01:47 0:00 zpool import -o readonly=on unmirrored
Here's the disk label:
localhost ~ # zdb -l /dev/sdl2
--------------------------------------------
LABEL 0
--------------------------------------------
version: 2
name: 'unmirrored'
state: 0
txg: 9297562
pool_guid: 17787833881665718298
top_guid: 13027374485316129798
guid: 13027374485316129798
vdev_tree:
type: 'disk'
id: 0
guid: 13027374485316129798
path: '/dev/dsk/c1d0p2'
devid: 'id1,cmdk@AST3750640AS=____________3QD06J7R/s'
whole_disk: 0
metaslab_array: 13
metaslab_shift: 32
ashift: 9
asize: 674616049664
DTL: 184
[LABEL 1, LABEL 2, and LABEL 3 are identical to LABEL 0.]
localhost ~ # hdparm -i /dev/sdl
/dev/sdl:
Model=ST3750640AS, FwRev=3.AAC, SerialNo=3QD06J7R
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
BuffType=unknown, BuffSize=16384kB, MaxMultSect=16, MultSect=off
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=1465149168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=enabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7
* signifies the current active mode
This disk was the boot drive on my old Solaris 10 box... SunOS 5.10 Generic 118855-14 May 2006 Solaris 10 6/06 s10x_u2wos_09a X86
Jul 1 01:41:18 localhost kernel: scsi 1:0:0:0: Direct-Access ATA ST3750640AS C PQ: 0 ANSI: 5
Jul 1 01:41:18 localhost kernel: sd 1:0:0:0: [sdl] 1465149168 512-byte logical blocks: (750 GB/699 GiB)
Jul 1 01:41:18 localhost kernel: sd 1:0:0:0: Attached scsi generic sg12 type 0
Jul 1 01:41:18 localhost kernel: sd 1:0:0:0: [sdl] Write Protect is off
Jul 1 01:41:18 localhost kernel: sd 1:0:0:0: [sdl] Mode Sense: 00 3a 00 00
Jul 1 01:41:18 localhost kernel: sd 1:0:0:0: [sdl] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jul 1 01:41:18 localhost kernel: sdl: sdl1 sdl2
Jul 1 01:41:18 localhost kernel: sdl1: <solaris: [s0] sdl5 [s1] sdl6 [s2] sdl7 [s7] sdl8 [s8] sdl9 [s9] sdl10 >
Jul 1 01:41:18 localhost kernel: sd 1:0:0:0: [sdl] Attached SCSI disk
localhost ~ # file -s /dev/sdl5
/dev/sdl5: Unix Fast File system [v1] (little-endian), last mounted on /, last written at Fri Aug 28 18:45:31 2009, clean flag 253, number of blocks 5092605, number of data blocks 5015211, number of cylinder groups 104, block size 8192, fragment size 1024, minimum percentage of free blocks 1, rotational delay 0ms, disk rotational speed 60rps, TIME optimization
localhost ~ # mount -r /dev/sdl5 /mnt/temp/
localhost ~ # ls -a /mnt/temp
. .TTauthority .bash_history .dtprofile .gconfd .iiim .softwareupdate .sunw bin cdrom dev etc format.dat kernel lost+found net opt proc system unmirrored var
.. .Xauthority .dt .gconf .gstreamer-0.8 .smc.properties .ssh TT_DB boot data devices export home lib mnt noautoshutdown platform sbin tmp usr vol
@JuliaVixen ZFS is detecting a DVA (data virtual address) which appears to be damaged, because it refers to a vdev which can't exist according to the pool configuration. Since the drive is from a very old Solaris system and ZFS version (pool version 2!), and because it's not the primary DVA, I suspect it's just not being detected on the old setup. You could try setting the zfs_recover=1 module option to make the error non-fatal and importing the pool read-only.
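For anyone landing here later, a minimal sketch of the two usual ways to set that option (standard ZoL parameter paths; double-check against zfs-module-parameters(5) for your version):
# At runtime, with the zfs module already loaded:
echo 1 > /sys/module/zfs/parameters/zfs_recover
# Or persistently, applied whenever the module loads:
echo "options zfs zfs_recover=1" >> /etc/modprobe.d/zfs.conf
# Then retry the read-only import:
zpool import -o readonly=on unmirrored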
I haven't tried the zfs_recover=1 option yet. (Sorry, been busy with other stuff.) But while I had this FreeBSD 11.0RC3 system up, I figured I'd try importing this drive on that, just to see what happens....
Well... FreeBSD has a kernel panic too.
panic: Solaris(panic): blkptr at 0xfffff80116bcf848 DVA 1 has invalid VDEV 1
cpuid=8
KDB: stack backtrace:
[stuff...]
zfs_blkptr_verify() appears to be where this panic gets thrown.
Ok, I booted with zfs.zfs_recover=1 and typed zpool import in a terminal... No crash yet...
localhost ~ # zpool import
pool: unmirrored
id: 17787833881665718298
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://zfsonlinux.org/msg/ZFS-8000-6X
config:
unmirrored UNAVAIL missing device
sdg2 ONLINE
Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.
Then I typed zpool import -o readonly=on unmirrored, and that caused a kernel panic...
[ 75.602065] SPL: using hostid 0x00000000
[ 103.700162] WARNING: blkptr at ffff880802306048 DVA 1 has invalid VDEV 1
[ 103.715520] WARNING: blkptr at ffff880802880c40 DVA 1 has invalid VDEV 1
[ 103.715771] WARNING: blkptr at ffff8808027e0000 DVA 1 has invalid VDEV 1
[ 103.716003] WARNING: blkptr at ffff8808027d4240 DVA 0 has invalid VDEV 1
[ 103.746942] WARNING: blkptr at ffff8808027d5640 DVA 0 has invalid VDEV 1
[ 103.766755] WARNING: blkptr at ffff8808027e0080 DVA 1 has invalid VDEV 1
[ 103.785574] WARNING: blkptr at ffff8808027e0100 DVA 1 has invalid VDEV 1
[ 103.785633] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.786688] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.786741] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.828517] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.828578] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.828621] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.867807] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.867871] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.867914] WARNING: blkptr at ffff8808023436f8 DVA 1 has invalid VDEV 1
[ 103.889641] VERIFY3(rvd->vdev_children == mrvd->vdev_children) failed (1 == 2)
[ 103.889728] PANIC at spa.c:1717:spa_config_valid()
[ 103.889804] Showing stack for process 5396
[ 103.889806] CPU: 2 PID: 5396 Comm: zpool Tainted: P O 4.6.7B #6
[ 103.889807] Hardware name: Supermicro X10SLL-F/X10SLL-SF/X10SLL-F/X10SLL-SF, BIOS 1.0a 06/11/2013
[ 103.889808] 0000000000000000 ffff8808023439e8 ffffffff813cc8d6 ffffffffc1359822
[ 103.889810] ffffffffc13597f4 ffff8808023439f8 ffffffffc11469e5 ffff880802343b98
[ 103.889812] ffffffffc1146bda 0000000000000000 ffff880801036000 ffff880000000030
[ 103.889814] Call Trace:
[ 103.889819] [<ffffffff813cc8d6>] dump_stack+0x4d/0x63
[ 103.889828] [<ffffffffc11469e5>] spl_dumpstack+0x3d/0x3f [spl]
[ 103.889832] [<ffffffffc1146bda>] spl_panic+0xb8/0xf6 [spl]
[ 103.889872] [<ffffffffc126577b>] ? spa_config_parse+0x22/0x100 [zfs]
[ 103.889878] [<ffffffffc1173ae8>] ? nvlist_lookup_common+0x6b/0x8d [znvpair]
[ 103.889909] [<ffffffffc126a8c1>] spa_load+0x1386/0x1c7a [zfs]
[ 103.889914] [<ffffffffc11960d8>] ? zpool_get_rewind_policy+0x116/0x13c [zcommon]
[ 103.889946] [<ffffffffc126b21e>] spa_load_best+0x69/0x251 [zfs]
[ 103.889949] [<ffffffffc119604a>] ? zpool_get_rewind_policy+0x88/0x13c [zcommon]
[ 103.889982] [<ffffffffc126bdf6>] spa_import+0x19f/0x653 [zfs]
[ 103.890019] [<ffffffffc12a5827>] zfs_ioc_pool_import+0xaf/0xec [zfs]
[ 103.890056] [<ffffffffc12ab0d9>] zfsdev_ioctl+0x40e/0x521 [zfs]
[ 103.890059] [<ffffffff811547d4>] vfs_ioctl+0x1c/0x2f
[ 103.890060] [<ffffffff81154e46>] do_vfs_ioctl+0x5cb/0x60e
[ 103.890061] [<ffffffff81154ec2>] SyS_ioctl+0x39/0x61
[ 103.890064] [<ffffffff81603a5f>] entry_SYSCALL_64_fastpath+0x17/0x93
While clearing out the rest of the stuff from my garage, I found the other half of the backup pool. Plugging both drives in, I can zpool import without error; everything is working fine....
So, I guess to reproduce this bug, I should remove a disk. Anyway, while I have it working, here's a dump of what a "working" configuration looks like...
If I just do zpool import, it will only see /dev/sdi and ignore the other drives, for some unknown reason. If I do zpool import -d tmp_devs, then it will see all the drives in the pool.
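That part at least is expected behavior: without -d, zpool import scans only the default device directory, and which /dev nodes it picks up can be hit-or-miss; -d restricts the scan to the directory you give it. A commonly suggested variant (assuming by-id symlinks exist for these drives) is:
# Scan the stable by-id symlinks instead of bare /dev nodes:
zpool import -d /dev/disk/by-id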
[The partitions are left over from whatever this drive was used for before; they're not really valid partitions, but I didn't zero out the disk before I used it in this ZFS pool.]
localhost ~ # mkdir foo
localhost ~ # cp -avi /dev/sdi* /dev/sdj* foo/
'/dev/sdi' -> 'foo/sdi'
'/dev/sdj' -> 'foo/sdj'
'/dev/sdj1' -> 'foo/sdj1'
'/dev/sdj2' -> 'foo/sdj2'
'/dev/sdj3' -> 'foo/sdj3'
'/dev/sdj4' -> 'foo/sdj4'
localhost ~ # zpool import -d foo
pool: backup
id: 3472166890449163768
state: DEGRADED
status: One or more devices contains corrupted data.
action: The pool can be imported despite missing or damaged devices. The
fault tolerance of the pool may be compromised if imported.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
config:
backup DEGRADED
mirror-0 DEGRADED
sdj ONLINE
15917724193338767979 UNAVAIL
mirror-1 DEGRADED
sdi ONLINE
2036074377517197082 UNAVAIL
localhost ~ # zpool import -d foo -o readonly=on backup
localhost ~ # zpool status
pool: backup
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: none requested
config:
NAME STATE READ WRITE CKSUM
backup DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
sdj ONLINE 0 0 0
15917724193338767979 UNAVAIL 0 0 0 was /dev/ada4
mirror-1 DEGRADED 0 0 0
sdi ONLINE 0 0 0
2036074377517197082 UNAVAIL 0 0 0 was /dev/ada3
errors: No known data errors
localhost ~ # ls -l /backup
total 3
drwxr-xr-x 18 root root 57 Dec 29 2009 linux1
localhost ~ # zdb -l /dev/sdi
--------------------------------------------
LABEL 0
--------------------------------------------
version: 22
name: 'backup'
state: 1
txg: 9359
pool_guid: 3472166890449163768
hostid: 12756365
hostname: 'hardy-core_installcd'
top_guid: 10964104056227665195
guid: 2683414997155590814
vdev_children: 2
vdev_tree:
type: 'mirror'
id: 1
guid: 10964104056227665195
metaslab_array: 27
metaslab_shift: 32
ashift: 9
asize: 750151532544
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 2683414997155590814
path: '/dev/dsk/c2t0d0p0'
devid: 'id1,sd@AST3750640AS=____________3QD0N0Q6/q'
phys_path: '/pci@0,0/pci1043,8239@5,1/disk@0,0:q'
whole_disk: 0
children[1]:
type: 'disk'
id: 1
guid: 2036074377517197082
path: '/dev/ada3'
whole_disk: 0
not_present: 1
DTL: 31
[etc.]
localhost ~ # zdb -l /dev/sdj
--------------------------------------------
LABEL 0
--------------------------------------------
version: 22
name: 'backup'
state: 1
txg: 9359
pool_guid: 3472166890449163768
hostid: 12756365
hostname: 'hardy-core_installcd'
top_guid: 7962344807192492976
guid: 18134448193074441749
vdev_children: 2
vdev_tree:
type: 'mirror'
id: 0
guid: 7962344807192492976
metaslab_array: 23
metaslab_shift: 31
ashift: 9
asize: 400083648512
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 18134448193074441749
path: '/dev/dsk/c3t0d0p0'
devid: 'id1,sd@AWDC_WD4000YR-01PLB0=_____WD-WMAMY1580352/q'
phys_path: '/pci@0,0/pci1043,8239@5,2/disk@0,0:q'
whole_disk: 0
children[1]:
type: 'disk'
id: 1
guid: 15917724193338767979
path: '/dev/ada4'
whole_disk: 0
not_present: 1
DTL: 29
[etc.]
Nothing in dmesg...
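As an aside, on ZoL there's also an internal event ring that sometimes records label and vdev complaints which never reach dmesg (assuming the build in use supports it):
# Verbose dump of the ZFS event log:
zpool events -v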
Oh hey, I found another drive, I can import it too!
localhost ~ # zpool export backup
localhost ~ # cp -avi /dev/sdh
sdh sdh1 sdh10 sdh2 sdh3 sdh4 sdh5 sdh6 sdh7 sdh8 sdh9
localhost ~ # cp -avi /dev/sdh* foo/
'/dev/sdh' -> 'foo/sdh'
'/dev/sdh1' -> 'foo/sdh1'
'/dev/sdh10' -> 'foo/sdh10'
'/dev/sdh2' -> 'foo/sdh2'
'/dev/sdh3' -> 'foo/sdh3'
'/dev/sdh4' -> 'foo/sdh4'
'/dev/sdh5' -> 'foo/sdh5'
'/dev/sdh6' -> 'foo/sdh6'
'/dev/sdh7' -> 'foo/sdh7'
'/dev/sdh8' -> 'foo/sdh8'
'/dev/sdh9' -> 'foo/sdh9'
localhost ~ # zpool import -d foo
pool: backup
id: 3472166890449163768
state: DEGRADED
status: One or more devices contains corrupted data.
action: The pool can be imported despite missing or damaged devices. The
fault tolerance of the pool may be compromised if imported.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
config:
backup DEGRADED
mirror-0 ONLINE
sdj ONLINE
sdh ONLINE
mirror-1 DEGRADED
sdi ONLINE
2036074377517197082 UNAVAIL
localhost ~ # zpool import -d foo -o readonly=on backup
localhost ~ # zpool status
pool: backup
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://zfsonlinux.org/msg/ZFS-8000-4J
scan: none requested
config:
NAME STATE READ WRITE CKSUM
backup DEGRADED 0 0 0
mirror-0 ONLINE 0 0 0
sdj ONLINE 0 0 0
sdh ONLINE 0 0 1
mirror-1 DEGRADED 0 0 0
sdi ONLINE 0 0 0
2036074377517197082 UNAVAIL 0 0 0 was /dev/ada3
errors: No known data errors
localhost ~ # zdb -l /dev/sdh
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
version: 13
name: 'backup'
state: 1
txg: 16
pool_guid: 3472166890449163768
hostid: 2558674546
hostname: 'vulpis'
top_guid: 7962344807192492976
guid: 15917724193338767979
vdev_tree:
type: 'mirror'
id: 0
guid: 7962344807192492976
metaslab_array: 23
metaslab_shift: 31
ashift: 9
asize: 400083648512
is_log: 0
children[0]:
type: 'disk'
id: 0
guid: 18134448193074441749
path: '/dev/ad16'
whole_disk: 0
children[1]:
type: 'disk'
id: 1
guid: 15917724193338767979
path: '/dev/ada4'
whole_disk: 0
I don't know why LABEL 0 fails to unpack; labels 1, 2, and 3 are still unpackable.
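If you want to poke at why label 0 is bad: each of the four labels is 256 KiB (two at the front of the device, two at the end), with the nvlist config starting 16 KiB into the label and running 112 KiB, followed by the uberblock ring. Those offsets are from my reading of the on-disk format, so treat them as assumptions. A raw peek at label 0's nvlist region:
# Dump the nvlist region of label 0 (offset 16 KiB, length 112 KiB) and look
# for recognizable config text; garbage here would explain the unpack failure.
dd if=/dev/sdh bs=1K skip=16 count=112 2>/dev/null | strings | head -n 20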
Anyway, so I guess I'll pull /dev/sdi and see what happens....
Removed one drive.... nothing exciting happened.... But then I removed the other drive, leaving only the drive I had tried to import [GUID: 15917724193338767979] back in May, when I opened this bug report... And kernel panic!
First the boring part:
localhost ~ # rm -v foo/sdi
removed 'foo/sdi'
localhost ~ # zpool import -d foo
pool: backup
id: 3472166890449163768
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://zfsonlinux.org/msg/ZFS-8000-6X
config:
backup UNAVAIL missing device
mirror-0 ONLINE
sdj ONLINE
sdh ONLINE
Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.
localhost ~ # zpool import -d foo -o readonly=on backup
cannot import 'backup': one or more devices is currently unavailable
localhost ~ # zpool status
no pools available
And then this triggers the kernel panic....
localhost ~ # rm -v foo/sdj*
removed 'foo/sdj'
removed 'foo/sdj1'
removed 'foo/sdj2'
removed 'foo/sdj3'
removed 'foo/sdj4'
localhost ~ # zpool import -d foo
pool: backup
id: 3472166890449163768
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://zfsonlinux.org/msg/ZFS-8000-6X
config:
backup UNAVAIL missing device
mirror-0 DEGRADED
ad16 UNAVAIL
sdh ONLINE
Additional devices are known to be part of this pool, though their
exact configuration cannot be determined.
localhost ~ # zpool import -d foo -o readonly=on backup
And then it blocks forever... So here's the stack dump...
[ 2166.289386] WARNING: blkptr at ffff88080340a048 DVA 1 has invalid VDEV 1
[ 2166.296987] WARNING: blkptr at ffff8800c5f94440 DVA 1 has invalid VDEV 1
[ 2166.306732] WARNING: blkptr at ffff880803172840 DVA 1 has invalid VDEV 1
[ 2166.312851] VERIFY3(rvd->vdev_children == mrvd->vdev_children) failed (1 == 2)
[ 2166.312895] PANIC at spa.c:1717:spa_config_valid()
[ 2166.312958] Showing stack for process 8978
[ 2166.312960] CPU: 3 PID: 8978 Comm: zpool Tainted: P O 4.6.7B #6
[ 2166.312961] Hardware name: Supermicro X10SLL-F/X10SLL-SF/X10SLL-F/X10SLL-SF, BIOS 1.0a 06/11/2013
[ 2166.312962] 0000000000000000 ffff880803a1b9e8 ffffffff813cc8d6 ffffffffc1478822
[ 2166.312965] ffffffffc14787f4 ffff880803a1b9f8 ffffffffc12659e5 ffff880803a1bb98
[ 2166.312967] ffffffffc1265bda 0000000000000000 0000000000000001 ffff880800000030
[ 2166.312969] Call Trace:
[ 2166.312973] [<ffffffff813cc8d6>] dump_stack+0x4d/0x63
[ 2166.312982] [<ffffffffc12659e5>] spl_dumpstack+0x3d/0x3f [spl]
[ 2166.312986] [<ffffffffc1265bda>] spl_panic+0xb8/0xf6 [spl]
[ 2166.313021] [<ffffffffc13847f2>] ? spa_config_parse+0x99/0x100 [zfs]
[ 2166.313050] [<ffffffffc13898c1>] spa_load+0x1386/0x1c7a [zfs]
[ 2166.313055] [<ffffffffc12b50d8>] ? zpool_get_rewind_policy+0x116/0x13c [zcommon]
[ 2166.313083] [<ffffffffc138a21e>] spa_load_best+0x69/0x251 [zfs]
[ 2166.313085] [<ffffffffc12b504a>] ? zpool_get_rewind_policy+0x88/0x13c [zcommon]
[ 2166.313113] [<ffffffffc138adf6>] spa_import+0x19f/0x653 [zfs]
[ 2166.313146] [<ffffffffc13c4827>] zfs_ioc_pool_import+0xaf/0xec [zfs]
[ 2166.313181] [<ffffffffc13ca0d9>] zfsdev_ioctl+0x40e/0x521 [zfs]
[ 2166.313184] [<ffffffff811547d4>] vfs_ioctl+0x1c/0x2f
[ 2166.313185] [<ffffffff81154e46>] do_vfs_ioctl+0x5cb/0x60e
[ 2166.313188] [<ffffffff8103f21a>] ? __do_page_fault+0x35f/0x4b5
[ 2166.313189] [<ffffffff81154ec2>] SyS_ioctl+0x39/0x61
[ 2166.313191] [<ffffffff81603a5f>] entry_SYSCALL_64_fastpath+0x17/0x93
So, I found an old hard drive from around 2009, which was apparently half of a mirror; I plugged it in and tried to import the pool. Then I got a kernel panic...
...And then the zpool process blocks on I/O forever (in the D state).
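When it wedges like that, it's worth recording where in the kernel the process is stuck before rebooting; a sketch, using the PID from the ps output earlier in the thread (/proc/<pid>/stack assumes a kernel built with stack tracing):
# List uninterruptible (D-state) processes and their kernel wait channel:
ps -eo pid,stat,wchan:32,args | awk 'NR==1 || $2 ~ /^D/'
# Kernel stack of the stuck zpool process:
cat /proc/16033/stack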