Addressing the issue described in the subject: check out the spa_load_verify_data, spa_load_verify_metadata and spa_load_verify_maxinflight module parameters. The "extreme" rewind imports will, by default, try to traverse the pool, which may take a very long time.
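For illustration, on Linux something along these lines skips the expensive data traversal while keeping the metadata check (the txg and pool names are placeholders):
# don't traverse file data during the rewind import
echo 0 > /sys/module/zfs/parameters/spa_load_verify_data
# keep verifying metadata (set to 0 to skip that as well)
echo 1 > /sys/module/zfs/parameters/spa_load_verify_metadata
# spa_load_verify_maxinflight limits concurrent verification I/Os if a traversal does run
zpool import -T <txg> <pool>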
Thanks!
I'm not sure that's a super-useful default. I appreciate the nod towards safety in theory, but the actual result was to render it useless, and drive us to using an extremely-dodgy Python script that uses dd to zero out filesystem blocks (!).
So maybe file that under "unintended consequences".
At the very least I would change "zpool import -T" -- if it sees that any of the spa_load_verify_* parameters are non-zero -- to warn that it will be doing a glacial pool traversal, and direct people towards those parameters if they would like to recover their filesystem before the heat death of the universe.
But at least now we know it's there. Much appreciated!
The zpool import -T documentation could certainly use some clarification and/or cross-referencing. I guess I was sufficiently happy to get those tunables documented that I overlooked the fact that they really ought to be referenced in the main zpool documentation.
@dweeezil thanks for reminding me about these tunings, they had entirely slipped my mind.
if they would like to recover their filesystem before the heat death of the universe.
@phik maybe even worse than that. The spa_load_verify() logic currently abandons any TXG which has even a single unrecoverable error found in the metadata. You would need to disable the meta checks entirely to recover a pool in this state.
So maybe file that under "unintended consequences".
Indeed. These default values could end up making the -X and -T recovery options almost useless. That is, unless you're importing a relatively small pool. That might have been a reasonable assumption when this code was originally written, but now I've gotten the impression they're doing more harm than good.
@phik @dweeezil what do you think of the following proposal:
1. Disable the data verification by default. For most pools this should represent the bulk of the time required to check a given TXG. And my sense is most/all administrators resorting to these options would be thrilled to be able to quickly recover the pool at the possible expense of some lost files.
2. Enable metadata verification by default. Checking the metadata should be reasonably fast since it's typically less than 1% of the pool data (with the possible exception of Lustre MDS pools). Leaving this enabled also gives us a reasonable assessment of how intact a given TXG is.
3. Retire the spa_load_verify_* module options and move this functionality into the zpool import command itself. This has the advantage of making the options far more discoverable when you need them. And it gives us a way to specify the number of data and metadata errors we're willing to tolerate when importing the pool. This functionality currently exists in the rewind policy, but there is no way to override the default values (0 metadata errors, unlimited data errors). We can overload these options to allow the checks to be disabled entirely.
Thanks, Brian. My first reaction is that this sounds pretty sensible.
As it happens, this corruption was on a Lustre MDS pool, so it sounds like we'd possibly still need to take heroic action to recover in finite time -- but as you say, that may still be a reasonable default. So long as that behaviour and mitigation are discoverable, I don't think that's a problem.
@behlendorf That proposal sounds perfectly fine to me. When I first documented these tunables, I was certainly thinking that at the very least additional documentation ought to be added to the zpool import -T case (now that we actually document it), but changing the behavior as you suggested makes a whole lot more sense (plus the requisite additional documentation).
The upstream OpenZFS work in this area has been merged; once ported, we can tackle additional improvements.
I stumbled on this thread and issue #6414; I'm having an identical issue to @phik and have 4 x 8TB drives that are panicking when being mounted. I'm tempted to use the script mentioned (https://gist.github.com/jshoward/5685757), but it's indeed terrifying. Can you give some guidance on how exactly you used it, so I can roll back some transactions and mount the filesystem? Thanks in advance.
@behlendorf You guys are much more knowledgeable about ZFS than I am, so forgive me if what I'm saying doesn't make sense. Is there a way I can attempt import -T txg without doing a month-long read of the volume, like mentioned above? Or should I attempt to use the script to zero the latest transactions?
Morning. I was going to write some documentation on how to do this, but never got around to it.
What I suggest doing is "try before you buy". You can create a copy-on-write overlay of the disks, which will allow you to attempt the recovery... and if it's successful, commit it.
http://stackoverflow.com/questions/7582019/lvm-like-snapshot-on-a-normal-block-device
What you do is set the block devices up so that all writes go to another disk/file (so your original disk doesn't get modified), try rolling back with the script... and if it's successful, tear down the write-back overlay and repeat the rollback on the real disks/filesystem.
e.g. assuming sdb is the affected disk:
# sparse file to hold the copy-on-write data (~9.2 TiB of virtual size, allocated on demand)
dd if=/dev/zero bs=1048576 count=0 seek=9600000 of=tempfile-sdb
# attach the sparse file to the first free loop device (check which one with: losetup -a)
losetup -f tempfile-sdb
# create a dm snapshot: reads fall through to sdb, writes land on the loop device (p = persistent COW, chunk size 4 sectors)
echo 0 $(blockdev --getsz /dev/sdb) snapshot /dev/sdb /dev/loop0 p 4 | dmsetup create snapsdb
Now you will have /dev/mapper/snapsdb, which you can use to try and reconstruct your ZFS volume without modifying sdb.
Using this method, you can try rolling back "lots" of times... and try to get back to the most recent TXG that works.
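If you do this for every member disk, you can (I believe) point the import at the overlay devices and later dismantle everything without having touched the originals; roughly, with the names from the example above:
# import the pool using the copy-on-write overlays instead of the raw disks
zpool import -d /dev/mapper <pool>
# ...experiment, run the revert script against /dev/mapper/snapsdb, etc...
# when you're done, export and tear the overlay down
zpool export <pool>
dmsetup remove snapsdb
losetup -d /dev/loop0
rm tempfile-sdb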
It looks scary... but it worked well for @phik and my file system.
That seems like a great idea. Sorry, I don't mean to hijack this thread...
I tried creating a snapshot like you mentioned, but it claims that the particular device is busy. For instance:
root@ubuntu:~# echo 0 $(blockdev --getsz /dev/sdc) snapshot /dev/sdc /dev/loop0 p 4 | dmsetup create snapsdb
device-mapper: reload ioctl on snapsdb failed: Device or resource busy
Command failed
Here are my disks:
root@ubuntu:~# zpool import
pool: Storage
id: 8070972259976308683
state: ONLINE
status: The pool was last accessed by another system.
action: The pool can be imported using its name or numeric identifier and
the '-f' flag.
see: http://zfsonlinux.org/msg/ZFS-8000-EY
config:
        Storage     ONLINE
          raidz1-0  ONLINE
            sdc     ONLINE
            sdb     ONLINE
            sda     ONLINE
            sdd     ONLINE
root@ubuntu:~# zdb -AAA -F -e Storage
Configuration for import:
vdev_children: 1
version: 5000
pool_guid: 8070972259976308683
name: 'Storage'
state: 0
hostid: 272134401
hostname: 'freenas.local'
vdev_tree:
type: 'root'
id: 0
guid: 8070972259976308683
children[0]:
type: 'raidz'
id: 0
guid: 11679850201028184759
nparity: 1
metaslab_array: 35
metaslab_shift: 38
ashift: 12
asize: 31997643718656
is_log: 0
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 7655929562872552401
whole_disk: 1
DTL: 127
create_txg: 4
path: '/dev/sdc2'
children[1]:
type: 'disk'
id: 1
guid: 14007418575690057489
whole_disk: 1
DTL: 126
create_txg: 4
path: '/dev/sdb2'
children[2]:
type: 'disk'
id: 2
guid: 7167084592307323902
whole_disk: 1
DTL: 125
create_txg: 4
path: '/dev/sda2'
children[3]:
type: 'disk'
id: 3
guid: 1941913123584907009
whole_disk: 1
DTL: 124
create_txg: 4
path: '/dev/sdd2'
zdb: ../../module/zfs/zio.c:230: Assertion `c < (1ULL << 24) >> 9 (0x7fffffffffffff < 0x8000)' failed.
Aborted (core dumped)
This script worked flawlessly. I made a dd copy of all of my drives, ran this script (https://gist.github.com/jshoward/5685757) on each of the drives and reverted back to an old txg. Thanks!
Great news.
I'm now encountering this problem as well... I didn't notice that a package download for an update had filled my root drive, an Ubuntu 18.04 image running ZFS on root. Now no matter what I do I can't seem to import the pool to clear out the downloaded files and boot the server.
I've made copies of the disk and tried importing with the "-FX" option, which is currently just hanging indefinitely. I tried to import back to an earlier transaction, which works with the "-n" option, but when I try it for real it tells me one of the devices is busy...
And finally, I cloned everything to another new disk and tried the revert script, which also doesn't seem to work. It found the transactions, I zero'd out back to the earliest one I could... and it still won't import.
Any new ideas?
You have to use the zfs revert script I believe I linked to in my earlier post. It's the only thing that saved me. Good on you for making clones; run the script on those and roll back some transactions. It's not elegant but it saved my ass.
@mdunlap, I did run the script.... on multiple different clones of the drive. I ran it on some, didn't run it on others... it still didn't help. I have imports sitting basically idle for 8+ hours now and none of them have actually imported.
@mdunlap Can you comment on exactly how you used the script? It seems to require: 1) a blocksize and 2) total blocks. I know how to list uberblocks with
zdb -l <device> -u
which lists a bunch of uberblocks in nonchronological order. But I cannot locate their size, and neither do I know exactly what total blocks means. Can you help, please?
It’s been quite a while so I don’t remember exactly. I thought I posted on here how I did it.
I found the disk size with: blockdev --getsize64 /dev/sda
So, something like:
zfs_revert.py -bs=512 -tb=107374182400 /dev/sda
I ran this for each disk.
The -tb flag is for the txg or transaction you want to roll back to; I remember listing them somehow and rolling back to the most recent one, perhaps with something like
zdb -hh
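If it helps, blockdev can report all of the figures the script might want; my reading of -bs/-tb is sector size and total blocks, but I haven't checked that against the script's source, so verify before running:
blockdev --getss /dev/sda        # logical sector size, e.g. 512
blockdev --getsz /dev/sda        # device size in 512-byte sectors
blockdev --getsize64 /dev/sda    # device size in bytes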
On FreeBSD, sysctl vfs.zfs.spa.load_verify_metadata=0 helps zpool import -T txg.
See also https://www.delphix.com/blog/openzfs-pool-import-recovery
So after spending a lot of time trying to recover a corrupted ZFS pool (still in progress) using the -T option, here are my remarks.
1. The default of verifying every single block when doing -T is definitely a bad one. In my case I have a 24-disk RAIDZ2 and verification (as you can imagine) can take ages (talking 5+ days here).
2. Using sysctl vfs.zfs.spa.load_verify_metadata=0 as @jbeich commented earlier works, in that it lets you mount the pool at an earlier point in time when it's not corrupted, skipping verification of all of the data. However, it doesn't actually delete the later corrupted txgs (i.e. the ones after the -T point). This means that unless you do a scrub (which will take a huge amount of time, same issue as point 1), when the pool gets automatically remounted later on (I am using TrueNAS here) it comes back with the latest corrupted txg, so it's not actually solving anything. This essentially makes sysctl vfs.zfs.spa.load_verify_metadata=0 pointless, because you will either have to scrub or use the zfs_revert.py script.
Points 1 and 2 combined really make a terrible initial user experience when it comes to recovering a corrupted pool. While I understand that ZFS and its whole design enforce data integrity as the highest priority, there are cases where it ends up becoming ridiculous. In my case I wasted a week trying to recover a pool using the -X option; I didn't know at the time that it would take so long, because there is no progress bar and hence no indication of how long it would take (even a simple "WARNING: This command can take a long time because it needs to verify EVERY block on the pool" would have been great!).
Right now I am doing a zpool import -f -T <txg> <pool>, hoping that it will also clear all of the later corrupted txgs. If it doesn't, then I need to use the zfs_revert.py script (which I had to manually fix because the syntax of Python's print changed in the latest Python version). Honestly, the zfs_revert.py functionality should be integrated into the zpool import command, behind a flag that can be enabled alongside -T, which would cause zpool import to automatically clear any txg after the -T point if you don't verify the data using sysctl vfs.zfs.spa.load_verify_metadata=0 (and also if you do verify the metadata, if the current behavior doesn't do this already).
This would allow administrators of zpools to recover a corrupted pool easily using zpool import -f -TW <txg> <pool> (where W is this new proposed flag) along with sysctl vfs.zfs.spa.load_verify_metadata=0 (or better defaults), without having to spend days (or weeks) on really large pools.
In my case, I don't really mind losing the latest 1-2 txgs, because the corruption occurred due to a RAIDZ2 write hole with my HBA, so it's probably only the last few files which are corrupted (which I don't care about losing). I suspect that many other people in a similar position would likewise prefer an easy way to mount the pool using either the latest uncorrupted txg (if this is easy and fast to determine) or a specific one, and wipe any txgs afterwards, without having to read the entire pool to verify (taking weeks). Of course, all of this should be behind well-documented flags that also warn the user.
As a final point, I would also like to mention that verifying all of the data on the pool with -T puts a lot of wear and tear on the hard disks in the system, which is pointless in scenarios where it's not needed (i.e. the one I am experiencing).
I had to use the zfs_revert.py script mentioned above to do this.
Thanks @arjunkc. I have a quick question regarding the zfs_revert.py script: it requires a -bs (i.e. blocksize) command line argument. Earlier @mdunlap mentioned using the blockdev program to get this information, however it doesn't appear to exist on TrueNAS/FreeBSD (after some basic googling, apparently FreeBSD doesn't ship blockdev at all???).
Does anyone know how to get the -bs parameter for a /dev/ device on TrueNAS/FreeBSD, or is this an optional parameter that can be ignored?
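(Looking further, FreeBSD's diskinfo seems to report the equivalent figures, though I haven't confirmed they match what the script expects; the device name is just an example:)
diskinfo -v /dev/ada0    # prints sectorsize, mediasize in bytes and mediasize in sectors, among other fields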
Hi @mdedetrich, I'm possibly in the same situation as you. I'm using Proxmox VE and can't import my pool just now because of data corruption, possibly due to bad SATA cables. I'm now using zpool import -f -F -X poolname, but there is no progress. I don't know how long it will take...
What did you do eventually? Did you end up using zfs_revert.py? What were your steps?
So what I ended up doing was using zpool import -f -F -T <txg> poolname rather than zpool import -f -F -X poolname. This allows you to mount the pool at a point in time earlier than when the data corruption happened. I think the zfs_revert.py script also has a command line option to show the last few uncorrupted txgs, but I did not use the script to recover the pool (see https://www.reddit.com/r/zfs/comments/oer0z0/how_to_calculate_bs_argument_for_zfs_revertpy/h4qp3u8/?context=3 for more details).
After that I mounted the pool so it's possible to write to it, and then I wrote some dummy data as files using dd to create new txgs which will override the corrupted ones.
This ended up fixing the pool and I could mount it just fine afterwards.
but there is no progress. I don't know how long it will take...
Any of these commands will take ages since it forces ZFS to do a scrub to make sure the data is all correct. In my case it took like a week.
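Roughly, the whole sequence looked like the following (the txg, pool name and mountpoint are illustrative placeholders, and the dd sizes are arbitrary):
sysctl vfs.zfs.spa.load_verify_metadata=0      # skip the metadata traversal during import
zpool import -f -F -T <txg> <pool>             # rewind to the chosen txg
# write some throwaway files so new txgs are created and the damaged ones get superseded
dd if=/dev/zero of=/mnt/<pool>/dummy1.bin bs=1M count=1024
dd if=/dev/zero of=/mnt/<pool>/dummy2.bin bs=1M count=1024
rm /mnt/<pool>/dummy*.bin                      # reclaim the space afterwards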
Thank you @mdedetrich for the reply. After running zpool import -f -F -X poolname for a week I gave up. Then I finally recovered my zpool using zpool import -f -F -T <txg> poolname. As a noob I will document the steps I took, which I feel are not ideal... There should be better guidance on data recovery, like trying different txgs.
Firstly I used zdb -hh -e zfs52 to try to find the txg. The output was like this:
......
2021-08-26.09:07:12 zpool import -N -d /dev/disk/by-id -o cachefile=none zfs52
history command: 'zpool import -N -d /dev/disk/by-id -o cachefile=none zfs52'
history zone: 'linux'
history who: 0
history time: 1629940032
history hostname: 'pve52'
unrecognized record:
history internal str: 'func=2 mintxg=2675277 maxtxg=2675279'
internal_name: 'scan setup'
history txg: 2675284
history time: 1629940032
history hostname: 'pve52'
unrecognized record:
history internal str: 'errors=18'
internal_name: 'scan done'
history txg: 2675286
history time: 1629940032
history hostname: 'pve52'
unrecognized record:
ioctl: 'load-key'
in_nvl:
history zone: 'linux'
history who: 0
history time: 1629940086
history hostname: 'pve52'
2021-08-26.09:08:06 zfs load-key zfs52/enc
history command: 'zfs load-key zfs52/enc'
history zone: 'linux'
history who: 0
history time: 1629940086
history hostname: 'pve52'
unrecognized record:
history internal str: 'func=1 mintxg=0 maxtxg=2675332'
internal_name: 'scan setup'
history txg: 2675332
history time: 1629940214
history hostname: 'pve52'
2021-08-26.09:10:32 zpool scrub zfs52
history command: 'zpool scrub zfs52'
history zone: 'linux'
history who: 0
history time: 1629940232
history hostname: 'pve52'
Then I tried to import the pool at txg 2675332:
echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata   # skip metadata traversal during the rewind import
zpool import -f -F -T 2675332 zfs52                            # rewind to txg 2675332
zfs load-key zfs52/enc                                         # load the key for the encrypted dataset
zfs mount -a
Then it appeared to work, but I still got this error:
cannot iterate filesystems: I/O error
Then I wanted to import the pool at an even earlier txg, so I did this:
shutdown now
echo 0 > /sys/module/zfs/parameters/spa_load_verify_metadata
zpool import -f -F -T 2652743 zfs52
But I couldn't import the pool at txg 2652743; the error message said a disk was missing.
I assume I should first have tried importing in readonly mode with -o readonly=on, but I did not try. Anyway, the data on the disks is not that important...
Anyway, I had to re-import the pool like this and ignore that I/O error:
zpool import -f zfs52
zpool import -F zfs52
Then I did a scrub and used dd to write some dummy files (is it like this?):
zpool scrub zfs52
dd if=/dev/zero of=/zfs52/enc/dd1.bin bs=1M
dd if=/dev/zero of=/zfs52/enc/dd2.bin bs=1M
dd if=/dev/zero of=/zfs52/enc/dd3.bin bs=1M
After all this I tried to access the file zfs52/enc/vol/vm-115-disk-0, but it's inaccessible (it seems the zfs52/enc/vol dataset is missing). Anyway, most of the other data is back; I'm satisfied.
root@pve52:~# zpool status -v
pool: zfs52
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
scan: scrub repaired 8K in 1 days 00:48:01 with 8 errors on Mon Sep 13 02:48:13 2021
config:
NAME STATE READ WRITE CKSUM
zfs52 DEGRADED 0 0 0
raidz1-0 DEGRADED 0 0 0
ata-WDC_WD120EMFZ-11A6JA0_QGG3AB2T DEGRADED 0 0 22 too many errors
ata-WDC_WD120EMFZ-11A6JA0_QGGDS91T DEGRADED 0 0 17 too many errors
ata-WDC_WD120EMFZ-11A6JA0_QGGE07HT DEGRADED 0 0 22 too many errors
ata-WDC_WD120EMFZ-11A6JA0_QGGL1ZVT DEGRADED 0 0 22 too many errors
ata-WDC_WD120EMFZ-11A6JA0_QGH5V04T DEGRADED 0 0 18 too many errors
ata-WDC_WD120EMFZ-11A6JA0_X1G502KL DEGRADED 0 0 18 too many errors
ata-WDC_WD120EMFZ-11A6JA0_X1G6H9LL DEGRADED 0 0 14 too many errors
ata-WDC_WD120EMFZ-11A6JA0_X1G9TYHL DEGRADED 0 0 16 too many errors
ata-WDC_WD120EMFZ-11A6JA0_XHG0J1MD DEGRADED 0 0 21 too many errors
errors: Permanent errors have been detected in the following files:
zfs52/enc/dir:<0xf0583>
zfs52/enc/vol/vm-115-disk-0:<0x0>
zfs52/enc/vol/vm-112-disk-0:<0x0>
zfs52/enc/vol/subvol-103-disk-0:<0x0>
zfs52/enc/vol/vm-151-disk-0:<0x0>
zfs52/enc/vol/vm-151-disk-2:<0x0>
zfs52/enc/vol/vm-113-disk-0:<0x0>
zfs52/enc/vol/vm-116-disk-0:<0x0>
root@pve52:~# rm -r /zfs52/enc/vol/vm-115-disk-0
rm: cannot remove '/zfs52/enc/vol/vm-115-disk-0': No such file or directory
root@pve52:~# ls /zfs52/enc/
dir
root@pve52:~# zfs list
cannot iterate filesystems: I/O error
NAME USED AVAIL REFER MOUNTPOINT
zfs52 70.8T 13.8T 171K /zfs52
zfs52/enc 70.8T 13.8T 384K /zfs52/enc
zfs52/enc/dir 69.8T 13.8T 69.8T /zfs52/enc/dir
zfs52/enc/vol 992G 13.8T 441K /zfs52/enc/vol
zfs52/plain 171K 13.8T 171K /zfs52/plain
root@pve52:~#
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
In issue #6414, a bug caused the creation of an invalid block but with a valid checksum -- leading to a kernel panic on import.
zpool import -FX didn't work, hitting the same panic (which is the topic of #6496).
We also attempted zpool import -T txg, and while this didn't cause the panic, it also didn't really work. It started reading the (10 TB x 4) filesystem at about 2 MB/s, an effort that we gave up on after about 4 hours.
Manually zeroing the latest transactions (using the terrifying zfs_revert script at https://gist.github.com/jshoward/5685757) does work, however, and imports instantly.
So the fundamental idea of mounting this particular damaged filesystem from an earlier transaction is sound -- just need to figure out why import -T wants to do a 23-day read of the filesystem first.