Closed by nerdcorenet 3 years ago
This issue is also happening to me. Any process accessing the "trigger file" hangs there forever.
Message from syslogd@qaq-server at Oct 22 20:48:15 ...
kernel:[ 762.892839] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Message from syslogd@qaq-server at Oct 22 20:48:15 ...
kernel:[ 762.892844] PANIC at zfs_znode.c:335:zfs_znode_sa_init()
Additionally, here is my system spec:
zfs-0.8.3-1ubuntu12.4
zfs-kmod-0.8.3-1ubuntu12.4
Linux qaq-server 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
I just started hitting this on Ubuntu Hirsute (development release) in the last couple of days for some unclear reason. The stacks all show code related to SA, and for whatever reason it was happening with multiple Chrome/Electron apps trying to access the "Cache" dir specifically - but different instances of the cache dir in different paths (e.g. ~/.cache/google-chrome/Default/Cache and ~/.config/Mattermost/Cache). Those processes stay hung forever and I can't strace/gdb them, or even ls that same directory while the task is stuck, presumably due to a lock or similar.
I had zfs-dkms installed; I removed that and went back to the version built with the kernel in Ubuntu, and it's working OK, but that version is 0.8.4-1ubuntu11 whereas zfs-dkms was 0.8.4-1ubuntu16. They added quite a lot of patches in "ubuntu13" for Linux 5.9 compatibility as part of https://bugs.launchpad.net/bugs/1899826. However, given the other reporters were on stable versions, it seems more likely they may be seeing the same effect but possibly from a different cause.
Just reverting to the 0.8.4-1ubuntu11 code resolved it for me. I will try installing zfs-dkms of the same version to see if it happens there, in case it's some quirk of the DKMS build versus the build that happens in the Ubuntu kernel packages.
Happy to try debugging if anyone has suggestions on what to look at. I'm a reasonably competent programmer and debugger, very familiar with ZFS as an admin and with various internals, but not super familiar with the code-base as a whole. I can also try the native version and see whether it hits, or whether this is specific to the Ubuntu patches.
Also opened here: https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476
Note: I'm not really expecting support for the Ubuntu-patched version here; rather, this is the only Google hit for that error, so I wanted to contribute information here in case it helps others, and I'm happy to try debugging if that also helps.
Dec 2 12:36:42 optane kernel: [ 72.857033] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
Dec 2 12:36:42 optane kernel: [ 72.857036] PANIC at zfs_znode.c:335:zfs_znode_sa_init()
Dec 2 12:36:42 optane kernel: [ 72.857037] Showing stack for process 19744
Dec 2 12:36:42 optane kernel: [ 72.857038] CPU: 3 PID: 19744 Comm: ThreadPoolForeg Tainted: P OE 5.8.18-acso #1
Dec 2 12:36:42 optane kernel: [ 72.857039] Hardware name: Gigabyte Technology Co., Ltd. Z97X-Gaming G1 WIFI-BK/Z97X-Gaming G1 WIFI-BK, BIOS F8 09/19/2015
Dec 2 12:36:42 optane kernel: [ 72.857039] Call Trace:
Dec 2 12:36:42 optane kernel: [ 72.857044] dump_stack+0x74/0x95
Dec 2 12:36:42 optane kernel: [ 72.857053] spl_dumpstack+0x29/0x2b [spl]
Dec 2 12:36:42 optane kernel: [ 72.857057] spl_panic+0xd4/0xfc [spl]
Dec 2 12:36:42 optane kernel: [ 72.857101] ? sa_cache_constructor+0x27/0x50 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857103] ? _cond_resched+0x19/0x40
Dec 2 12:36:42 optane kernel: [ 72.857105] ? mutex_lock+0x12/0x40
Dec 2 12:36:42 optane kernel: [ 72.857129] ? dmu_buf_set_user_ie+0x54/0x80 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857167] zfs_znode_sa_init+0xe0/0xf0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857205] zfs_znode_alloc+0x101/0x700 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857229] ? arc_buf_fill+0x270/0xd30 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857232] ? __cv_init+0x42/0x60 [spl]
Dec 2 12:36:42 optane kernel: [ 72.857260] ? dnode_cons+0x28f/0x2a0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857262] ? _cond_resched+0x19/0x40
Dec 2 12:36:42 optane kernel: [ 72.857263] ? _cond_resched+0x19/0x40
Dec 2 12:36:42 optane kernel: [ 72.857264] ? mutex_lock+0x12/0x40
Dec 2 12:36:42 optane kernel: [ 72.857288] ? aggsum_add+0x153/0x170 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857292] ? spl_kmem_alloc_impl+0xd8/0x110 [spl]
Dec 2 12:36:42 optane kernel: [ 72.857316] ? arc_space_consume+0x54/0xe0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857341] ? dbuf_read+0x4a0/0xb50 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857342] ? _cond_resched+0x19/0x40
Dec 2 12:36:42 optane kernel: [ 72.857343] ? mutex_lock+0x12/0x40
Dec 2 12:36:42 optane kernel: [ 72.857372] ? dnode_rele_and_unlock+0x5a/0xc0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857373] ? _cond_resched+0x19/0x40
Dec 2 12:36:42 optane kernel: [ 72.857374] ? mutex_lock+0x12/0x40
Dec 2 12:36:42 optane kernel: [ 72.857400] ? dmu_object_info_from_dnode+0x84/0xb0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857433] zfs_zget+0x1c3/0x270 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857457] ? dmu_buf_rele+0x3a/0x40 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857493] zfs_dirent_lock+0x349/0x680 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857530] zfs_dirlook+0x90/0x2a0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857566] ? zfs_zaccess+0x10c/0x480 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857600] zfs_lookup+0x202/0x3b0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857635] zpl_lookup+0xca/0x1e0 [zfs]
Dec 2 12:36:42 optane kernel: [ 72.857639] path_openat+0x6a2/0xfe0
Dec 2 12:36:42 optane kernel: [ 72.857641] do_filp_open+0x9b/0x110
Dec 2 12:36:42 optane kernel: [ 72.857645] ? __check_object_size+0xdb/0x1b0
Dec 2 12:36:42 optane kernel: [ 72.857647] ? __alloc_fd+0x46/0x170
Dec 2 12:36:42 optane kernel: [ 72.857649] do_sys_openat2+0x217/0x2d0
Dec 2 12:36:42 optane kernel: [ 72.857650] ? do_sys_openat2+0x217/0x2d0
Dec 2 12:36:42 optane kernel: [ 72.857651] do_sys_open+0x59/0x80
Dec 2 12:36:42 optane kernel: [ 72.857652] __x64_sys_openat+0x20/0x30
Dec 2 12:36:42 optane kernel: [ 72.857654] do_syscall_64+0x48/0xc0
Dec 2 12:36:42 optane kernel: [ 72.857656] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Dec 2 12:36:42 optane kernel: [ 72.857657] RIP: 0033:0x7f9e3e7f62b4
Dec 2 12:36:42 optane kernel: [ 72.857659] Code: 24 20 eb 8f 66 90 44 89 54 24 0c e8 b6 f4 ff ff 44 8b 54 24 0c 44 89 e2 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 34 44 89 c7 89 44 24 0c e8 08 f5 ff ff 8b 44
Dec 2 12:36:42 optane kernel: [ 72.857659] RSP: 002b:00007f9e2a84aa10 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
Dec 2 12:36:42 optane kernel: [ 72.857661] RAX: ffffffffffffffda RBX: 00007f9e2a84b070 RCX: 00007f9e3e7f62b4
Dec 2 12:36:42 optane kernel: [ 72.857661] RDX: 0000000000000002 RSI: 0000239c0c6ddf00 RDI: 00000000ffffff9c
Dec 2 12:36:42 optane kernel: [ 72.857662] RBP: 0000239c0c6ddf00 R08: 0000000000000000 R09: 00007ffc92524080
Dec 2 12:36:42 optane kernel: [ 72.857662] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
Dec 2 12:36:42 optane kernel: [ 72.857663] R13: 00007f9e2a84b070 R14: 0000239c0d73c5c0 R15: 0000000000008061
Dec 2 12:36:42 optane kernel: [ 72.858063] VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
I have found the same problem. Going back to 0.8.4-1ubuntu11 fixes it for new files. If I move the old Chrome cache to a different name, the problem disappears, but if I try to remove it, list it, etc., there is some persistent corruption in the filesystem that triggers the panic.
I hit this problem again today, but now without zfs-dkms. After upgrading my kernel from 5.8.0-29-generic to 5.8.0-36-generic my Google Chrome Cache directory is broken again, had to rename it and then reboot to get out of the problem.
Curiously, I found a similar report from 2016(??) here: https://bbs.archlinux.org/viewtopic.php?id=217204
The renamed directories still exist if any developers have an idea about anything I can do to try and debug or understand the issue.
Having a similar problem. Same traceback, different files. Just started with the Ubuntu 5.8.0-36 kernel. Unfortunately, booting the old kernel doesn't seem to make the existing files accessible, either. I'm a bit worried and would love to help find the root cause and make sure I don't lose more data here.
@migrax when you say that rolling back "fixes it for new files", do you have a reliable way to reproduce this? I only found that this problem occurred with some files, but could not figure out which ones or why.
I had the same thing: at a certain package version the problem started happening. If you roll back to a kernel/package without the issue, existing files are still broken, but it stops creating new broken files. That's my experience too.
From my naive attempt to read through the code, I think something is getting corrupted on disk that then causes the PANIC() when trying to read a file. Once that panic happens, a lock is left held that stops other access to that file, and I suspect possibly some other unrelated files that share some resource. If you reboot, some files that seemed broken are sometimes accessible again, but the main problem file is still broken, and once you try to access that file it gets stuck on a lock that then blocks access to other things. But I might be wrong about the blocking of other things.
In the kernel trace you first see this PANIC():

VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
PANIC at zfs_znode.c:335:zfs_znode_sa_init()
And then some hung task reports later.
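If you want to see which tasks are wedged this way, one generic option (my own sketch, not something from this thread) is to scan /proc for processes in uninterruptible sleep (state D), which is where accessors of the broken file end up and why strace/gdb/kill have no effect on them:

```shell
#!/bin/sh
# List tasks in uninterruptible sleep (state D). Processes blocked on
# the lock left held after the PANIC sit here and ignore all signals.
for d in /proc/[0-9]*; do
    state=$(awk '/^State:/ {print $2}' "$d/status" 2>/dev/null)
    if [ "$state" = "D" ]; then
        printf '%s\t%s\n' "${d#/proc/}" "$(cat "$d/comm" 2>/dev/null)"
    fi
done
```

On a healthy system this usually prints nothing; after the panic, the hung Chrome/Electron threads should show up here.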
linux-image-5.8.0-29-generic: working
linux-image-5.8.0-36-generic: broken
When the issue first hit, I had zfs-dkms installed; I removed that and went back to the version built with the kernel in Ubuntu, and it worked OK. That version was 0.8.4-1ubuntu11 whereas zfs-dkms was 0.8.4-1ubuntu16.
Problem has now repeated as the 5.8.0-36-generic kernel has now picked up 0.8.4-1ubuntu16:

```
lathiat@optane ~/src/zfs[zfs-2.0-release]$ sudo modinfo /lib/modules/5.8.0-29-generic/kernel/zfs/zfs.ko | grep version
version:        0.8.4-1ubuntu11
srcversion:     75AFF98E9A918357B9D8C8D
lathiat@optane ~/src/zfs[zfs-2.0-release]$ sudo modinfo /lib/modules/5.8.0-36-generic/kernel/zfs/zfs.ko | grep version
version:        0.8.4-1ubuntu16
srcversion:     8A5C7E4F91E160085378C8C
```
I don't have a good quick/easy reproducer but just using my desktop for a day or two seems I am likely to hit the issue after a while.
I tried to install the upstream zfs-dkms package for 2.0 to see if I can bisect the issue on upstream versions, but it breaks my boot for some reason I cannot quite figure out. I will continue to experiment and see if I can bisect which version broke it. Looking at the Ubuntu changelog, I'd say the fix for https://bugs.launchpad.net/bugs/1899826 to backport the 5.9 and 5.10 compatibility patches is a prime suspect. I'll copy this info to the Ubuntu Launchpad bug and see if I can chase someone internally at Canonical to pick it up if I don't have enough time to continue debugging.
Side note: I sort of know what I'm doing, in that I'm a Linux software engineer, I dabble in kernel stuff, and I am a very long-time, deeply knowledgeable ZFS user at a user-space level, but my code-level knowledge of ZFS is very basic, so don't mistake any confidence for actual knowledge :)
I ran into this on the 0.8.4-1ubuntu16 packaged with the 5.8.0-36 kernel. I was able to use my zsys snapshots to get back to a good state from before I upgraded.
> Side note: I sort of know what I'm doing, in that I'm a Linux software engineer, I dabble in kernel stuff, and I am a very long-time, deeply knowledgeable ZFS user at a user-space level, but my code-level knowledge of ZFS is very basic, so don't mistake any confidence for actual knowledge :)
Not too different here :). The significant changes came in 0.8.4-1ubuntu13.
zfs-2.0.1 is in hirsute-proposed, so I am going to try that. There's a reasonable chance it will have fixed it, since those patches were probably dropped.
Yeah, all those patches were dropped. Which means the issue is either fixed or upstream.
```
❯ git diff --summary applied/0.8.4-1ubuntu13 applied/2.0.1-1ubuntu1 debian/patches
 delete mode 100644 debian/patches/4000-mount-encrypted-dataset-fix.patch
 delete mode 100644 debian/patches/4520-Linux-5.8-compat-__vmalloc.patch
 delete mode 100644 debian/patches/4521-enable-risc-v-isa.patch
 delete mode 100644 debian/patches/4700-Fix-DKMS-build-on-arm64-with-PREEMPTION-and-BLK_CGRO.patch
 create mode 100644 debian/patches/4701-enable-ARC-FILL-LOCKED-flag.patch
 delete mode 100644 debian/patches/4710-Use-percpu_counter-for-obj_alloc-counter-of-Linux-ba.patch
 delete mode 100644 debian/patches/4720-Linux-5.7-compat-Include-linux-sched.h-in-spl-sys-mu.patch
 delete mode 100644 debian/patches/4800-Linux-5.9-compat-add-linux-blkdev.h-include.patch
 delete mode 100644 debian/patches/4801-Linux-5.9-compat-NR_SLAB_RECLAIMABLE.patch
 delete mode 100644 debian/patches/4802-Linux-5.9-compat-make_request_fn-replaced-with-submi.patch
 delete mode 100644 debian/patches/4803-Increase-Supported-Linux-Kernel-to-5.9.patch
 delete mode 100644 debian/patches/4804-Linux-5.10-compat-frame.h-renamed-objtool.h.patch
 delete mode 100644 debian/patches/4805-Linux-5.10-compat-percpu_ref-added-data-member.patch
 delete mode 100644 debian/patches/4806-Linux-5.10-compat-check_disk_change-removed.patch
 delete mode 100644 debian/patches/4807-Linux-5.10-compat-revalidate_disk_size-added.patch
 delete mode 100644 debian/patches/4808-Linux-5.10-compat-misc.patch
 delete mode 100644 debian/patches/git_fix_dependency_loop_encryption1.patch
 delete mode 100644 debian/patches/git_fix_dependency_loop_encryption2.patch
```
I have not run into this issue since 2.0.2.
Still running smoothly. I think this can be closed.
The issue has appeared again :(
2021 May 16 21:19:09 laptop VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp, SA_HDL_SHARED, &zp->z_sa_hdl)) failed
2021 May 16 21:19:09 laptop PANIC at zfs_znode.c:339:zfs_znode_sa_init()
Linux laptop 5.8.0-45-generic
zfs-2.0.4
zfs-kmod-2.0.4
Is there anything I can do to provide more debug info needed for the fix?
Me too, on 2.0.2 with Ubuntu kernel 5.13.0-12-generic.
Same issue here. Hangs spotted in Skype for Linux, MS Teams, VS Code, IntelliJ IDEA, and Firefox.
```
user@user-laptop:~$ sudo modinfo /lib/modules/5.13.0-14-generic/kernel/zfs/zfs.ko
filename:       /lib/modules/5.13.0-14-generic/kernel/zfs/zfs.ko
version:        2.0.3-8ubuntu6
srcversion:     EEFC177471F615FA0A30B6B
```
Sample stack:
INFO: task skypeforlinux:5627 blocked for more than 362 seconds.
Tainted: P O 5.13.0-14-generic #14-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:skypeforlinux state:D stack: 0 pid: 5627 ppid: 4583 flags:0x00004000
Call Trace:
__schedule+0x268/0x680
schedule+0x4f/0xc0
spl_panic+0xfa/0xfc [spl]
? queued_spin_unlock+0x9/0x10 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_spin_unlock+0x9/0x10 [zfs]
? dmu_buf_replace_user+0x65/0x80 [zfs]
? dmu_buf_set_user+0x13/0x20 [zfs]
? dmu_buf_set_user_ie+0x15/0x20 [zfs]
zfs_znode_sa_init+0xd9/0xe0 [zfs]
zfs_znode_alloc+0x101/0x560 [zfs]
? dmu_buf_unlock_parent+0x5d/0x90 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
? dbuf_read_impl.constprop.0+0x316/0x3e0 [zfs]
? dbuf_rele_and_unlock+0x13b/0x4f0 [zfs]
? __cond_resched+0x1a/0x50
? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
? queued_spin_unlock+0x9/0x10 [zfs]
? __cond_resched+0x1a/0x50
? down_read+0x13/0x90
? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
? queued_spin_unlock+0x9/0x10 [zfs]
? do_raw_spin_unlock+0x9/0x10 [zfs]
? __raw_callee_save___native_queued_spin_unlock+0x15/0x23
? dmu_object_info_from_dnode+0x8e/0xa0 [zfs]
zfs_zget+0x237/0x280 [zfs]
zfs_dirent_lock+0x42a/0x570 [zfs]
zfs_dirlook+0x91/0x2a0 [zfs]
zfs_lookup+0x1fb/0x3f0 [zfs]
zpl_lookup+0xcb/0x230 [zfs]
? step_into+0xf1/0x260
__lookup_slow+0x84/0x150
walk_component+0x141/0x1b0
? path_init+0x2c1/0x3f0
path_lookupat+0x6e/0x1c0
? schedule+0x4f/0xc0
filename_lookup+0xbb/0x1c0
? __check_object_size.part.0+0x128/0x150
? __check_object_size+0x1c/0x20
? strncpy_from_user+0x44/0x150
user_path_at_empty+0x59/0x90
? make_kuid+0x13/0x20
do_faccessat+0x7f/0x1e0
__x64_sys_access+0x1d/0x20
do_syscall_64+0x61/0xb0
? do_syscall_64+0x6e/0xb0
? do_syscall_64+0x6e/0xb0
? exit_to_user_mode_prepare+0x95/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? do_syscall_64+0x6e/0xb0
? do_syscall_64+0x6e/0xb0
? syscall_exit_to_user_mode+0x27/0x50
? __x64_sys_access+0x1d/0x20
? do_syscall_64+0x6e/0xb0
? do_syscall_64+0x6e/0xb0
? do_syscall_64+0x6e/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f195f16983b
RSP: 002b:00007f19367f95e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000015
RAX: ffffffffffffffda RBX: 000055aac1120ea8 RCX: 00007f195f16983b
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000055aac1184690
RBP: 00007f19367fb810 R08: 0000000000000000 R09: 00007f19367fb780
R10: 0000000000000000 R11: 0000000000000206 R12: 000055aac1120de8
R13: 00007f19367fc520 R14: 000055aac1120f50 R15: 000000000000000c
Happens here as well. I fear that this renders my computer unusable beyond a point. Kernel: Ubuntu 5.13.0, zfs 0.8.3-1ubuntu12.12
I believe I have tracked down the cause of this issue to be an Ubuntu-specific ZFS patch and have a reliable reproducer. Full details in https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476
I am not aware at this time of any good way to "fix" the issue on an existing dataset. For now I've just been moving the files into a "broken" directory and trying not to access them.
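For what it's worth, that "move it aside" workaround can be wrapped in a tiny helper. The `quarantine` name, the `~/broken` default, and the timestamp suffix are my own choices, not anything from this thread:

```shell
#!/bin/sh
# Move an affected file or directory aside instead of deleting it,
# since unlinking a broken file can trigger the same panic.
quarantine() {
    src=$1
    dest=${2:-"$HOME/broken"}
    mkdir -p "$dest"
    # Timestamp suffix avoids collisions (e.g. several "Cache" dirs)
    mv -- "$src" "$dest/$(basename "$src").$(date +%s)"
}

# Example (hypothetical path):
# quarantine ~/.config/Mattermost/Cache
```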
Is there an actual fix for this yet that I can apply? I'm concerned about data corruption, and of course my system is pretty much unusable due to this. I read through the launchpad chatter but it's not clear exactly what I should do to fix this today :\
Should I be downgrading to zfsutils-linux=2.0.2-1ubuntu5.2 ?
Kernel: 5.13.0-7614-generic
Module: zfs-kmod-2.0.3-8ubuntu6
If you're using the zfs-dkms package it's fixed in:
Ubuntu builds the ZFS module as part of each kernel release. New kernels are released on a regular 3-week cadence, but one incorporating this fix hasn't been released yet. So for now you can install zfs-dkms to build your own module from the updated source (assuming your zfs-dkms package is one of the above two versions). Within 3 weeks or so there should be an updated kernel with the fix in the pre-built zfs module.
As best I can tell, it will only affect you if you have an encrypted dataset.
Hirsute's current kernel is 5.11.0-37 that does not have the fix. Hopefully -38 will. Impish's current kernel is 5.13.0-16 that does not have the fix either. Hopefully -17 will.
You can verify the zfs version included in your currently running kernel with `modinfo zfs`.
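A sketch of that check with fallbacks (the fallback paths are generic Linux conventions, not from this thread): `modinfo zfs` shows the module the running kernel would load, while `/sys/module/zfs/version` exists only while the module is actually loaded.

```shell
#!/bin/sh
# Report which ZFS module version this system has, trying the
# module metadata first and the loaded-module sysfs node second.
modinfo zfs 2>/dev/null | grep -E '^(version|srcversion):' \
    || cat /sys/module/zfs/version 2>/dev/null \
    || echo "zfs module not found on this system"
```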
@lathiat Thank you for this info. I couldn't use 2.0.2-1ubuntu5.2 since it seems to only support kernels up to 5.10. I went ahead and installed zfs 2.0.6 from source using DKMS, and it looks like I still can't remove files that were previously affected by this, at least not without causing the same zfs_znode_sa_init() panic.
Does this indeed mean that there is permanent corruption of affected files? I saw different opinions on this in the launchpad discussion.
Current zfs versions: zfs-2.0.6-1 zfs-kmod-2.0.6-1
Yeah, for me there is permanent corruption that I can't fix and that scrub doesn't find. I had to move all those files to an unused directory.
Others are having the issue only on boot. I think what basically happens is that the data is corrupted when loaded into the ARC, and that data may or may not get flushed back to disk. For some people it happens on boot and I think it never gets flushed to disk, because their whole / is encrypted; for me only /home is encrypted, so the rest of the system keeps working, and maybe that gives it an opportunity to end up back on disk.
I don't currently have a solution (other than just moving them out of the way into /home/broken) to get rid of the broken files.
looks like the fix has been uploaded to the proposed channel for Ubuntu 21.10
https://launchpad.net/ubuntu/+source/linux/5.13.0-20.20
After upgrading my system from Ubuntu 21.04 (openzfs 2.0.2, Linux 5.11) to 21.10 (openzfs 2.0.6, Linux 5.13.0-19) my system is also affected by this issue.
The fixed kernel is now released. Please upgrade your kernel to 5.13.0-20 and reboot. And try not to use ZFS with the Kernel at all
If you still get the errors after the new kernel it means the corruption got written to the FS and there is no known way to fix that currently. You have to figure out which files are broken and move them somewhere they won’t be accessed. Scrub does not identify it.
@lathiat when you say "try not to use ZFS with the Kernel at all", are you implying that it will always be safer to install and use the zfs-dkms package instead?
> @lathiat when you say "try not to use ZFS with the Kernel at all", are you implying that it will always be safer to install and use the zfs-dkms package instead?
No, I meant just don't use the broken kernel release, as corrupt data can get committed to disk. With the latest kernel on Impish it's all good; no need for the DKMS package now.
Thanks for the response.
I understand that `zpool scrub` isn't going to show any errors for this type of corruption. So, to check if any of the files in a given directory were corrupted, would it be sufficient to run `sudo find . -exec stat {} +` and check if the command returns without getting stuck at any of the files?
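One caveat with that approach: a stat() that hits a broken file blocks in uninterruptible sleep, so a plain `find -exec stat` wedges the whole command at the first bad file, and `timeout` can't reap a D-state child either. A sketch (my own, with an arbitrary ~5-second limit and "SUSPECT" label) that runs each stat in the background, polls it with a deadline, and abandons the ones that hang:

```shell
#!/bin/sh
# Stat each entry in the background and poll /proc for its state.
# Entries whose stat has not finished (exited or turned zombie)
# within ~5s are flagged; the hung process is abandoned, not waited
# on, since a task in state D cannot be killed or reaped.
scan() {
    # Newline-delimited for portability; paths containing newlines
    # would need find -print0 and a shell that supports read -d ''.
    find "${1:-.}" -print | while IFS= read -r f; do
        stat -- "$f" >/dev/null 2>&1 &
        pid=$!
        hung=1
        for _ in $(seq 50); do
            state=$(awk '{print $3}' "/proc/$pid/stat" 2>/dev/null)
            if [ -z "$state" ] || [ "$state" = "Z" ]; then
                hung=0
                break
            fi
            sleep 0.1
        done
        if [ "$hung" = 1 ]; then
            echo "SUSPECT: $f"
        else
            wait "$pid" 2>/dev/null   # reap the finished stat
        fi
    done
}

# Example: scan ~/.cache/google-chrome
```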
This problem is caused by a patch that we don't have upstream. Ubuntu has released a fix for this; see https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1906476
If you hit this - upgrade to kernel 5.13.0-20 or later
Seems like there is a regression in kernel 5.17.5. I got this bug after upgrading to Pop!_OS 22.04; it wasn't a problem before on 20.04.
sec thing helps to be near!!.
I don't know what this means @ineo00048
**System information**

**Describe the problem you're observing**

A PANIC event is logged in dmesg.

**Describe how to reproduce the problem**

Unsure.

**Include any warning/errors/backtraces from the system logs**