Open Lorac opened 1 month ago
@snajpa it looks like you deleted your comment, but I'll post the properties of my pool anyway. Maybe it will help; it looks like project_quota is active.
NAME PROPERTY VALUE SOURCE
a size 884G -
a capacity 85% -
a altroot - default
a health ONLINE -
a guid 8296607161116845554 -
a version - default
a bootfs - default
a delegation on default
a autoreplace off default
a cachefile - default
a failmode wait default
a listsnapshots on local
a autoexpand off default
a dedupratio 1.00x -
a free 130G -
a allocated 754G -
a readonly off -
a ashift 12 local
a comment - default
a expandsize - -
a freeing 0 -
a fragmentation 64% -
a leaked 0 -
a multihost off default
a checkpoint - -
a load_guid 14949411101328869406 -
a autotrim off default
a compatibility off default
a bcloneused 0 -
a bclonesaved 0 -
a bcloneratio 1.00x -
a feature@async_destroy enabled local
a feature@empty_bpobj active local
a feature@lz4_compress active local
a feature@multi_vdev_crash_dump enabled local
a feature@spacemap_histogram active local
a feature@enabled_txg active local
a feature@hole_birth active local
a feature@extensible_dataset active local
a feature@embedded_data active local
a feature@bookmarks enabled local
a feature@filesystem_limits enabled local
a feature@large_blocks enabled local
a feature@large_dnode enabled local
a feature@sha512 enabled local
a feature@skein enabled local
a feature@edonr enabled local
a feature@userobj_accounting active local
a feature@encryption enabled local
a feature@project_quota active local
a feature@device_removal enabled local
a feature@obsolete_counts enabled local
a feature@zpool_checkpoint enabled local
a feature@spacemap_v2 active local
a feature@allocation_classes enabled local
a feature@resilver_defer enabled local
a feature@bookmark_v2 enabled local
a feature@redaction_bookmarks enabled local
a feature@redacted_datasets enabled local
a feature@bookmark_written enabled local
a feature@log_spacemap active local
a feature@livelist enabled local
a feature@device_rebuild enabled local
a feature@zstd_compress active local
a feature@draid disabled local
a feature@zilsaxattr disabled local
a feature@head_errlog disabled local
a feature@blake3 disabled local
a feature@block_cloning disabled local
a feature@vdev_zaps_v2 disabled local
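For reference, output in this NAME / PROPERTY / VALUE / SOURCE form is what zpool get prints, so the listing above was presumably produced by something like the following (the pool name "a" is taken from the listing itself):

zpool get all a
# or just the one feature flag in question:
zpool get feature@project_quota a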
Yeah, sorry, I was reading through the code and my fingers were faster than my brain; I thought I saw something related to project quota, but I ruled it out after a bit :-D
I think this could fit what I'm trying to fix in #16625, but that fix won't help in this case, where there's a znode without an initialized SA that is somehow still linked to a directory.
I tried figuring out a way to call zpl_xattr_security_init for the znode, but I'd really need the parent directory inode, and I can't get to that in zfs_znode_sa_init. Perhaps some way could be found by using thread-specific data to mark what we're looking for in zpl_lookup; if I'm right, and if the file in question turns out to be really important for you, we can work on that later.
For now, please make a snapshot of the affected dataset and then let's try the patch below, to see if it enables you to move forward without a panic - it should evict the inode and return -EIO on access to that file.
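Taking the snapshot is a single command; the dataset and snapshot names below are placeholders, not something from this thread:

zfs snapshot tank/affected@before-eio-patch
# confirm it exists:
zfs list -t snapshot tank/affected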
If I'm right, the file was created with O_TMPFILE and perhaps power was lost, or the kernel crashed in the wrong section; otherwise I can't explain how we end up with a znode without any SA at all.
The patch (should be applied to current master - 75dda92dc):
diff --git a/module/os/linux/zfs/zfs_znode_os.c b/module/os/linux/zfs/zfs_znode_os.c
index f13edf95b..025f2482a 100644
--- a/module/os/linux/zfs/zfs_znode_os.c
+++ b/module/os/linux/zfs/zfs_znode_os.c
@@ -323,7 +323,7 @@ zfs_cmpldev(uint64_t dev)
return (dev);
}
-static void
+static int
zfs_znode_sa_init(zfsvfs_t *zfsvfs, znode_t *zp,
dmu_buf_t *db, dmu_object_type_t obj_type, sa_handle_t *sa_hdl)
{
@@ -334,8 +334,11 @@ zfs_znode_sa_init(zfsvfs_t *zfsvfs, znode_t *zp,
ASSERT(zp->z_sa_hdl == NULL);
ASSERT(zp->z_acl_cached == NULL);
if (sa_hdl == NULL) {
- VERIFY(0 == sa_handle_get_from_db(zfsvfs->z_os, db, zp,
- SA_HDL_SHARED, &zp->z_sa_hdl));
+ if (0 != sa_handle_get_from_db(zfsvfs->z_os, db, zp,
+ SA_HDL_SHARED, &zp->z_sa_hdl)) {
+ zfs_dbgmsg("sa_handle_get_from_db failed");
+ return (1);
+ }
} else {
zp->z_sa_hdl = sa_hdl;
sa_set_userp(sa_hdl, zp);
@@ -344,6 +347,7 @@ zfs_znode_sa_init(zfsvfs_t *zfsvfs, znode_t *zp,
zp->z_is_sa = (obj_type == DMU_OT_SA) ? B_TRUE : B_FALSE;
mutex_exit(&zp->z_lock);
+ return (0);
}
void
@@ -538,7 +542,11 @@ zfs_znode_alloc(zfsvfs_t *zfsvfs, dmu_buf_t *db, int blksz,
zp->z_sync_writes_cnt = 0;
zp->z_async_writes_cnt = 0;
- zfs_znode_sa_init(zfsvfs, zp, db, obj_type, hdl);
+ int fail = zfs_znode_sa_init(zfsvfs, zp, db, obj_type, hdl);
+ if (fail) {
+ iput(ip);
+ return (SET_ERROR(EIO));
+ }
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_MODE(zfsvfs), NULL, &mode, 8);
SA_ADD_BULK_ATTR(bulk, count, SA_ZPL_GEN(zfsvfs), NULL, &tmp_gen, 8);
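For anyone following along, this is not from the thread, just one common way to apply and build such a patch from a git checkout of OpenZFS (the patch file name is made up; adjust the build and install steps to your distro or DKMS setup):

git checkout 75dda92dc
git apply sa-init-eio.patch   # the diff above, saved to a file
./autogen.sh
./configure
make -j$(nproc)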
I initially tried deleting a corrupted file and directory; when that failed, I destroyed and recreated two ZFS datasets, which resolved my issue. The datasets weren't critical (mostly build directories), but I needed them to continue working. The first time I encountered the error was from a directory which can't be a
At the time, I didn’t realize I should have taken a snapshot, assuming a corrupted file or directory wouldn’t be included in a snapshot. I am relatively new to ZFS.
The problem first appeared when deleting the directory: a specific file caused ZFS to panic. I opened a second SSH connection to my workstation and retried, but I encountered an EIO error instead of the panic. This resulted in a CPU hang on rm and 15 GB of logs.
At this point, I'm not sure if the patch will make a difference, as I've already recreated the datasets. I don't know whether the files I was trying to delete were created with O_TMPFILE, since they had been there for a while. They are part of a build system that typically doesn't create temporary files of the type I was trying to delete. The second dataset could have used O_TMPFILE, as it was populated by a C++ build with -pipe, maybe.
The drive they’re on has 68,000 power-on hours (SSD). I need help analyzing the smartctl output; you likely have more expertise interpreting that data than I do.
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 1
5 Reallocate_NAND_Blk_Cnt 0x0032 100 100 010 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 68383
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 198
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
173 Ave_Block-Erase_Count 0x0032 018 018 000 Old_age Always - 1642
174 Unexpect_Power_Loss_Ct 0x0032 100 100 000 Old_age Always - 100
180 Unused_Reserve_NAND_Blk 0x0033 000 000 000 Pre-fail Always - 5589
183 SATA_Interfac_Downshift 0x0032 100 100 000 Old_age Always - 0
184 Error_Correction_Count 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 076 048 000 Old_age Always - 24 (Min/Max 17/52)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_ECC_Cnt 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 1
202 Percent_Lifetime_Remain 0x0030 018 018 001 Old_age Offline - 82
206 Write_Error_Rate 0x000e 100 100 000 Old_age Always - 0
210 Success_RAIN_Recov_Cnt 0x0032 100 100 000 Old_age Always - 0
246 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 278295250114
247 Host_Program_Page_Count 0x0032 100 100 000 Old_age Always - 8696884170
248 FTL_Program_Page_Count 0x0032 100 100 000 Old_age Always - 46386914208
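The attribute table above is in the format smartctl prints for SATA drives, presumably from something along these lines (the device node is a placeholder):

smartctl -A /dev/sdX   # attribute table only
smartctl -a /dev/sdX   # full report, including the error log and self-test history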
I actually got the panic again on another, more critical dataset. It's already in my snapshots, so I'll try the patch. This happened during a scan with ncdu.
Curious, is your xattr property on the affected dataset =sa or =on?
xattr is on on the affected dataset.
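Checking that is a one-liner; the dataset name below is a placeholder:

zfs get xattr tank/affected
# or recursively for everything in the pool:
zfs get -r xattr tank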
@snajpa so I had a bad memory stick; I swapped in new sticks and now there are no more problems. So it's not a ZFS bug per se, I guess?
If you can't reproduce it after a HW change (and no SW upgrade), then that must have been it :)
System information
Describe the problem you're observing
I am experiencing a kernel panic when navigating to a ZFS dataset or copying data to it. The issue persists even after upgrading: initially I was on ZFS 2.1 with kernel 6.1 when the problem occurred, and I then upgraded both ZFS and the kernel to the latest versions available from the bookworm backports, but the issue remains.
Describe how to reproduce the problem
The kernel panic started when I used the "z" plugin in oh-my-zsh to navigate directories within a ZFS dataset. The panic also occurs when trying to rsync a directory from an ext4 filesystem to a ZFS dataset.
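As a rough illustration of the second trigger (paths are placeholders, not from the report):

rsync -a /mnt/ext4-source/ /tank/dataset/target/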
Include any warning/errors/backtraces from the system logs