Closed ismell closed 2 years ago
I see I should have been more precise.
I have two identical copies of a Gentoo root filesystem running on identical VMs on an identical host, with identical kernels. With Gentoo kernel booting an Ubuntu 18.04 userland and then chrooted into one copy, it does not reproduce, ever. With Gentoo kernel booting into a Gentoo userland and then chrooted into the other copy, it does.
I'm hitting a different issue since running with the patch committed to master on top of 2.1.1 (note that this didn't seem to happen with https://github.com/openzfs/zfs/commit/93e19c7b35b62c238839536fed0d0f23c0466136 when I was running it in the past, but I've not (yet?) tried downgrading to it):
dracut: *** Installing kernel module dependencies ***
depmod: ERROR: failed to load symbols from /var/tmp/portage/sys-fs/zfs-2.1.1-r3/temp/dracut.1BldCZ/initramfs/lib/modules/5.14.17-adry/extra/unicode/zunicode.ko: Exec format error
depmod: ERROR: failed to load symbols from /var/tmp/portage/sys-fs/zfs-2.1.1-r3/temp/dracut.1BldCZ/initramfs/lib/modules/5.14.17-adry/extra/nvpair/znvpair.ko: Exec format error
depmod: ERROR: failed to load symbols from /var/tmp/portage/sys-fs/zfs-2.1.1-r3/temp/dracut.1BldCZ/initramfs/lib/modules/5.14.17-adry/extra/spl/spl.ko: Exec format error
depmod: ERROR: failed to load symbols from /var/tmp/portage/sys-fs/zfs-2.1.1-r3/temp/dracut.1BldCZ/initramfs/lib/modules/5.14.17-adry/extra/zcommon/zcommon.ko: Exec format error
depmod: ERROR: failed to load symbols from /var/tmp/portage/sys-fs/zfs-2.1.1-r3/temp/dracut.1BldCZ/initramfs/lib/modules/5.14.17-adry/extra/zfs/zfs.ko: Exec format error
dracut: *** Installing kernel module dependencies done ***
dracut: *** Resolving executable dependencies ***
The same messages occur for other kernel modules, like libata.ko.
Running file on a bad module yields:
# file /lib/modules/5.14.17-adry/kernel/drivers/ata/libata.ko
/lib/modules/5.14.17-adry/kernel/drivers/ata/libata.ko: data
Running file on a good module (from a previous kernel build) yields:
# file /lib/modules/5.14.16-adry/kernel/drivers/ata/libata.ko
/lib/modules/5.14.16-adry/kernel/drivers/ata/libata.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), BuildID[sha1]=34d7fc7363ebace6b3bd5f1c6d124b8909da6806, not stripped
If I use a tmpfs to build the kernel and ZFS, these errors do not occur. This is all with coreutils 9 too. I'll try to reproduce with no-tmpfs + older coreutils now.
EDIT: If I downgrade coreutils, the modules aren't corrupted. EDIT: This seems to happen reliably (read: every time so far) with coreutils-9.0 and not at all with coreutils-8.32.
The modules are broken during the process of copying from the 'work dir' (where building is done) to the image. Same issue as with Go, IIRC.
# file /var/tmp/portage/sys-fs/zfs-kmod-2.1.1-r1/work/zfs-2.1.1/module/zstd/zzstd.ko
/var/tmp/portage/sys-fs/zfs-kmod-2.1.1-r1/work/zfs-2.1.1/module/zstd/zzstd.ko: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), BuildID[sha1]=1801c62400d21bcaea7409e1eca7126133252bea, not stripped
# file /var/tmp/portage/sys-fs/zfs-kmod-2.1.1-r1/image/lib/modules/5.14.17-adry/extra/zstd/zzstd.ko
/var/tmp/portage/sys-fs/zfs-kmod-2.1.1-r1/image/lib/modules/5.14.17-adry/extra/zstd/zzstd.ko: data
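For context on how that copy breaks: coreutils 9.0's cp detects holes with lseek(SEEK_DATA)/lseek(SEEK_HOLE) instead of reading zeros. Below is a minimal Python sketch of that strategy (the sparse_copy helper and the /tmp paths are hypothetical, and this is a simplification, not coreutils' actual code); it shows why a wrong "no data" answer from the filesystem yields an empty destination.

```python
import errno
import os

def sparse_copy(src_path, dst_path):
    # Hole-skipping copy in the style of coreutils 9.0's cp. Simplified
    # sketch only, NOT the real implementation.
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        os.ftruncate(dst, size)  # destination keeps the apparent size
        offset = 0
        while offset < size:
            try:
                data = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    # "No data at or after offset." If the filesystem wrongly
                    # answers this for a file full of data, nothing is copied
                    # and the destination is one big hole, which file(1)
                    # then reports as just "data".
                    break
                raise
            hole = os.lseek(src, data, os.SEEK_HOLE)
            os.lseek(src, data, os.SEEK_SET)
            os.lseek(dst, data, os.SEEK_SET)
            os.write(dst, os.read(src, hole - data))
            offset = hole
    finally:
        os.close(src)
        os.close(dst)

with open("/tmp/demo_src", "wb") as f:  # hypothetical demo path
    f.write(b"x" * 65536)
sparse_copy("/tmp/demo_src", "/tmp/demo_dst")
src_bytes = open("/tmp/demo_src", "rb").read()
dst_bytes = open("/tmp/demo_dst", "rb").read()
```

On a healthy filesystem the loop copies every data extent and the two files match byte for byte; the failure mode described in this thread is the ENXIO branch firing spuriously at offset 0.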
EDIT: Summary so far: this occurs with zfs_dmu_offset_next_sync=1. I was suspecting it couldn't even occur with zfs_dmu_offset_next_sync=0, but on finally building the kernel again, it failed once. file is not a completely reliable test, as it may report a file as ELF even if the module then goes on to segfault (so the corruption isn't always in the same region?). I should've tried the PR before it was merged, but the PR didn't seem to apply (when I tried) to 2.1.1, and I was happy with the commit I mentioned, given it seemed to be working for me. Apologies.
Time to wake up the experiment VM again...
(Maybe #12710 would be the safe option for shipping 2.1.2/2.0.7 after all?)
One curious thing I just had happen.
Running 2.1.1 with that patch cherry-picked, I did an emerge of smartmontools, then asked for file [tmp path to sbin]; du [tmp path to sbin]
and got back:
/var/portage_tmp/portage/sys-apps/smartmontools-7.2-r1/image/usr/sbin/smartctl: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, stripped
/var/portage_tmp/portage/sys-apps/smartmontools-7.2-r1/image/usr/sbin/smartd: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, stripped
/var/portage_tmp/portage/sys-apps/smartmontools-7.2-r1/image/usr/sbin/update-smart-drivedb: POSIX shell script, ASCII text executable
1 /var/portage_tmp/portage/sys-apps/smartmontools-7.2-r1/image/usr/sbin/smartctl
1 /var/portage_tmp/portage/sys-apps/smartmontools-7.2-r1/image/usr/sbin/smartd
!?
So I ran du again and got:
459 /var/portage_tmp/portage/sys-apps/smartmontools-7.2-r1/image/usr/sbin/smartctl
So there's at least a race in the reported size if you just cherry-pick that patch onto 2.1.1, I guess...
(This could be pre-existing, I just noticed it and went "...that's very weird.")
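For reference, du reports allocated blocks (st_blocks, in 512-byte units), not the apparent size (st_size), and on ZFS blocks are allocated when the transaction group syncs, so st_blocks can briefly lag far behind st_size for a freshly written file. That lag would make two back-to-back du runs disagree like the 1 vs. 459 above. A quick illustration of the two numbers (hypothetical /tmp path):

```python
import os

# du reports allocated blocks (st_blocks * 512 bytes), not st_size. A file
# whose dirty data hasn't been synced to disk yet can report far fewer
# allocated blocks than its apparent size suggests.
path = "/tmp/du_demo"  # hypothetical demo path
with open(path, "wb") as f:
    f.write(b"y" * (450 * 1024))

st = os.stat(path)
print("apparent size (bytes):", st.st_size)
print("du-style size (KiB):  ", st.st_blocks * 512 // 1024)
```

ls -l shows st_size; du shows the block count, which is the number that was racing here.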
Yup, reproduced: permanently mangled files (mangled differently each time) with zfs_dmu_offset_next_sync=1 (though not 0, so far, so that's something) and ebuild zfs-kmod-2.1.1.ebuild clean install with the portage tmpdir on ZFS.
The new files are entirely empty:
# ~rich/zfs_cp/cmd/zdb/zdb -dbdbdbdbdbdb testpool/portage_tmp 112048
Dataset testpool/portage_tmp [ZPL], ID 259, cr_txg 12, 981M, 149884 objects, rootbp DVA[0]=<0:341076600:200> DVA[1]=<0:380095400:200> [L0 DMU objset] fletcher4 lz4 unencrypted LE contiguous unique double size=1000L/200P birth=64564L/64564P fill=149884 cksum=1ae04a529f:73bebec3954:132f6162a5765:26758bb97eb8b2
Object lvl iblk dblk dsize dnsize lsize %full type
112048 1 128K 128K 0 512 128K 0.00 ZFS plain file (K=inherit) (Z=inherit=lz4)
176 bonus System attributes
dnode flags: USERUSED_ACCOUNTED USEROBJUSED_ACCOUNTED
dnode maxblkid: 0
path /portage/sys-fs/zfs-kmod-2.1.1/image/lib/modules/5.10.74-gentoo/extra/lua/zlua.ko
uid 0
gid 0
atime Mon Nov 8 04:47:42 2021
mtime Mon Nov 8 04:47:41 2021
ctime Mon Nov 8 04:47:41 2021
crtime Mon Nov 8 04:47:41 2021
gen 64561
mode 100644
size 320496
parent 111907
links 1
pflags 840800000004
Indirect blocks:
#
(and stay that way even across a pool export+import, this isn't just "zdb is catching state not being flushed yet".)
My guess is whatever is resulting in short sizes reported also results in seeing an empty length when trying to decide how much to copy, but that's just a guess.
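The zdb dump above shows the on-disk signature of that state: the apparent size survives (size 320496) while no data blocks exist (dsize 0, no indirect blocks). From userspace that is indistinguishable from a legitimately all-sparse file, so a hole-aware copier sees nothing to copy, as this sketch illustrates (hypothetical /tmp path):

```python
import errno
import os

# Nonzero apparent size, zero allocated blocks: the same shape as the zdb
# dump above (size 320496, dsize 0). A hole-aware copier sees no data at all.
path = "/tmp/all_hole_demo"  # hypothetical demo path
fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
os.ftruncate(fd, 320496)

st = os.fstat(fd)
print("size:", st.st_size, "blocks:", st.st_blocks)

# SEEK_DATA finds nothing, so a hole-skipping copy would write zero bytes.
try:
    os.lseek(fd, 0, os.SEEK_DATA)
    print("data found (filesystem does not report holes)")
except OSError as e:
    if e.errno == errno.ENXIO:
        print("ENXIO: no data anywhere; the copy would come out empty")
    else:
        raise
os.close(fd)
```

The difference is that here the file really is a hole; in the bug, a file full of dirty, not-yet-synced data answered the same way.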
@thesamesam reported (IIRC) that he couldn't reproduce this before 93e19c7 in limited testing, with either setting, so this seems like a worse outcome...
I'm...going to test it after I've exhaustively convinced myself this doesn't reproduce with =0, but unless that's broken in novel ways I'm not expecting, I'm pretty strongly going to suggest reverting 93e19c7 and pulling in #12710 until this gets run down more.
Gonna nominate this for the top of the bug blitz on Tuesday unless someone knocks it out first. :D
One possible explanation for this would be the slight change to how we're checking for a dirty dnode. Could someone who is already set up to test this try the following small change (on top of the previous fix)? It effectively reverts the dirty check to the previous logic while preserving the rest of the mmap fix.
diff --git a/module/zfs/dnode.c b/module/zfs/dnode.c
index 6f87f49f8..db1a5d71d 100644
--- a/module/zfs/dnode.c
+++ b/module/zfs/dnode.c
@@ -1657,7 +1657,7 @@ dnode_is_dirty(dnode_t *dn)
 	mutex_enter(&dn->dn_mtx);
 	for (int i = 0; i < TXG_SIZE; i++) {
-		if (list_head(&dn->dn_dirty_records[i]) != NULL) {
+		if (multilist_link_active(&dn->dn_dirty_link[i])) {
 			mutex_exit(&dn->dn_mtx);
 			return (B_TRUE);
 		}
@behlendorf I haven't been able to reproduce the issue with this new patch so far. Previously it would reliably happen within 3-4 attempts with one case and every attempt in the other cases, so it looks promising. I'll keep running with it.
Well, my spreadsheet says "2/10 builds of smartmontools with dmu_sync=1 and current master corrupted (and 10/10 for zfs-kmod)" versus 0/10 and 0/10 with the patch, so that seems promising...
(Space is still wrong for all the zfs-kmod runs and none of the smartmontools runs, so that's unfortunate, but for another day.)
edit to add: I think I'd advocate defaulting to dmu_offset_next_sync=1 for the tunable, not 0, since "this feature works but it's stubbed out silently by default" will rapidly become "this feature doesn't work", IMO...
Thanks for the quick testing. Do I understand correctly, then, that with the patch in #12745 you're no longer able to reproduce the issue?
@rincebrain I tend to agree about the default. This was originally done to minimize the performance impact when checking for holes, but I think in practice it ended up being too confusing. I've gone ahead and opened #12746 to change the default.
I can absolutely not reproduce it any more with the patch in #12745 on any case I've tested.
Same here -- no issues with master + #12745. No opinion on the default although it may be worth revisiting it once things have calmed down.
For sure, we need tests to cover these scenarios going forward, as they are hard to reproduce on live systems.
Bit of an out-there idea: it would be interesting to see if we could get a CI box fast enough to do emerges (in Gentoo) that could hit such problems.
Of course, a more isolated test case would be great, even if it were just calling cp in a loop?
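Something along those lines might look like this (a hypothetical Python stress loop with made-up /tmp paths; on a healthy filesystem no iteration should ever mismatch):

```python
import os
import subprocess

# Write a payload, copy it immediately with coreutils cp (which has used
# SEEK_DATA/SEEK_HOLE since 9.0), and verify the copy, repeatedly. On an
# affected ZFS dataset with zfs_dmu_offset_next_sync=1, some iterations
# should come out as all-hole copies; elsewhere corrupted stays empty.
SRC, DST = "/tmp/repro_src", "/tmp/repro_dst"  # hypothetical paths
payload = os.urandom(1 << 20)

corrupted = []
for i in range(25):
    with open(SRC, "wb") as f:
        f.write(payload)
    subprocess.run(["cp", SRC, DST], check=True)
    with open(DST, "rb") as f:
        if f.read() != payload:
            corrupted.append(i)

print("corrupted iterations:", corrupted)
```

The write-then-immediately-copy pattern is what emerge does when moving build artifacts into the image, which is presumably why portage builds hit this so reliably.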
It would be very interesting to work out what actually happens in the case of an emerge and build a test case around that, rather than just guesstimating what could be wrong in the code and fixing every possible slip, probably introducing regressions.
I presume you mean something much more constrained than what I'm taking from your comment, otherwise you seem to be proposing solving a somewhat general open AI problem in order to avoid this regression.
I mean that if we know where the bug is in the code, we won't "fix" other parts that merely look like the source of the bug (of course, we might find bugs in other places, but they'd be irrelevant to the bug being chased) and could therefore introduce regressions if we're wrong.
As a bonus, we'd have a real-world test case to implement, instead of guesstimating what a test case for this bug should look like.
This can probably be closed now, unless someone's reproduced it since the fixes?
System information
Describe the problem you're observing
lseek with SEEK_DATA is randomly returning ENXIO.
Code showing the problem: https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src/third_party/portage_tool/src/portage_util_file_copy_reflink_linux.c;l=145
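For what it's worth, lseek(2) defines ENXIO for SEEK_DATA as the normal answer when no data exists at or after the given offset; the anomaly here is getting it at offset 0 for a file that clearly has data. A hedged sketch of telling the two apart (the seek_data_checked helper and /tmp path are hypothetical, not the portage code linked above):

```python
import errno
import os

def seek_data_checked(fd, offset):
    # lseek(SEEK_DATA) with a plausibility check. Hypothetical helper, not
    # the logic from portage_util_file_copy_reflink_linux.c. Per lseek(2),
    # ENXIO is legitimate when no data exists at or after offset (a trailing
    # hole, or offset at/past EOF); ENXIO at offset 0 for a file that has
    # blocks allocated is the inconsistency described in this issue.
    try:
        return os.lseek(fd, offset, os.SEEK_DATA)
    except OSError as e:
        if e.errno != errno.ENXIO:
            raise
        st = os.fstat(fd)
        if offset == 0 and st.st_blocks > 0:
            raise RuntimeError("SEEK_DATA claims no data, but blocks are allocated")
        return None  # legitimately nothing left to copy

# Demo on an ordinary file: data starts at 0; past EOF yields None.
with open("/tmp/seek_demo", "wb") as f:  # hypothetical demo path
    f.write(b"z" * 8192)
fd = os.open("/tmp/seek_demo", os.O_RDONLY)
r_start = seek_data_checked(fd, 0)
r_past_eof = seek_data_checked(fd, 1 << 20)
os.close(fd)
print(r_start, r_past_eof)
```

A copier could use a check like this to fail loudly instead of silently producing an empty destination, though it can't distinguish the bug from a genuinely all-sparse file with no allocated blocks.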
I patched it with the following:
On a successful run, this is what the strace looks like (depthcharge.elf):
When a failure happens this is what it looks like (netboot.elf):
man lseek shows:
Describe how to reproduce the problem
I don't have a simple way to reproduce this yet. It happens as part of a portage build. I have attached a full strace log. The problem shows up randomly, so running the same command multiple times will cause different files to fail to copy.
Include any warning/errors/backtraces from the system logs
depthcharge-strace-patch1.log