Open tonyhutter opened 4 years ago
Using the debug logging from #10944, it appears pyzfs_unittest is causing the warning:
[11009.450658] ZTS running: /usr/share/zfs/zfs-tests/tests/functional/projectquota/projecttree_003_neg
[11009.775878] ZTS running: /usr/share/zfs/zfs-tests/tests/functional/projectquota/cleanup
[11010.554166] ZTS running: /usr/share/zfs/zfs-tests/tests/functional/pyzfs/pyzfs_unittest
[11055.328176] ------------[ cut here ]------------
[11055.334299] WARNING: CPU: 1 PID: 2737432 at fs/read_write.c:517 __kernel_write+0x139/0x160
[11055.345772] Modules linked in: loop zfs(POE) icp(POE) zzstd(POE) zlua(POE) zcommon(POE) zunicode(POE) znvpair(POE) zavl(POE) spl(OE) nfsd auth_rpcgss nfs_acl lockd grace sunrpc intel_rapl_msr intel_rapl_common nfit libnvdimm rapl ppdev ena parport_pc i2c_piix4 parport drm ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel nvme ghash_clmulni_intel nvme_core serio_raw [last unloaded: scsi_debug]
[11055.384359] CPU: 1 PID: 2737432 Comm: python3 Tainted: P OE 5.8.9-200.fc32.x86_64 #1
[11055.396160] Hardware name: Amazon EC2 m5d.large/, BIOS 1.0 10/16/2017
[11055.403065] RIP: 0010:__kernel_write+0x139/0x160
[11055.409123] Code: 4c 89 c6 e8 b9 ce ff ff 49 89 c4 e9 56 ff ff ff 45 31 c9 45 31 c0 b9 01 00 00 00 4c 89 f2 89 ee 4c 89 ef e8 a9 b3 04 00 eb b2 <0f> 0b 49 c7 c4 f7 ff ff ff 5b 5d 4c 89 e0 41 5c 41 5d 41 5e c3 49
[11055.429055] RSP: 0018:ffff9fe6826c77c8 EFLAGS: 00010246
[11055.435396] RAX: 00000000080a801d RBX: ffff8cc2a1348000 RCX: ffff9fe6826c7828
[11055.442610] RDX: 0000000000000138 RSI: ffff8cc1bcefc000 RDI: ffff8cc2a1348000
[11055.449741] RBP: ffff8cc1bcefc000 R08: 0000000000000000 R09: ffff8cc1bcefc138
[11055.456946] R10: ffff9fe6826c7970 R11: 0000000000000000 R12: 0000000000000138
[11055.464095] R13: ffff9fe6826c7828 R14: 0000000000000000 R15: ffff9fe6826c7970
[11055.471300] FS: 00007f0782513740(0000) GS:ffff8cc2a5500000(0000) knlGS:0000000000000000
[11055.482649] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11055.489237] CR2: 00007f078126b040 CR3: 00000001dc096004 CR4: 00000000007606e0
[11055.496383] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[11055.503572] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[11055.510786] PKRU: 55555554
[11055.516039] Call Trace:
[11055.521159] kernel_write+0x74/0x100
[11055.526831] zfs_file_write+0x26/0x70 [zfs]
[11055.532768] dump_bytes_cb+0x1b/0x20 [zfs]
[11055.538675] dump_bytes+0x1e/0x30 [zfs]
[11055.544433] dump_record+0x8e/0x200 [zfs]
[11055.550226] dmu_send_impl+0x3e4/0xbb0 [zfs]
[11055.556072] ? mutex_lock+0xe/0x30
[11055.561551] ? _cond_resched+0x16/0x40
[11055.567205] ? mutex_lock+0xe/0x30
[11055.572756] dmu_send+0x4b2/0x7e0 [zfs]
[11055.578469] ? jbd2_journal_dirty_metadata+0x1db/0x250
[11055.584757] ? ext4_mark_iloc_dirty+0x621/0xa20
[11055.590763] ? add_dirent_to_buf+0x11f/0x1f0
[11055.596635] ? __jbd2_journal_file_buffer+0x6f/0x1a0
[11055.602806] ? __find_get_block+0xb6/0x2f0
[11055.608657] ? __ext4_handle_dirty_metadata+0x51/0x1a0
[11055.614998] zfs_ioc_send_new+0x16e/0x190 [zfs]
[11055.621068] ? dump_bytes_cb+0x20/0x20 [zfs]
[11055.627051] zfsdev_ioctl_common+0x1f8/0x620 [zfs]
[11055.633282] zfsdev_ioctl+0x53/0xe0 [zfs]
[11055.639166] ksys_ioctl+0x82/0xc0
[11055.644703] __x64_sys_ioctl+0x16/0x20
[11055.650337] do_syscall_64+0x52/0x90
[11055.655886] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[11055.662054] RIP: 0033:0x7f0782ab83bb
[11055.667625] Code: 0f 1e fa 48 8b 05 dd aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ad aa 0c 00 f7 d8 64 89 01 48
[11055.687453] RSP: 002b:00007ffd7af35758 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[11055.698528] RAX: ffffffffffffffda RBX: 0000000000005a40 RCX: 00007f0782ab83bb
[11055.705686] RDX: 00007ffd7af35780 RSI: 0000000000005a40 RDI: 0000000000000003
[11055.713036] RBP: 00007ffd7af38d70 R08: 0000000000000000 R09: 000055927a259d90
[11055.720313] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[11055.727499] R13: 00007ffd7af35780 R14: 0000000000005a40 R15: 000055927a27a4b0
[11055.734631] ---[ end trace 34f409ce0516ba0a ]---
[11061.250342] ZTS running: /usr/share/zfs/zfs-tests/tests/functional/quota/setup
[11061.598416] ZTS running: /usr/share/zfs/zfs-tests/tests/functional/quota/quota_001_pos
@aerusso mentioned in #10955 that this bug was also in 0.8.5-staging, and that the failure came from test_send_to_broken_pipe_2
.
It appears were doing a send to sleep 2
's stdin, when its stdin is not open for writing:
def test_send_to_broken_pipe_2(self):
snap = ZFSTest.pool.makeName(b"fs1@snap")
with zfs_mount(ZFSTest.pool.makeName(b"fs1")) as mntdir:
with tempfile.NamedTemporaryFile(dir=mntdir) as f:
for i in range(1024):
f.write(b'x' * 1024)
f.flush()
lzc.lzc_snapshot([snap])
if sys.version_info < (3, 0):
p = subprocess.Popen(['sleep', '2'], stdin=subprocess.PIPE)
with self.assertRaises(lzc_exc.StreamIOError) as ctx:
lzc.lzc_send(snap, None, p.stdin.fileno())
self.assertTrue(ctx.exception.errno == errno.EPIPE or
ctx.exception.errno == errno.EINTR)
else:
with subprocess.Popen(['sleep', '2'], stdin=subprocess.PIPE) as p:
with self.assertRaises(lzc_exc.StreamIOError) as ctx:
lzc.lzc_send(snap, None, p.stdin.fileno())
self.assertTrue(ctx.exception.errno == errno.EPIPE or
ctx.exception.errno == errno.EINTR)
We should check that our destination fd is open for writing before actually doing the send.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
Bumping because not stale and also something similar is cropping up on the GH Actions runners...
[10790.588184] ------------[ cut here ]------------
[10790.588191] WARNING: CPU: 0 PID: 1642667 at fs/read_write.c:542 __kernel_write+0x140/0x170
[10790.588192] Modules linked in: zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) nfsd auth_rpcgss nfs_acl lockd grace sunrpc xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat br_netfilter bridge stp llc aufs overlay binfmt_misc nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev crct10dif_pclmul crc32_pclmul ghash_clmulni_intel hid_generic xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c aesni_intel iptable_security xt_owner crypto_simd bpfilter cryptd glue_helper hid_hyperv serio_raw hyperv_fb hid hyperv_keyboard cfbfillrect cfbimgblt hv_netvsc hv_utils cfbcopyarea hv_balloon pata_acpi sch_fq_codel msr drm drm_panel_orientation_quirks i2c_core ip_tables x_tables autofs4
[10790.588227] CPU: 0 PID: 1642667 Comm: python3 Tainted: P OE 5.8.0-1042-azure #45~20.04.1-Ubuntu
[10790.588228] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[10790.588231] RIP: 0010:__kernel_write+0x140/0x170
[10790.588232] Code: 89 df e8 a3 f6 ff ff 49 89 c4 e9 52 ff ff ff 45 31 c9 45 31 c0 b9 01 00 00 00 4c 89 fa 44 89 ee 4c 89 f7 e8 32 1e 05 00 eb b1 <0f> 0b 49 c7 c4 f7 ff ff ff 5b 4c 89 e0 41 5c 41 5d 41 5e 41 5f 5d
[10790.588233] RSP: 0018:ffff9fe783c57730 EFLAGS: 00010246
[10790.588235] RAX: 00000000080a801d RBX: ffff89f1dc73ce00 RCX: ffff9fe783c577b0
[10790.588235] RDX: 0000000000000138 RSI: ffff89f1578f8e00 RDI: ffff89f1dc73ce00
[10790.588236] RBP: ffff9fe783c57758 R08: 0000000000000000 R09: 0000000000000000
[10790.588237] R10: ffff89f1578f8f38 R11: 0000000000000000 R12: 0000000000000138
[10790.588237] R13: ffff89f1578f8e00 R14: ffff9fe783c577b0 R15: 0000000000000000
[10790.588239] FS: 00007ffbc005f740(0000) GS:ffff89f2b8c00000(0000) knlGS:0000000000000000
[10790.588240] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10790.588241] CR2: 00007ffbbfd496c8 CR3: 00000001c6a94006 CR4: 00000000001706f0
[10790.588244] Call Trace:
[10790.588282] kernel_write+0x69/0xc0
[10790.588374] zfs_file_write_impl+0xe/0x10 [zfs]
[10790.588442] zfs_file_write+0x3b/0x90 [zfs]
[10790.588531] dump_bytes_cb+0x1f/0x30 [zfs]
[10790.588598] dump_bytes+0x30/0x50 [zfs]
[10790.588641] dump_record+0x95/0x210 [zfs]
[10790.588684] dmu_send_impl+0x405/0xc80 [zfs]
[10790.588723] ? dbuf_rele+0x4f/0x80 [zfs]
[10790.588767] dmu_send+0x4ff/0x870 [zfs]
[10790.588771] ? __ext4_journal_get_write_access+0x7a/0x120
[10790.588778] ? _cond_resched+0x19/0x30
[10790.588781] ? __kmalloc_node+0x213/0x330
[10790.588789] ? i_get_value_size+0x1d/0x1c0 [znvpair]
[10790.588793] ? nvpair_value_common+0xdd/0x180 [znvpair]
[10790.588858] zfs_ioc_send_new+0x16d/0x1d0 [zfs]
[10790.588924] ? dump_bytes_cb+0x30/0x30 [zfs]
[10790.588992] zfsdev_ioctl_common+0x217/0x6c0 [zfs]
[10790.589061] zfsdev_ioctl+0x57/0xe0 [zfs]
[10790.589065] ksys_ioctl+0x9d/0xd0
[10790.589066] __x64_sys_ioctl+0x1a/0x20
[10790.589069] do_syscall_64+0x4c/0xd0
[10790.589072] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[10790.589074] RIP: 0033:0x7ffbc034250b
[10790.589076] Code: 0f 1e fa 48 8b 05 85 39 0d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 55 39 0d 00 f7 d8 64 89 01 48
[10790.589077] RSP: 002b:00007fff3efcc078 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[10790.589078] RAX: ffffffffffffffda RBX: 0000000000005a40 RCX: 00007ffbc034250b
[10790.589079] RDX: 00007fff3efcc0a0 RSI: 0000000000005a40 RDI: 0000000000000003
[10790.589079] RBP: 00007fff3efcf690 R08: 0000000000000000 R09: 000000000110ef60
[10790.589080] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[10790.589081] R13: 00007fff3efcc0a0 R14: 0000000000005a40 R15: 0000000001108b90
[10790.589083] ---[ end trace 09eedf53924b4280 ]---
It (or something very similar) is still turning up on Fedora 35, though not apparently breaking anything ATM:
[ 7422.103439] ------------[ cut here ]------------
[ 7422.103689] WARNING: CPU: 0 PID: 4080086 at fs/read_write.c:521 __kernel_write+0x278/0x2c0
[ 7422.103917] Modules linked in: loop rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set nf_tables nfnetlink qrtr sunrpc binfmt_misc zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) intel_rapl_msr spl(OE) intel_rapl_common joydev i2c_piix4 vboxguest fuse zram ip_tables xfs crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel vmwgfx drm_kms_helper cec ttm virtio_net drm ata_generic net_failover pata_acpi failover serio_raw video [last unloaded: scsi_debug]
[ 7422.105877] CPU: 0 PID: 4080086 Comm: python3.10 Kdump: loaded Tainted: P OE 5.16.12-200.fc35.x86_64 #1
[ 7422.106395] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 7422.106651] RIP: 0010:__kernel_write+0x278/0x2c0
[ 7422.106911] Code: 23 97 48 c7 c7 80 0f ff 97 e8 14 4d 38 00 85 c0 0f 85 c5 94 8e 00 48 83 c4 68 49 c7 c5 ea ff ff ff 4c 89 e8 5b 41 5c 41 5d c3 <0f> 0b 49 c7 c5 f7 ff ff ff 48 83 c4 68 4c 89 e8 5b 41 5c 41 5d c3
[ 7422.107740] RSP: 0018:ffffae38d0527748 EFLAGS: 00010246
[ 7422.108005] RAX: 00000000000a801d RBX: ffff9be29c88ea00 RCX: ffffae38d0527800
[ 7422.108315] RDX: 0000000000000138 RSI: ffff9be300687c00 RDI: ffff9be29c88ea00
[ 7422.108577] RBP: ffff9be300687c00 R08: 0000000000000138 R09: ffff9be36ae6e900
[ 7422.108835] R10: ffff9be333c36e40 R11: 0000000000000000 R12: ffff9be282d4e000
[ 7422.109102] R13: 0000000000000138 R14: ffffae38d0527800 R15: ffffae38d0527948
[ 7422.109362] FS: 00007ff6458c3740(0000) GS:ffff9be397c00000(0000) knlGS:0000000000000000
[ 7422.109625] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7422.109886] CR2: 00007fd5691bb240 CR3: 0000000138d0e000 CR4: 00000000000506f0
[ 7422.110181] Call Trace:
[ 7422.110449] <TASK>
[ 7422.110708] ? inode_security+0x22/0x60
[ 7422.110973] ? selinux_file_permission+0x104/0x150
[ 7422.111248] ? security_file_permission+0x28/0x50
[ 7422.111518] kernel_write+0x62/0x140
[ 7422.111789] zfs_file_write+0x26/0x70 [zfs]
[ 7422.112190] dump_record+0x8e/0x180 [zfs]
[ 7422.112512] dmu_send_impl+0x67a/0x1430 [zfs]
[ 7422.112814] ? preempt_count_add+0x68/0xa0
[ 7422.113119] dmu_send+0x4ab/0x7d0 [zfs]
[ 7422.113414] ? __wake_up_common_lock+0x7a/0x90
[ 7422.113670] ? __kmalloc_node+0x170/0x390
[ 7422.113919] ? preempt_count_add+0x68/0xa0
[ 7422.114193] ? _raw_spin_lock_irqsave+0x26/0x50
[ 7422.114435] ? _raw_spin_unlock_irqrestore+0x25/0x40
[ 7422.114665] ? get_random_u32+0x5e/0xb0
[ 7422.114892] ? allocate_slab+0x32d/0x490
[ 7422.115119] zfs_ioc_send_new+0x16b/0x1c0 [zfs]
[ 7422.115376] ? zfs_ioc_send+0x2a0/0x2a0 [zfs]
[ 7422.115617] zfsdev_ioctl_common+0x3f2/0x950 [zfs]
[ 7422.115855] ? __check_object_size+0x56/0x150
[ 7422.116059] zfsdev_ioctl+0x53/0xe0 [zfs]
[ 7422.116288] __x64_sys_ioctl+0x82/0xb0
[ 7422.116473] do_syscall_64+0x3b/0x90
[ 7422.116658] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 7422.116832] RIP: 0033:0x7ff645aa931b
[ 7422.117077] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d dd 2a 0f 00 f7 d8 64 89 01 48
[ 7422.117646] RSP: 002b:00007ffe94080c98 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 7422.117840] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff645aa931b
[ 7422.118042] RDX: 00007ffe94080cc0 RSI: 0000000000005a40 RDI: 0000000000000003
[ 7422.118234] RBP: 00007ffe940842b0 R08: 0000000000000000 R09: 000055ece988d2a0
[ 7422.118421] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000005a40
[ 7422.118605] R13: 00007ffe94080cc0 R14: 0000000000005a40 R15: 000055ece988d2a0
[ 7422.118788] </TASK>
[ 7422.119007] ---[ end trace e9cb408b4554ff41 ]---
System information
Describe the problem you're observing
@behlendorf pointed me to a kernel warning we're seeing on the latest Fedora 32 builders: http://build.zfsonlinux.org/builders/Fedora%2032%20x86_64%20%28TEST%29/builds/1199/steps/shell_9/logs/console
He thinks it may be related to: https://github.com/torvalds/linux/commit/a01ac27be
Maybe we don't have
FMODE_WRITE
set for our ioctl?Describe how to reproduce the problem
Run the test suite and see problem in dmesg. It's possibly related to one of the failures: http://build.zfsonlinux.org/builders/Fedora%2032%20x86_64%20(TEST)/builds/1199/steps/shell_9/logs/summary
Include any warning/errors/backtraces from the system logs