openzfs / zfs

OpenZFS on Linux and FreeBSD
https://openzfs.github.io/openzfs-docs
Other
10.55k stars 1.74k forks source link

Many copy_file_range() calls finally hang with kernel oops #16436

Closed kimono-koans closed 2 months ago

kimono-koans commented 2 months ago

System information

Type Version/Name
Distribution Name Ubuntu
Distribution Version 24.04
Kernel Version 6.8.0-39-generic
Architecture amd64
OpenZFS Version zfs-2.2.2-0ubuntu9

Describe the problem you're observing

Irregularly copy_file_range call hangs ZFS with a kernel oops

Describe how to reproduce the problem

Call copy_file_range a bunch. See my code below. Of course could be a mistake I'm making.


    #[allow(unreachable_code, unused_variables)]
    fn copy_file_range(
        src_file_fd: BorrowedFd,
        dst_file_fd: BorrowedFd,
        len: usize,
    ) -> HttmResult<()> {
        #[cfg(any(target_os = "linux", target_os = "freebsd"))]
        {
            let mut amt_written = 0u64;

            // copy_file_range needs to be run in a loop as it is interruptible
            loop {
                match nix::fcntl::copy_file_range(
                    src_file_fd,
                    Some(&mut (amt_written as i64)),
                    dst_file_fd,
                    Some(&mut (amt_written as i64)),
                    len,
                ) {
                    // However, a return of zero  for  a  non-zero  len  argument
                    // indicates that the offset for infd is at or  beyond EOF.
                    Ok(bytes_written) if bytes_written == 0usize => return Ok(()),
                    Ok(bytes_written) => {
                        amt_written += bytes_written as u64;

                        if amt_written == len as u64 {
                            return Ok(());
                        }

                        if amt_written < len as u64 {
                            //  Upon successful completion, copy_file_range() will  return  the  number  of  bytes  copied
                            //  between files.  This could be less than the length originally requested.

                            //  Here continue until 0 is given
                            continue;
                        }

                        if amt_written > len as u64 {
                            return Err(
                                HttmError::new("Amount written larger than file len.").into()
                            );
                        }
                    }
                    Err(err) => match err {
                        nix::errno::Errno::ENOSYS => {
                            return Err(HttmError::new(
                                "Operating system does not support copy_file_ranges.",
                            )
                            .into())
                        }
                        _ => {
                            if GLOBAL_CONFIG.opt_debug {
                                eprintln!("DEBUG: copy_file_range call failed for the following reason: {}\nDEBUG: Falling back to default diff copy behavior.", err);
                            }
                        }
                    },
                }
            }
        }
        Err(HttmError::new("Operating system does not support copy_file_ranges.").into())
    }

Include any warning/errors/backtraces from the system logs

Aug 11 21:16:36 montrose kernel: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
Aug 11 21:16:36 montrose kernel: rcu:   0-...0: (1 GPs behind) idle=3d54/1/0x4000000000000000 softirq=144194/144195 fqs=14962
Aug 11 21:16:36 montrose kernel: rcu:            hardirqs   softirqs   csw/system
Aug 11 21:16:36 montrose kernel: rcu:    number:        0          0            0
Aug 11 21:16:36 montrose kernel: rcu:   cputime:        0          0            0   ==> 30001(ms)
Aug 11 21:16:36 montrose kernel: rcu:   (detected by 3, t=60002 jiffies, g=233257, q=4951 ncpus=4)
Aug 11 21:16:36 montrose kernel: Sending NMI from CPU 3 to CPUs 0:
Aug 11 21:16:36 montrose kernel: NMI backtrace for cpu 0
Aug 11 21:16:36 montrose kernel: CPU: 0 PID: 2640 Comm: systemd Tainted: P      D    O       6.8.0-40-generic #40-Ubuntu
Aug 11 21:16:36 montrose kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.80 05/21/2018
Aug 11 21:16:36 montrose kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x83/0x300
Aug 11 21:16:36 montrose kernel: Code: 00 00 f0 0f ba 2b 08 0f 92 c2 8b 03 0f b6 d2 c1 e2 08 30 e4 09 d0 3d ff 00 00 00 77 61 85 c0 74 10 0f b6 03 84 c0 74 09 f3 90 <0f> b6 03 84 c0 75 f7 b8 01 00 00 00 66 89 03 5b 41 5c 41 5d 41 5e
Aug 11 21:16:36 montrose kernel: RSP: 0000:ffffaf9f0c857580 EFLAGS: 00000002
Aug 11 21:16:36 montrose kernel: RAX: 0000000000000001 RBX: ffffa0bdcadd54a8 RCX: 0000000000013000
Aug 11 21:16:36 montrose kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffa0bdcadd54a8
Aug 11 21:16:36 montrose kernel: RBP: ffffaf9f0c8575a8 R08: 0000000000020000 R09: ffffa0bdfba76c4c
Aug 11 21:16:36 montrose kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000013000
Aug 11 21:16:36 montrose kernel: R13: 0000000000000004 R14: ffffa0bdcadd5400 R15: ffffa0bdcbc5e800
Aug 11 21:16:36 montrose kernel: FS:  00007c536a83f400(0000) GS:ffffa0c130000000(0000) knlGS:0000000000000000
Aug 11 21:16:36 montrose kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 11 21:16:36 montrose kernel: CR2: 0000610d110471f6 CR3: 000000014d316000 CR4: 00000000003506f0
Aug 11 21:16:36 montrose kernel: Call Trace:
Aug 11 21:16:36 montrose kernel:  <NMI>
Aug 11 21:16:36 montrose kernel:  ? show_regs+0x6d/0x80
Aug 11 21:16:36 montrose kernel:  ? nmi_cpu_backtrace+0xb5/0x120
Aug 11 21:16:36 montrose kernel:  ? nmi_cpu_backtrace_handler+0x11/0x20
Aug 11 21:16:36 montrose kernel:  ? nmi_handle+0x67/0x180
Aug 11 21:16:36 montrose kernel:  ? default_do_nmi+0x47/0x140
Aug 11 21:16:36 montrose kernel:  ? exc_nmi+0x1c2/0x290
Aug 11 21:16:36 montrose kernel:  ? end_repeat_nmi+0xf/0x60
Aug 11 21:16:36 montrose kernel:  ? native_queued_spin_lock_slowpath+0x83/0x300
Aug 11 21:16:36 montrose kernel: message repeated 2 times: [  ? native_queued_spin_lock_slowpath+0x83/0x300]
Aug 11 21:16:36 montrose kernel:  </NMI>
Aug 11 21:16:36 montrose kernel:  <TASK>
Aug 11 21:16:36 montrose kernel:  _raw_spin_lock+0x3f/0x60
Aug 11 21:16:36 montrose kernel:  spl_cache_refill+0x5b/0x1d0 [spl]
Aug 11 21:16:36 montrose kernel:  ? LZ4_uncompress_unknownOutputSize+0x467/0x940 [zfs]
Aug 11 21:16:36 montrose kernel:  spl_kmem_cache_alloc+0xf1/0x120 [spl]
Aug 11 21:16:36 montrose kernel:  zio_buf_alloc+0x39/0x90 [zfs]
Aug 11 21:16:36 montrose kernel:  abd_borrow_buf+0x38/0x50 [zfs]
Aug 11 21:16:36 montrose kernel:  abd_borrow_buf_copy+0x19/0x60 [zfs]
Aug 11 21:16:36 montrose kernel:  zio_decompress_data+0x35/0x90 [zfs]
Aug 11 21:16:36 montrose kernel:  arc_buf_fill+0x256/0x4d0 [zfs]
Aug 11 21:16:36 montrose kernel:  ? zio_data_buf_alloc+0x39/0x90 [zfs]
Aug 11 21:16:36 montrose kernel:  arc_buf_alloc_impl+0x1b3/0x220 [zfs]
Aug 11 21:16:36 montrose kernel:  arc_read+0x1410/0x1710 [zfs]
Aug 11 21:16:36 montrose kernel:  ? __pfx_dbuf_read_done+0x10/0x10 [zfs]
Aug 11 21:16:36 montrose kernel:  dbuf_read_impl.constprop.0+0x4e4/0x5d0 [zfs]
Aug 11 21:16:36 montrose kernel:  dbuf_read+0x23c/0x5b0 [zfs]
Aug 11 21:16:36 montrose kernel:  dmu_buf_hold_array_by_dnode+0x113/0x570 [zfs]
Aug 11 21:16:36 montrose kernel:  dmu_read_impl+0xb0/0x180 [zfs]
Aug 11 21:16:36 montrose kernel:  dmu_read+0x64/0xb0 [zfs]
Aug 11 21:16:36 montrose kernel:  zfs_fillpage+0x77/0x180 [zfs]
Aug 11 21:16:36 montrose kernel:  zfs_getpage+0x5d/0xa0 [zfs]
Aug 11 21:16:36 montrose kernel:  zpl_readpage_common+0x29/0x60 [zfs]
Aug 11 21:16:36 montrose kernel:  ? __pfx_zpl_read_folio+0x10/0x10 [zfs]
Aug 11 21:16:36 montrose kernel:  zpl_read_folio+0x11/0x20 [zfs]
Aug 11 21:16:36 montrose kernel:  filemap_read_folio+0x46/0xf0
Aug 11 21:16:36 montrose kernel:  filemap_fault+0x558/0x8e0
Aug 11 21:16:36 montrose kernel:  __do_fault+0x3b/0x140
Aug 11 21:16:36 montrose kernel:  do_read_fault+0x133/0x1d0
Aug 11 21:16:36 montrose kernel:  do_fault+0x109/0x350
Aug 11 21:16:36 montrose kernel:  handle_pte_fault+0x114/0x1d0
Aug 11 21:16:36 montrose kernel:  __handle_mm_fault+0x653/0x790
Aug 11 21:16:36 montrose kernel:  handle_mm_fault+0x18a/0x380
Aug 11 21:16:36 montrose kernel:  do_user_addr_fault+0x169/0x670
Aug 11 21:16:36 montrose kernel:  exc_page_fault+0x83/0x1b0
Aug 11 21:16:36 montrose kernel:  asm_exc_page_fault+0x27/0x30
Aug 11 21:16:36 montrose kernel: RIP: 0033:0x7c536b0e9d8e
Aug 11 21:16:36 montrose kernel: Code: b6 07 29 c8 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa 89 f8 31 d2 66 0f ef ff 09 f0 25 ff 0f 00 00 3d c0 0f 00 00 0f 8f 74 02 00 00 <f3> 0f 6f 0f f3 0f 6f 06 66 0f 74 c1 66 0f da c1 66 0f ef c9 66 0f
Aug 11 21:16:36 montrose kernel: RSP: 002b:00007fffc59b31e8 EFLAGS: 00010283
Aug 11 21:16:36 montrose kernel: RAX: 00000000000001fe RBX: 0000610d110471f6 RCX: 00007fffc59b3290
Aug 11 21:16:36 montrose kernel: RDX: 0000000000000000 RSI: 00007c536b5671e8 RDI: 0000610d110471f6
Aug 11 21:16:36 montrose kernel: RBP: 00007fffc59b3280 R08: 0000000000000001 R09: 0000000000000000
Aug 11 21:16:36 montrose kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
Aug 11 21:16:36 montrose kernel: R13: 00007fffc59b3290 R14: 0000000000000095 R15: 0000610d11536be0
Aug 11 21:16:36 montrose kernel:  </TASK>
snajpa commented 2 months ago

Make sure you're using the latest stable release, this is one of the hot topics that has seen a lot of fixes.

kimono-koans commented 2 months ago

Make sure you're using the latest stable release, this is one of the hot topics that has seen a lot of fixes.

I'll refile if and when I'm able to try the latest.